Re: [re-ECN] TCP's "Dynamic Range"

Matt Mathis <matt.mathis@gmail.com> Wed, 28 October 2009 01:22 UTC

In-Reply-To: <4AE6E99B.6050907@thinkingcat.com>
References: <4AE26E9B.8060205@thinkingcat.com> <200910242327.n9ONRbZt023456@bagheera.jungle.bt.co.uk> <4AE4CBDB.4030806@thinkingcat.com> <200910261228.n9QCSCp0030099@bagheera.jungle.bt.co.uk> <20091026133640.GA62345@verdi> <200910262116.n9QLGTBE010898@bagheera.jungle.bt.co.uk> <4AE6E99B.6050907@thinkingcat.com>
Date: Tue, 27 Oct 2009 21:22:56 -0400
Message-ID: <fc0ff13d0910271822n7e0ec0ceq575b9121678539e6@mail.gmail.com>
From: Matt Mathis <matt.mathis@gmail.com>
To: Leslie Daigle <leslie@thinkingcat.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Cc: re-ecn@ietf.org
Subject: Re: [re-ECN] TCP's "Dynamic Range"

I have two issues that I would like to see addressed.  I think both
fit within existing agenda items, but I want to be sure that they are
not ruled out of order by the anti-rat hole police.

First, I really think we need to broaden Re-ECN to Re-Feedback, even
under the still broader umbrella of congestion exposure.  Re-ECN
presupposes the deployment of ECN, which has not happened in 10 years.
I believe that loss-based Re-Feedback can provide many of the
benefits of Re-ECN, but with a far easier deployment strategy.
Furthermore, unifying the semantics of the two approaches will make
both stronger.

My point is really that we need to view loss-based and ECN-based
Re-Feedback as equals, both likely to be present in steady state, and
not one merely as a deployment strategy for the other.  Our
terminology should reflect this.

Second, we need a task to actively promote ECN deployment.  Many
people have worked on this, and it has been all but vetoed by a
couple of persistent problems.   Although shepherding deployment is
not a traditional IETF activity, I really think the IETF is the right
venue to bring the right people to bear on the problem.   We need to
divide the problem, such that improved technology in a couple of
specific areas will help the broader community to help us eliminate
the tiny persistent problems that have thwarted ECN deployment.   I
would really like to see the technological part done under ConEX.

(The required technology is to make OS releases that are targeted at
technology users and beta testers run ECN black hole detection with
automatic diagnosis and reporting.  As we get more experience with
these algorithms and they get enabled for progressively wider pools of
users, we will gradually eliminate the bugs in the infrastructure.)
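
To make the idea concrete, here is a minimal sketch of the kind of
detection logic meant above.  It is only an illustration under stated
assumptions: the function names, the report format and the retry count
are all hypothetical, and the handshake is simulated rather than done
with real sockets.  The logic is: try with ECN a few times, fall back
to a non-ECN attempt, and file a report only when the non-ECN attempt
succeeds where the ECN attempts failed.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class BlackHoleReport:
    """Diagnostic record for a suspected ECN black hole (hypothetical format)."""
    destination: str
    failed_ecn_attempts: int


def connect_with_ecn_fallback(
    destination: str,
    try_connect: Callable[[str, bool], bool],
    reports: List[BlackHoleReport],
    max_ecn_attempts: int = 2,  # assumed retry budget, not a recommended value
) -> str:
    """Return "ecn", "non-ecn" or "unreachable".

    try_connect(destination, use_ecn) is a stand-in for a real handshake
    attempt; it returns True if the connection was established.
    """
    for _ in range(max_ecn_attempts):
        if try_connect(destination, True):
            return "ecn"
    # Every ECN attempt failed: retry once without ECN.
    if try_connect(destination, False):
        # Non-ECN works where ECN did not: suspect a black hole and report it.
        reports.append(BlackHoleReport(destination, max_ecn_attempts))
        return "non-ecn"
    # Both variants fail: the path is simply down, so ECN is not implicated.
    return "unreachable"


def simulated_black_hole(destination: str, use_ecn: bool) -> bool:
    """Simulated path that silently drops ECN-setup handshakes."""
    return not use_ecn


if __name__ == "__main__":
    collected: List[BlackHoleReport] = []
    outcome = connect_with_ecn_fallback("192.0.2.1", simulated_black_hole, collected)
    print(outcome, collected)
```

The reports list stands in for whatever reporting channel an OS vendor
would actually use; aggregating such reports per destination or per
path is what would let the broken boxes be found and fixed.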

Thanks,
--MM--
-------------------------------------------
Matt Mathis      http://staff.psc.edu/mathis
Work:412.268.3319   Home/Cell:412.654.7529
-------------------------------------------
Evil is defined by mortals who think they know
"The Truth" and use force to apply it to others.



On Tue, Oct 27, 2009 at 8:37 AM, Leslie Daigle <leslie@thinkingcat.com> wrote:
>
> I like the principles a lot, too.
>
> I'm thinking that could be an excellent jumping off point for "constraints",
> in the agenda.   Phil's currently slated to do that section.  Are you (John)
> going to be in Hiroshima?  And/or, can you work with Phil to integrate these
> points into the discussion-leading material?
>
> Leslie.
>
> Bob Briscoe wrote:
>>
>> John,
>>
>> I really like your list of principles. Where in the BoF do you suggest we
>> put it?
>>
>> And I agree with everything you've said in this email, with just a couple
>> of inline comments...
>>
>> At 13:36 26/10/2009, John Leslie wrote:
>>>
>>> Bob Briscoe <rbriscoe@jungle.bt.co.uk> wrote:
>>> > At 22:06 25/10/2009, Leslie Daigle wrote:
>>> >>Bob Briscoe wrote:
>>> >>>
>>> >>> I've been thinking... We should add an item to the many purposes
>>> >>> list:
>>> >>>
>>> >>> - evolution path beyond TCP (running out of dynamic range)
>>> >>
>>> >> Which is pretty cool, but on the bof agenda might lead to some
>>> >> ratholing on whether we're just bashing TCP, no?
>>> >
>>> > [BB] Not at all. Everyone in the transport area knows TCP has run out
>>> > of dynamic range. It's a well-known problem.
>>>
>>>   I've been thinking somewhat along this line, though I wasn't using
>>> the term "dynamic range", but rather considering signal-to-noise in
>>> our feedback function.
>>>
>>>   Beyond any question, TCP has been successful at curing "congestion
>>> collapse" -- which is, after all, what it aimed to do.
>>>
>>>   But packet loss, as a feedback signal, is frankly terribly unsuited
>>> to fine-tuning how to share bandwidth at the bottlenecks. It's not,
>>> IMHO, "bashing" TCP to point out this difference.
>>>
>>>   Network operators, at best, can only estimate packet loss, not
>>> measure it meaningfully, and even if they could measure it with 100%
>>> accuracy, they'd have no hope of relating packet loss to the backoff
>>> it signals.
>>>
>>>   That's because the multiplicative decrease it signals is based
>>> on end-to-end flows at a higher layer. We're looking at five or more
>>> orders of magnitude difference in the amount of backoff signaled by
>>> a packet lost. For a network operator, the signal is quite lost in
>>> the noise!
>>>
>>>   As we review the history of TCP, we notice that attempts to move
>>> away from packet loss have failed to dependably avoid congestion
>>> collapse: thus implementations tend to depend entirely on packet
>>> loss to signal backoff; and other congestion-notification tends to
>>> be ignored or even suppressed.
>>>
>>>   (BTW, that's an issue we need to be prepared to discuss: how can
>>> re-ecn operate when ECN marks are suppressed? Even though most of
>>> the suppression history concerns ICMP, there will be folks who think
>>> ECN will suffer similar suppression.)
>>
>> As this sentence is in the passive, I assume you mean suppression by the
>> transport or some other link than the congested one (not suppression by the
>> congested link itself).
>>
>> That's why we brought re-ECN to the IETF - because we had solved that
>> problem. The draft-briscoe-tsvwg-re-ecn-tcp-motivation-01.txt explains the
>> mechanisms that can be built over re-ECN to detect & prevent suppression.
>>
>> I've actually got a PhD thesis of proofs on this now, with pseudocode of
>> the mechanisms, simulations etc etc, but I haven't got round to asking my
>> company for permission to publish (mainly because they are more focused on
>> sacking people than authorising publications at the mo). I promise it will
>> be published soon.
>>
>>
>>>   We should recognize that multiplicative-decrease on packet loss
>>> is a proven winner for avoiding congestion collapse, and concentrate
>>> on what's a better signal for management of bottleneck bandwidth.
>>> I propose a few principles:
>>>
>>> 1) The signal needs to be visible to the network manager managing
>>>   the bottleneck;
>>>
>>> 2) The signal should be distinct enough to be the basis for cost
>>>   allocation for upgrading the bottleneck;
>>>
>>> 3) The signal should be visible to end-systems, giving a decent
>>>   measure of how much to backoff their sending rate;
>>>
>>> 4) The signal should enable "backpressure" to allow network managers
>>>   to avoid forwarding too many packets towards the bottleneck;
>>>
>>> 5) The signal should not involve packet loss.
>>>
>>>   I believe that a properly-designed signalling system can work
>>> for at least eight or nine orders of magnitude of sender bandwidth.
>>> To be complete, a proposal should probably get into how many bits
>>> per signal, but I'm personally convinced that re-ecn can work
>>> beyond five orders of magnitude.
>>
>> Have you been following Matt Mathis's work on Relentless TCP? And
>> generally on TCP algos with window proportional to 1/p, rather than
>> 1/sqrt(p) like current TCP. The idea is these maintain the same number of
>> loss or ECN signals per window however fast you go.
>>
>> Is there some reason for choosing 8 or 9 orders of magnitude? I would have
>> thought 1/p would scale indefinitely, but you may be thinking of other
>> factors I've missed.
>>
>> Scaling was also one of the main motivations for Kelly's primal algo. And
>> it was one of my motivations for introducing re-ECN so we could shift from
>> the TCP-friendly (1/sqrt(p)) track painlessly onto a scalable 1/p track
>> without worrying about flow fairness.
>>
>>
>> Bob
>>
>>
>>>   So, avoiding the bits-per-signal question, are the five principles
>>> above sufficient? Are they necessary? Can we come up with a good
>>> presentation of such principles for the BoF?
>>>
>>> --
>>> John Leslie <john@jlc.net>
>>
>> ________________________________________________________________
>> Bob Briscoe,                                BT Innovate & Design
>
> --
>
> -------------------------------------------------------------------
> "Reality:
>     Yours to discover."
>                                -- ThinkingCat
> Leslie Daigle
> leslie@thinkingcat.com
> -------------------------------------------------------------------
> _______________________________________________
> re-ECN mailing list
> re-ECN@ietf.org
> https://www.ietf.org/mailman/listinfo/re-ecn
>
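
A note on the quoted point about windows proportional to 1/p versus
1/sqrt(p): the expected number of loss or ECN signals per window is
simply window * p.  The sketch below works this out numerically.  It
assumes the usual square-root approximation W = sqrt(3/2)/sqrt(p) for
current TCP (the constant depends on modelling assumptions) and a
purely illustrative law W = c/p with c = 1 for the scalable case; the
function names are hypothetical.

```python
import math


def reno_window(p: float) -> float:
    """Window (packets) under the square-root approximation W = sqrt(3/2) / sqrt(p)."""
    return math.sqrt(1.5 / p)


def scalable_window(p: float, c: float = 1.0) -> float:
    """Window under an assumed 1/p law, W = c / p (c is an illustrative constant)."""
    return c / p


def signals_per_window(window: float, p: float) -> float:
    """Expected loss or ECN signals per window, i.e. per round trip of packets."""
    return window * p


if __name__ == "__main__":
    for p in (1e-2, 1e-4, 1e-6, 1e-8):
        reno = signals_per_window(reno_window(p), p)
        scal = signals_per_window(scalable_window(p), p)
        print(f"p={p:g}  1/sqrt(p) law: {reno:.2e}  1/p law: {scal:.2e}")
```

Under the 1/sqrt(p) law the signals per window shrink as sqrt(1.5 * p),
so a faster flow hears from the network less and less often.  Under
the 1/p law the product stays at c regardless of rate, which is the
sense in which such algorithms keep the same number of signals per
window however fast they go.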