Re: [tcpPrague] Experimental dual-queue ECN

Michael Welzl <michawe@ifi.uio.no> Sat, 25 June 2016 22:44 UTC

how nice to see, we all agree!

> On 25. jun. 2016, at 16.20, Gorry Fairhurst <gorry@erg.abdn.ac.uk> wrote:
> 
> Email is a wonderful medium... we seem to have a lot more agreement than I'd hoped.
> 
> (... and yes: If we have more than one simple PS update to 3168, and everyone agrees, I think these could form part of the same document).
> 
> Gorry
> 
>> On 25/06/2016 15:08, John Leslie wrote:
>>    I seem to have flunked Robert Heinlein's class in writing orders!
>> 
>>    Gorry succeeded in misunderstanding my point 1: "that reaction to ECN-CE
>> should be the 'same as' drop". :^)
>> 
>>    It was "obvious" to me that RFC 3168 now requires that, and that we
>> need to change so that is no longer required.
>> 
>>    Fortunately, it seems the three of us agree it needs to change. :^)
>> 
>> Michael Welzl<michawe@ifi.uio.no>  wrote:
>>> From: Michael Welzl<michawe@ifi.uio.no>
>>> ...
>>>> On 24. jun. 2016, at 19.41, Gorry Fairhurst<gorry@erg.abdn.ac.uk>  wrote:
>>>> ...
>>>> On 24/06/2016 18:01, John Leslie wrote:
>>>>> Spencer Dawkins at IETF<spencerdawkins.ietf@gmail.com>   wrote:
>>>>>> ...
>>>>>> Are you thinking that wouldn't work here?
>>>>> IMHO, such a path to update RFC 3168 is impractical.
>>>>> ...
>>>>> IMHO, we need to break off a limited part of it for Experimental
>>>>> protocols:
>>>>> 
>>>>> 1. that reaction to ECN-CE should be the "same as" drop; and
>>>> I disagree of course - the CE-marking proposal has already been
>>>> discussed at the IETF - and I suggest no ECN is likely to be found
>>>> using RED - and ECN-marked RED is anyway now deprecated.
>>>> Many modern AQM methods CE-mark on a shallow queue - and I think
>>>> we need to update the PS to reflect this.
>>    Fundamentally, I agree: but my concern was the minimum change needed
>> to 3168.
>> 
>>> +1 to Gorry  (unsurprisingly).
>>> 
>>> The ECN "Experiment" has hardly happened, so on what basis do
>>> we say that a reaction that is the "same as" drop is safe?
>>    IMHO, that is what 3168 meant to say. I've never agreed with it.
>> 
>>> If we just take operational experience, Cubic has first used a backoff
>>> (multiplication) factor of 0.8, then 0.7, deployed in Linux and widely
>>> used in the Internet.
>>> This isn't even limited to an ECN signal, which is likely to be
>>> produced by an AQM mechanism, and hence much more likely to indicate
>>> a shallow queue than loss.
>>    I agree.
>> 
>>> So: I think we can now assert that the Internet won't melt down if
>>> we'd back off using a different multiplication factor than 0.5.
>>    I agree that would be good background material for a PS RFC relaxing
>> that rule of 3168.
>> 
>>    (But I don't think we need to limit the difference to applying a
>> different factor: that is a really obvious way, and it has effectively
>> been tested in actual use; but it is not the _only_ way.)
>> 
>>> Using such a larger factor *only* in response to ECN is even more
>>> conservative, so even safer.
>>    Exactly!
>> 
>>> Adding to this, what exactly is the logic that makes "react to
>>> marking the same way as you would react to drop" particularly safe?
>>    The argument, as best I recall, was that otherwise one flow would
>> starve the other. But there are other ways to treat that problem!
>> 
>>> I can only assume that this assumes the same behavior in the network
>>> for ECN-marking and dropping, and so, if we keep as much as possible
>>> similar, this would be a safe way to go.
>>    I don't see any reason to "keep as much as possible similar" if
>> it's for an Experimental RFC. We should aim for _significant_ improvement.
>> 
>>> Reality is different. Please see, for example Figure 13 in this pdf:
>>> https://www.duo.uio.no/bitstream/handle/10852/37381/khademi-AQM_Kids_TR434.pdf
>>    I shall get around to reading this today.
>> 
>>> ...
>>> In conclusion, I struggle to see the big reason why an "exactly like
>>> loss" backoff is standard behavior and experimenting with other values
>>> should be prohibited.
>>    We three agree it shouldn't.
>> 
>>    But...
>> 
>>    Implementors should be able to read RFC3168 and its metadata and
>> understand that their ten-year-old code may no longer be compliant.
>> 
>>    It's about a warning-label!
>> 
>>> A new particular value may constitute an experiment, but the
>>> "equal to loss" limitation simply isn't a good thing and should
>>> be removed - this removal isn't an experiment, it's a bugfix.
>>    Exactly!
>> 
>>> ...
>>>> I also don't see a specific experiment that is needed here  - what
>>>> would be needed to test for safety in deployment? I *think*
>>>> this particular update can be taken directly to PS.
>>> +1
>> 
>>    (I believe we could treat two issues in the same RFC, in order to
>> make implementors' job a little easier. YMMV...)
>> 
>>> ...
>> --
>> John Leslie<john@jlc.net>
>
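
To make the backoff-factor discussion in the thread concrete, here is a minimal sketch of a sender that halves its window on loss but backs off less in response to an ECN-CE mark. It is only an illustration under stated assumptions: the names (backoff, BETA_LOSS, BETA_ECN), the minimum window of 2 segments, and the choice of 0.8 for the ECN case are hypothetical and not taken from RFC 3168 or from any draft discussed here.

```python
# Multiplicative-decrease factors. Both values are illustrative assumptions.
BETA_LOSS = 0.5  # halving on loss, the reaction that CE is currently tied to
BETA_ECN = 0.8   # hypothetical gentler backoff used only for ECN-CE marks


def backoff(cwnd: float, signal: str, min_cwnd: float = 2.0) -> float:
    """Return the congestion window (in segments) after one congestion signal.

    signal is "loss" or "ecn"; min_cwnd is an assumed lower bound.
    """
    if signal == "loss":
        beta = BETA_LOSS
    elif signal == "ecn":
        beta = BETA_ECN
    else:
        raise ValueError(f"unknown congestion signal: {signal!r}")
    # Multiplicative decrease, never going below the assumed minimum window.
    return max(cwnd * beta, min_cwnd)


if __name__ == "__main__":
    for signal in ("loss", "ecn"):
        print(signal, backoff(100.0, signal))
```

With a window of 100 segments, the loss case leaves 50 segments and the ECN case leaves about 80. The larger factor applies only to the ECN branch, which is the "more conservative" option mentioned above: the reaction to loss is left unchanged.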