Re: [tcpPrague] Experimental dual-queue ECN

John Leslie <john@jlc.net> Sat, 25 June 2016 14:08 UTC

Date: Sat, 25 Jun 2016 10:08:03 -0400
From: John Leslie <john@jlc.net>
To: Michael Welzl <michawe@ifi.uio.no>
Message-ID: <20160625140803.GB52708@verdi>
References: <574F2A2D.9070407@bobbriscoe.net> <574F4F29.9040409@bobbriscoe.net> <20160601215312.GA25116@verdi> <0898e249-03dd-aff9-7179-03cc8642efea@erg.abdn.ac.uk> <5762567D.8010609@bobbriscoe.net> <3f8fa637-17b5-853b-b835-db486a2a69f6@erg.abdn.ac.uk> <CAKKJt-cjncm7zsfj3=7pqB-uSNTxMPfjPY=qpSNnDncVmy+enA@mail.gmail.com> <20160624170118.GA52708@verdi> <576D70CB.8060108@erg.abdn.ac.uk> <8D9E4035-23E9-4BAD-B689-BF82C54BC98F@ifi.uio.no>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <8D9E4035-23E9-4BAD-B689-BF82C54BC98F@ifi.uio.no>
User-Agent: Mutt/1.4.1i
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpprague/zG-yC_P8EOH4oAb7ufFgtssB18E>
Cc: "<gorry@erg.abdn.ac.uk> Fairhurst" <gorry@erg.abdn.ac.uk>, Bob Briscoe <ietf@bobbriscoe.net>, TCP Prague List <tcpPrague@ietf.org>, Spencer Dawkins <spencerdawkins.ietf@gmail.com>, John Leslie <john@jlc.net>
Subject: Re: [tcpPrague] Experimental dual-queue ECN

   I seem to have flunked Robert Heinlein's class in writing orders!

   Gorry succeeded in misunderstanding my point 1: "that reaction to ECN-CE
should be the 'same as' drop". :^)

   It was "obvious" to me that RFC 3168 now requires that, and that we
need to change it so that this is no longer required.

   Fortunately, it seems the three of us agree it needs to change. :^)

Michael Welzl <michawe@ifi.uio.no> wrote:
> From: Michael Welzl <michawe@ifi.uio.no>
>... 
>> On 24. jun. 2016, at 19.41, Gorry Fairhurst <gorry@erg.abdn.ac.uk> wrote:
>>... 
>> On 24/06/2016 18:01, John Leslie wrote:
>>> Spencer Dawkins at IETF<spencerdawkins.ietf@gmail.com>  wrote:
>>>> ...
>>>> Are you thinking that wouldn't work here?
>>>
>>> IMHO, such a path to update RFC 3168 is impractical.
>>>... 
>>> IMHO, we need to break off a limited part of it for Experimental
>>> protocols:
>>>
>>> 1. that reaction to ECN-CE should be the "same as" drop; and

>> I disagree of course - the CE-marking proposal has already been
>> discussed at the IETF - and I suggest no ECN is likely to be found
>> using RED - and ECN-marked RED is now deprecated anyway.
>> Many modern AQM methods CE-mark on a shallow queue - and I think
>> we need to update the PS to reflect this.

   Fundamentally, I agree: but my concern was the minimum change needed
to 3168.

> +1 to Gorry  (unsurprisingly).
> 
> The ECN "Experiment" has hardly happened, so on what basis do
> we say that a reaction that is the "same as" drop is safe?

   IMHO, that is what 3168 meant to say. I've never agreed with it.

> If we just take operational experience, Cubic has first used a backoff
> (multiplication) factor of 0.8, then 0.7, deployed in Linux and widely
> used in the Internet.
> This isn't even limited to an ECN signal, which is likely to be
> produced by an AQM mechanism, and hence much more likely to indicate
> a shallow queue than loss.

   I agree.

> So: I think we can now assert that the Internet won't melt down if
> we'd back off using a different multiplication factor than 0.5.

   I agree that would be good background material for a PS RFC relaxing
that rule of 3168.

   (But I don't think we need to limit the difference to applying a
different factor: that is a really obvious way, and it has effectively
been tested in actual use; but it is not the _only_ way.)

> Using such a larger factor *only* in response to ECN is even more
> conservative, so even safer.

   Exactly!
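
   (To make the idea concrete, here is a minimal sketch, not taken from
any real stack. The names and the 0.8 value are only assumptions for
illustration; 0.5 is the classic halving that "same as drop" implies.)

```python
# Illustrative sketch only: separate multiplicative-decrease factors for
# loss and for ECN-CE. All names and the 0.8 value are hypothetical.

BETA_LOSS = 0.5  # classic halving, i.e. what "same as drop" amounts to
BETA_ECN = 0.8   # assumed larger (gentler) factor used only for ECN-CE


def reduce_cwnd(cwnd, signal, min_cwnd=2.0):
    """Return the new congestion window after one congestion signal.

    signal is "loss" or "ecn_ce"; cwnd and min_cwnd are in segments.
    """
    if signal == "loss":
        beta = BETA_LOSS
    elif signal == "ecn_ce":
        beta = BETA_ECN
    else:
        raise ValueError("unknown congestion signal: %r" % (signal,))
    # Never shrink below the floor, whatever the factor.
    return max(cwnd * beta, min_cwnd)
```

   With a window of 100 segments, a loss takes it to 50 and a CE mark
takes it to 80; reaction to loss is untouched, which is why applying the
larger factor only to ECN is the more conservative change.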

> Adding to this, what exactly is the logic that makes "react to
> marking the same way as you would react to drop" particularly safe?

   The argument, as best I recall, was that otherwise one flow would
starve the other. But there are other ways to treat that problem!

> I can only assume that this assumes the same behavior in the network
> for ECN-marking and dropping, and so, if we keep as much as possible
> similar, this would be a safe way to go.

   I don't see any reason to "keep as much as possible similar" if
it's for an Experimental RFC. We should aim for _significant_ improvement.

> Reality is different. Please see, for example Figure 13 in this pdf:
> https://www.duo.uio.no/bitstream/handle/10852/37381/khademi-AQM_Kids_TR434.pdf

   I shall get around to reading this today.

>...
> In conclusion, I struggle to see the big reason why an "exactly like
> loss" backoff is standard behavior and experimenting with other values
> should be prohibited.

   We three agree it shouldn't.

   But...

   Implementors should be able to read RFC3168 and its metadata and
understand that their ten-year-old code may no longer be compliant.

   It's about a warning-label!

> A new particular value may constitute an experiment, but the
> "equal to loss" limitation simply isn't a good thing and should
> be removed - this removal isn't an experiment, it's a bugfix.

   Exactly!

>...
>> I also don't see a specific experiment that is needed here  - what
>> would be needed to test for safety in deployment? I *think* this
>> particular update can be taken directly to PS.
> 
> +1
 
   (I believe we could treat two issues in the same RFC, in order to
make implementors' job a little easier. YMMV...)

>... 

--
John Leslie <john@jlc.net>