Re: [tcpPrague] [aqm] L4S status update

Bob Briscoe <ietf@bobbriscoe.net> Tue, 22 November 2016 19:09 UTC

To: "Bless, Roland (TM)" <roland.bless@kit.edu>
References: <be67928d-e1f7-2495-147d-1d42d6783cc8@bobbriscoe.net> <f6b89407-14d8-b532-b793-7490cb5a2117@kit.edu>
From: Bob Briscoe <ietf@bobbriscoe.net>
Message-ID: <f16c9830-f97a-64e0-76e6-66f146576616@bobbriscoe.net>
Date: Tue, 22 Nov 2016 19:09:39 +0000
In-Reply-To: <f6b89407-14d8-b532-b793-7490cb5a2117@kit.edu>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpprague/kaEzqDerzfCXJrNGHTCt_auUmy8>
Cc: tcpm IETF list <tcpm@ietf.org>, AQM IETF list <aqm@ietf.org>, tsvwg IETF list <tsvwg@ietf.org>, TCP Prague List <tcpPrague@ietf.org>
Subject: Re: [tcpPrague] [aqm] L4S status update

Roland,

I share your concern about cc-specific AQMs. But that is not a good 
characterization of what we're doing.

On the current Internet, everything is meant to be somewhat "friendly" 
to the original TCP cc (now spec'd in RFC5681). All sorts of cc's work 
alongside that, with slightly different "fairness" properties, and only 
one AQM is needed to cover them all. Nonetheless, Reno is the "lamest", 
so everyone has to try to "Do (not much) Harm" to the lamest.

*Is any AQM CC-neutral?*
Note rule 5 <https://tools.ietf.org/html/rfc7567#section-4.5> in the 
AQM Guidelines [RFC7567]:
       "AQM algorithms SHOULD NOT interpret specific transport protocol 
behaviors."
In general, the advice in that section is sound, but I don't think we 
realized at that time just how subtle this issue is.

Since then, I discovered that the autotuning parameter table in the PIE 
algorithm is designed very precisely around the 1/sqrt(p) rule of Reno 
(see Fig 5 in the PI^2 paper <http://www.bobbriscoe.net/pubs.html#PI2>). 
Similarly, the sqrt control law in Codel claims to be dependent on Reno 
{Note 1}.

The point is that these AQMs still work fine with Cubic, Compound, 
Westwood, etc, because all these ccs were designed to interwork with 
Reno. {Note 2}

The idea of L4S (and specifically the DualQ Coupled AQM and the L4S ID 
spec) is to enable a shift to a completely different "norm", but still 
coexist with all the 'Classic' cc's that coexisted around the old 
"norm". The new norm is intended to be just as fuzzy as the old norm 
{Note 3}. The idea is two fuzzy clouds of congestion controls, around an 
old and a new norm that are related together.
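
To make the two norms concrete, here is a minimal Python sketch that is not taken from any of the L4S drafts. It assumes the textbook Reno approximation W = sqrt(3/2)/sqrt(p) for the old norm and a 1/p law with an illustrative constant of 2 for the new one. Both constants and all function names are assumptions made for illustration only.

```python
import math

RENO_CONST = math.sqrt(1.5)  # assumed: textbook Reno constant, about 1.22
SCALABLE_CONST = 2.0         # assumed: illustrative constant for a 1/p law


def reno_window(p):
    """Steady-state window (packets) under the assumed 1/sqrt(p) law."""
    return RENO_CONST / math.sqrt(p)


def scalable_window(p):
    """Steady-state window (packets) under the assumed 1/p law."""
    return SCALABLE_CONST / p


def signals_per_rtt(window, p):
    """Average number of drops or marks seen in one round trip."""
    return window * p


for p in (0.1, 0.01, 0.001):
    w_old, w_new = reno_window(p), scalable_window(p)
    print(f"p = {p:<6} "
          f"old norm: W = {w_old:7.1f}, signals/RTT = {signals_per_rtt(w_old, p):.3f}   "
          f"new norm: W = {w_new:7.1f}, signals/RTT = {signals_per_rtt(w_new, p):.3f}")
```

Under these assumptions the number of congestion signals per round trip shrinks for the Reno-like law as the window grows, but stays constant for the 1/p law, which is roughly what "Scalable" refers to here.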

*BBR*
I believe BBR attempts to be 'friendly' to loss-based flows when 
competing in the same queue. But it's still research, and we don't yet 
know how good it is at that in all scenarios, although we do have code 
to test now. Given BBR currently sets Not-ECT, it would classify itself 
into the Classic queue of a DualQ AQM, and if it coexists with Reno it 
/should/ coexist with L4S traffic in the other queue. See Koen's recent 
posting 
<https://www.ietf.org/mail-archive/web/tsvwg/current/msg14771.html> 
about this.

There would be nothing to stop someone designing a variant of BBR that 
coexisted in the L4S queue with Scalable CCs like DCTCP (the point being 
that if the bottleneck was not DualQ it would keep delay low, and if 
the bottleneck was DualQ it would benefit even more from the lower 
queuing delay there). However, it would have to be a bit more careful 
about its whole round trip of queue probing, to avoid increasing the 
delay in the L4S queue. You'll see that I suggested to Neal Cardwell 
that they consider probing with a few packets rather than a whole 
window, e.g. the chirping 
<http://www.bobbriscoe.net/pubs.html#chirp_impl> technique that Mirja 
and I looked into back in 2010 was designed to find the same knee 
between rate increase and delay increase, with far fewer packets. I 
thought of a better way of using chirping a few weeks ago, so I will be 
returning to that too.

*Specs*
There is no statement that all L4S cc's MUST adhere to a 1/p rule. The 
L4S ID draft says:
   "The inverse proportionality requirement above is worded
    as a 'SHOULD' rather than a 'MUST' to allow reasonable flexibility
    when defining these specifications."

I hope that 'SHOULD' is fuzzy enough - I suspect adding more words would 
make it less fuzzy. But I would welcome wording to make it even more 
fuzzy if you would like to engage in wordsmithing.

Nonetheless, we are trying to steer a path between a rock and a hard 
place because, to shift to much calmer waters beyond the rocks, we 
have to define some number to relate L4S to Classic. I am wary because 
when ECN was specified, there were attempts to define ECN as different 
to drop. However, ECN originally ended up "the same as drop" because 
no-one could muster enough backing behind any particular number to 
relate the two, so the number '1' won by default (i.e. ECN = 1 * drop^1).
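
As a rough sketch of what such a number does, the Python below uses the coupling p_C = (p_L/2)^2 quoted from Roland's message further down, together with two assumed steady-state laws: W = sqrt(3/2)/sqrt(p_C) for a Reno-like Classic flow and W = 2/p_L for a Scalable flow. The constants, the equal-RTT assumption and the function names are illustrative and are not taken from the drafts.

```python
import math


def classic_prob(p_l, k=2.0):
    """Classic drop/mark probability coupled to the L4S marking probability."""
    return (p_l / k) ** 2


def window_ratio(p_l, k=2.0):
    """Classic window divided by L4S window, under the assumed laws."""
    p_c = classic_prob(p_l, k)
    w_classic = math.sqrt(1.5) / math.sqrt(p_c)  # assumed Reno-like law
    w_l4s = 2.0 / p_l                            # assumed Scalable law
    return w_classic / w_l4s


for p_l in (0.2, 0.05, 0.01):
    print(f"p_L = {p_l:<5} p_C = {classic_prob(p_l):.6f} "
          f"W_classic / W_L4S = {window_ratio(p_l):.3f}")
```

With these assumptions the window ratio comes out the same at every load, because squaring p_L cancels the square root in the Classic law. The factor k only sets what that constant ratio is.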


Bob

{Note 1} I have never got a good answer to my questions on aqm@ietf as 
to why a sqrt that controls the shrinkage of the spacing between dropped 
packets has something to do with the steady state law of Reno, 
particularly because the law leads to linear growth in p over time.
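
The linear growth can be checked numerically. The sketch below assumes CoDel's control law in its usual form, where the n-th drop is followed by the next one after interval/sqrt(n), with an interval of 100 ms, and assumes that CoDel stays in its dropping state throughout. The function names are made up for illustration.

```python
import math

INTERVAL = 0.1  # seconds; assumed CoDel interval of 100 ms


def codel_drop_times(n_drops, interval=INTERVAL):
    """Return the times of n_drops successive drops in the dropping state.

    After the count-th drop, the next drop is scheduled
    interval / sqrt(count) later.
    """
    times = [0.0]
    for count in range(1, n_drops):
        times.append(times[-1] + interval / math.sqrt(count))
    return times


def drop_rate(times, k):
    """Drops per second, from the spacing that follows times[k]."""
    return 1.0 / (times[k + 1] - times[k])


times = codel_drop_times(400)
for k in (25, 100, 225, 398):
    t = times[k]
    rate = drop_rate(times, k)
    print(f"t = {t:5.2f} s   drop rate = {rate:6.1f}/s   rate/t = {rate / t:5.1f}")
```

The ratio of drop rate to elapsed time settles towards 1/(2 * interval^2), so the drop rate grows roughly linearly with time. If the packet arrival rate is roughly constant, p then grows roughly linearly too.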

{Note 2} Actually, I don't believe PIE would work that well with Cubic 
at very high BDP, once it was far from Reno-friendly, but that's only 
intuition from the stability analysis, not from actual testing.

{Note 3} Indeed, at the moment, when DCTCP is on its own in the L4S 
queue of the DualQ AQM as coded now, it hits up against a step 
threshold, which makes it behave as 1/p^2, not 1/p. For now, that's just 
because we didn't want to change too much about DCTCP at one time. But 
it's also got some nice properties. This will all need to be discussed 
as the DualQ AQM is specified more deeply.
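
A rough way to see where 1/p^2 comes from is the usual sawtooth argument, sketched below under simplifying assumptions that are not from the DualQ code: a single flow with peak window W, a marking-fraction estimate that has converged to the marking probability p, a reduction from W to W(1 - p/2) once per cycle followed by growth of one segment per round trip, and a step threshold that marks about one round trip's worth of packets per cycle.

```latex
\begin{align}
  T_{\text{cycle}} &\approx \frac{pW}{2}\ \text{round trips}, \\
  N_{\text{sent}} &\approx \frac{pW}{2}\cdot W, \qquad
  N_{\text{marked}} \approx W, \\
  p &= \frac{N_{\text{marked}}}{N_{\text{sent}}} \approx \frac{2}{pW}
  \quad\Longrightarrow\quad W \approx \frac{2}{p^{2}}.
\end{align}
```

With probabilistic marking instead of a step, the same flow would see marks spread over every round trip, which is where the 1/p behaviour comes from.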


On 21/11/16 14:27, Bless, Roland (TM) wrote:
> Hi Bob and all,
>
> see below.
>
> Am 01.11.2016 um 01:02 schrieb Bob Briscoe:
>> A few people have been working away to specify and document all the
>> aspects of the new Low Latency, Low Loss, Scalable throughput (L4S)
>> service, which held a successful BoF in Berlin. As the decision was to
>> try to work across multiple WGs, I thought it would be useful to give ...
> Thanks for putting this together.
>
>>    * Dual Queue Coupled AQM
>>        o With Curvy RED for Linux (access available shortly)
>>        o With PI2 for Linux <https://github.com/olgabo/dualpi2> [*UPDATED*]
> I'll repeat my concerns that I already expressed at the L4S BOF in Berlin:
>
> While I agree that we probably need to separate low-delay congestion
> control schemes from traditional "queue-filling" congestion schemes,
> I strongly suggest to avoid putting a congestion control-specific
> coupling scheme into the network (a classic case for applying the
> "end-to-end arguments in system design").
> The current Dual queue coupled AQM proposal has got a coupling based on
> a congestion control specific dropping law p_C=(p_L/2)². So if
> congestion control schemes change then this coupling needs to be
> adapted. For example, the currently proposed scheme may fail if the
> vast majority of TCP traffic is using BBR or some other forthcoming
> CC scheme instead of Cubic, Reno, Compound etc. The same applies to
> draft-briscoe-tsvwg-ecn-l4s-id, section 2.5, where the dropping
> likelihood is defined.
>
> Regards,
>   Roland
>

-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/