Re: [aqm] ECT(1)

Bob Briscoe <research@bobbriscoe.net> Thu, 01 October 2015 10:47 UTC

To: Andrew Mcgregor <andrewmcgr@google.com>, Mikael Abrahamsson <swmike@swm.pp.se>
References: <ba3b6f6b4d3d453d887c451fbca412fa@hioexcmbx05-prd.hq.netapp.com> <CAA93jw5WrT0Azcew_gic5H-tJtBo62m-f4fBB0=qQp01uf3VuQ@mail.gmail.com> <8a1ed5a975d44a7bad88dc573971ded5@hioexcmbx05-prd.hq.netapp.com> <20150728145036.GK96964@verdi> <55BFF7EC.1010608@bobbriscoe.net> <alpine.DEB.2.02.1508061242450.11810@uplift.swm.pp.se> <CAPRuP3mFDprONVDpuidymTcBqX-9SyoF1FAddisC3Pe43nXAsQ@mail.gmail.com>
From: Bob Briscoe <research@bobbriscoe.net>
Message-ID: <560D0F23.9080305@bobbriscoe.net>
Date: Thu, 01 Oct 2015 11:46:59 +0100
In-Reply-To: <CAPRuP3mFDprONVDpuidymTcBqX-9SyoF1FAddisC3Pe43nXAsQ@mail.gmail.com>
Archived-At: <http://mailarchive.ietf.org/arch/msg/aqm/05LMuurcsk3IrOamm0WaZ_KcOd8>
Cc: "aqm@ietf.org" <aqm@ietf.org>
Subject: Re: [aqm] ECT(1)

Andrew,

DualQ might be a red herring for the special case you are thinking of 
within the Googleplex. But only a company with a brain the size of 
Google could countenance building bandwidth enforcer (BwE), in the 
belief that knowledge can be collected of the relative priorities, QoS 
requirements, bandwidth requirements, timing and location of all its 
applications. Then it might make sense to collect demand from all hosts 
up a control hierarchy and drive rate-limiting policies back down the 
hierarchy to every host which then sets Diffserv markings for all its 
different apps which determine routing and scheduling against a 
hierarchically organised master-plan.

But for the rest of us surely it's BwE that is the red herring - I mean 
us outside Google's internal networks, on the Internet, which consists 
of independent actors with no master Gosplan coordinating us all.

For most of the billions on the Internet who are not on large campus 
networks, geography and economics determine that the access network is 
the bottleneck - the part that is most spread-out physically and 
therefore least able to benefit from economies of scale (aggregation).

For instance, to demonstrate Coupled DualQ we chose the downstream queue 
in a broadband network gateway (BNG) - each of these queues is shaped to 
the rate of each residential broadband line. It would also be applicable 
for the upstream queue in each residential gateway. And it would also be 
applicable for the equivalent queue entering the access network from 
each end in other access network technologies: the HFC-node and the 
cable modem in cable, or the RNC and the user-equipment in cellular.

In these logical links, traffic is only for one user or one small site 
(e.g. a home) so it is very sparse. Diffserv is for aggregated traffic - 
Diffserv is useless if all traffic to and from a home at any one time is 
latency sensitive (e.g. Web, VoIP, and interactive video).

The simple argument for DualQ:
a) Today's 'classic' TCPs are the problem in these sparse access links: 
their drastic sawteeth make the queue vary {Note 1}
b) Even with a good replacement TCP with small sawteeth (like DCTCP), it 
won't be able to keep queuing delay low if there might be 'classic' TCP 
traffic in the same queue, which implies 2 queues minimum.
c) I don't care how we do that, but the coupling was a neat way to do 
that without the network operator making /any/ capacity management 
judgements on behalf of the users.
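
To put rough numbers on point (a), here is a minimal back-of-envelope 
sketch. It only bounds how far the queuing delay can swing when a single 
flow reduces its window once. The link rate, peak window and the 5% 
reduction for a DCTCP-like flow are illustrative assumptions, not 
measurements from our tests; the 50% figure is the usual Reno-style halving.

```python
def queue_delay_swing_ms(peak_window_pkts, decrease_fraction,
                         link_rate_bps, pkt_bytes=1500):
    """Upper bound on the queuing-delay swing from one window reduction.

    The packets removed from flight can drain at most that many packets
    from the bottleneck queue, so the swing is their serialization time.
    """
    removed_pkts = peak_window_pkts * decrease_fraction
    return removed_pkts * pkt_bytes * 8 / link_rate_bps * 1000.0


# Hypothetical residential line: 40 Mb/s, one flow peaking at 100 packets.
LINK_RATE_BPS = 40e6
PEAK_WINDOW_PKTS = 100

# Reno-style 'classic' TCP halves its window on a congestion signal.
classic_swing = queue_delay_swing_ms(PEAK_WINDOW_PKTS, 0.5, LINK_RATE_BPS)

# Assumed DCTCP-like flow: reduction proportional to the marking fraction,
# taken here as 5% purely for illustration.
scalable_swing = queue_delay_swing_ms(PEAK_WINDOW_PKTS, 0.05, LINK_RATE_BPS)

print(f"classic  sawtooth swing: up to {classic_swing:.1f} ms")
print(f"scalable sawtooth swing: up to {scalable_swing:.1f} ms")
```

Under these assumptions the classic sawtooth can move the queue by about 
ten times as much as the small sawtooth, which is the point of (a) and (b): 
the small-sawtooth traffic cannot see low delay while it shares a queue 
with the large one.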

An Internet service that gives /all/ traffic ultra-low queueing delay 
and near-zero congestion loss would be extremely valuable. Not having to 
ask permission for QoS is easy for both customer and network operator. 
Capacity sharing is then orthogonal to that.

Just because we show that TCP Cubic and DCTCP flows will get the same 
rates within the user's own line, that doesn't stop the user's 
applications using more or less aggressive congestion controls, or none 
at all, or multiple flows - all the usual ways of varying capacity 
shares still apply when the network operator doesn't enforce anything.
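
For anyone wondering how two such different congestion controls can end up 
with the same rate, here is a minimal sketch of the arithmetic. It assumes 
the textbook steady-state models (classic window proportional to 
1/sqrt(p), DCTCP-like window proportional to 1/p) and a square-law coupling 
of the form documented in the DualQ Coupled AQM specification; none of the 
constants or function names below come from this message, and equal 
windows only mean equal rates if the RTTs are the same.

```python
import math

# Assumed model constants (approximations, not values from this thread).
CLASSIC_CONST = 1.22   # classic window  ~ 1.22 / sqrt(p_drop)
SCALABLE_CONST = 2.0   # scalable window ~ 2 / p_mark


def classic_window(p_drop):
    """Steady-state window of a Reno-like flow for drop probability p_drop."""
    return CLASSIC_CONST / math.sqrt(p_drop)


def scalable_window(p_mark):
    """Steady-state window of a DCTCP-like flow for marking probability p_mark."""
    return SCALABLE_CONST / p_mark


def coupled_classic_prob(p_mark, k):
    """Square-law coupling: classic drop probability derived from the marking one."""
    return (p_mark / k) ** 2


# Coupling factor that makes the two windows equal under the models above.
k = SCALABLE_CONST / CLASSIC_CONST

for p_mark in (0.02, 0.1, 0.5):
    p_drop = coupled_classic_prob(p_mark, k)
    print(f"p_mark={p_mark:.2f}  p_drop={p_drop:.5f}  "
          f"classic={classic_window(p_drop):.1f}  "
          f"scalable={scalable_window(p_mark):.1f}")
```

The square in the coupling cancels the square root in the classic 
response, so the two windows track each other at every load level. That is 
all the network does; anything a user's applications do to take a larger 
or smaller share is unaffected.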


Actually, even if you do capacity sharing with BwE, if some hosts are 
using a 'classic' TCP, they will still cause varying queuing delay for 
other traffic. However, in your WAN scenario you might have sufficient 
aggregation for the resulting queuing delay not to be a problem.



Bob

{Note 1} I expected the slow-start overshoot would also be a problem, 
but (so far) just reducing the congestion avoidance sawteeth seems to 
remove virtually all the queuing delay - probably because we haven't 
done tests with sparse but bursty workloads yet. Anyway, we've got ideas 
on how to reduce the slow-start overshoot once ECN marking is more frequent.



On 07/08/15 01:51, Andrew Mcgregor wrote:
> I believe getting a DSCP deployed for this is a non-starter; that 
> space is a complete mess, and if we tie this proposal to cleaning up 
> that mess we'll get nowhere.  The evil bit doesn't fly either, for a 
> lot of reasons.
>
> That leaves us with ECT(1).
>
> So, how bad is that?  Well, not very.  In places which are ECN-enabled 
> but not dual-queue, DCTCP or another 1/p wrt ECN marking transport 
> will still respond to loss; sure, we'll drive the network into a lossy 
> regime, but those parts of the network are guaranteed to be there 
> anyway due to the presence of TCP.  Provided the loss response is 
> still sane, that will equilibrate out and we will end up with a 
> reasonable outcome. No, it will not be fair, but in the real world TCP 
> never is since the equilibrium takes hours to establish while 
> essentially no flows last that long; in real world practice, the few 
> flows that do last that long need special protection because they are 
> so fragile (BGP, I'm looking at you), or else nobody really cares how 
> long they take to complete.  TCP does not share capacity in any 
> reasonable manner, it is a circuit breaker for avoiding congestion 
> collapse, and nobody is proposing running completely oblivious traffic 
> on the internet.  DCTCP still has enough congestion avoidance.
>
> The dual-queue solution is not the only way to deploy either; it's a 
> nice solution, if one actually wants to achieve capacity sharing by 
> router feedback, but there are other ways to do the capacity sharing 
> (for example 
> http://conferences.sigcomm.org/sigcomm/2015/pdf/papers/p1.pdf). In an 
> admission controlled environment, separating the queues is not 
> necessary for sane coexistence.  Further I'm sure there are other 
> single-queue solutions, involving AQM control systems, although I 
> don't have a proposal right now (nor do I think one is necessary).
>
> I do think that assuming dual queue to be necessary for deployment is 
> a red herring.
>
> On 7 August 2015 at 06:38, Mikael Abrahamsson <swmike@swm.pp.se 
> <mailto:swmike@swm.pp.se>> wrote:
>
>     On Tue, 4 Aug 2015, Bob Briscoe wrote:
>
>         Combining ECT(0) and CE with a globally assigned DSCP solely
>         during initial deployment of L4S seems the least worst choice.
>
>
>     Having the same bits in the header mean different things in
>     combination with DSCP seems like a really hard thing to get deployed
>     Internet-wide.
>
>     ECN is just now gaining traction and seems like it might actually
>     see real deployment. Repurposing those bits just now would most
>     likely just cause confusion.
>
>     I started using ECN when it first appeared in the Linux kernel
>     around 2001 or whenever it was. I had to immediately turn it off
>     because some firewalls dropped those packets. Now almost 15 years
>     later after this sitting in the operating systems for at least 10
>     years, we're now getting to a point where we're ready to start
>     turning it on widely because things do not break when it's turned on.
>
>     So whatever you come up with now that requires host stack changes,
>     expect 5-10 years at least until it can be deployed. This means
>     you have to be really sure this is what you actually want before
>     you start to push for deployment. Also, deployment impacts should
>     be taken a lot into account when deciding what to do.
>
>     So how sure are you that L4S as it currently stands is the way to
>     go? If you think you're going to invent something new in 2-3
>     years, then please wait until then. Experimentation is all fine
>     and dandy, but until we can actually get DSCP codepoints working
>     on Internet-wide scale, this approach isn't feasible for that
>     use-case (which for me is close to "the only" use-case).
>
>     My proposal has been before that we should try to get 7 DSCP
>     codepoints deployed by using 000xxx, and nudge providers to
>     incrementally just not bleach them and treat them as BE in their
>     core networks, so we can use them on the edge to influence AQM there.
>
>     So, if we're going to invent new meaning of ECN bits in
>     combination with DSCP, then that needs to be coupled with work of
>     getting some DSCP working Internet-wide in a fashion that someone
>     actually believes will work out, as in actually getting
>     significant Internet-wide deployment.
>
>     -- 
>     Mikael Abrahamsson    email: swmike@swm.pp.se
>     <mailto:swmike@swm.pp.se>
>
>
>     _______________________________________________
>     aqm mailing list
>     aqm@ietf.org <mailto:aqm@ietf.org>
>     https://www.ietf.org/mailman/listinfo/aqm
>
>
>
>
> -- 
> Andrew McGregor | SRE |andrewmcgr@google.com 
> <mailto:andrewmcgr@google.com> | +61 4 1071 2221
>

-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/