Re: [aqm] think once to mark, think twice to drop: draft-ietf-aqm-ecn-benefits-02
David Lang <david@lang.hm> Mon, 30 March 2015 04:16 UTC
Date: Sun, 29 Mar 2015 21:16:03 -0700
From: David Lang <david@lang.hm>
To: "Scheffenegger, Richard" <rs@netapp.com>
In-Reply-To: <AE342093-DE05-4D93-96DA-EB07E221F1D9@netapp.com>
Message-ID: <alpine.DEB.2.02.1503291923300.26044@nftneq.ynat.uz>
References: <23AFEFE3-4D93-4DD9-A22B-952C63DB9FE3@cisco.com> <BF6B00CC65FD2D45A326E74492B2C19FB75BAA82@FR711WXCHMBA05.zeu.alcatel-lucent.com> <72EE366B-05E6-454C-9E53-5054E6F9E3E3@ifi.uio.no> <55146DB9.7050501@rogers.com> <08C34E4A-DFB7-4816-92AE-2ED161799488@ifi.uio.no> <BF6B00CC65FD2D45A326E74492B2C19FB75BAFA0@FR711WXCHMBA05.zeu.alcatel-lucent.com> <alpine.DEB.2.02.1503271024550.2416@nftneq.ynat.uz> <5d58d2e21400449280173aa63069bf7a@hioexcmbx05-prd.hq.netapp.com> <20150327183659.GI39886@verdi> <72C12F6B-9DDE-4483-81F2-2D9A0F2D3A48@cs.columbia.edu> <alpine.DEB.2.02.1503271211200.19390@nftneq.ynat.uz> <D13AFCE7.46BC%kk@cs.ucr.edu>, <alpine.DEB.2.02.1503271257230.19390@nftneq.ynat.uz> <AE342093-DE05-4D93-96DA-EB07E221F1D9@netapp.com>
Archived-At: <http://mailarchive.ietf.org/arch/msg/aqm/pF7z8xbzSy-y24o0UDHbaQ1EX1k>
Cc: John Leslie <john@jlc.net>, KK <kk@cs.ucr.edu>, "aqm@ietf.org" <aqm@ietf.org>, Vishal Misra <misra@cs.columbia.edu>
Subject: Re: [aqm] think once to mark, think twice to drop: draft-ietf-aqm-ecn-benefits-02

On Sat, 28 Mar 2015, Scheffenegger, Richard wrote:

> David,
>
> Perhaps you would care to provide some text to address the misconception that
> you pointed out? (To wait for a 100% fix as a 90% fix appears much less
> appealing, while the current state of art is at 0%)

Ok, you put me on the spot :-)

Here goes.

> If you think that aqm-recommendations is not strongly enough worded. I think
> this particular discussion (to aqm or not) really belongs there. The other
> document (ecn benefits) has a different target in arguing for going those last
> 10%...

So here is my "elevator pitch" on the problem. Feel free to take anything I say here for any purpose, and I'm sure I'll get corrected for anything I am wrong on.

Problem statement:

Transmit buffers are needed to keep the network layer fully utilized, but excessive buffers result in poor latency for all traffic. This latency is frequently bad enough to cause some types of traffic to fail entirely.

<link to more background goes here, including how separate benchmarks for throughput and latency have misled people, "packet loss considered evil", cheaper memory encouraging larger buffers, etc. Include tests like netperf-wrapper and ping latency while under load, etc. Include examples where buffers have resulted in latencies so long that packets are retransmitted before the first copy gets to the destination>

Traditionally, transmit buffers have been sized to handle a fixed number of packets. Due to the variation in packet sizes, it is impossible to tune this value so that it both keeps the link fully utilized when small packets dominate the traffic and keeps the queue small enough to avoid latency problems when large packets dominate the traffic.

Shifting to Byte Queue Limits, where queues are allowed to hold a variable number of packets depending on how large they are, makes it possible to manually tune the transmit buffer size to get good latency under all traffic conditions at a given speed.
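
To put rough numbers on the packet-count problem above, here is a small back-of-the-envelope sketch. The link speed, queue limits, and packet sizes are all made-up values for illustration, not measurements from any real device.

```python
def drain_time_ms(queued_bytes: int, link_bps: float) -> float:
    """Time for a full queue to drain, i.e. the delay seen by the last packet."""
    return queued_bytes * 8 / link_bps * 1000.0

# All values below are hypothetical, chosen only to illustrate the argument.
LINK_BPS = 10_000_000      # a 10 Mbit/s link
PACKET_LIMIT = 1000        # queue sized as a fixed number of packets
SMALL, LARGE = 64, 1500    # packet sizes in bytes
BYTE_LIMIT = 64_000        # queue sized in bytes instead

# The same packet-count limit gives very different worst-case delays.
small_ms = drain_time_ms(PACKET_LIMIT * SMALL, LINK_BPS)
large_ms = drain_time_ms(PACKET_LIMIT * LARGE, LINK_BPS)

# A byte limit bounds the delay regardless of the packet size mix.
byte_ms = drain_time_ms(BYTE_LIMIT, LINK_BPS)

print(f"1000 small packets queued: {small_ms:.1f} ms")   # about 51 ms
print(f"1000 large packets queued: {large_ms:.1f} ms")   # about 1200 ms
print(f"64000 bytes queued:        {byte_ms:.1f} ms")    # about 51 ms
```

With these assumed numbers, a limit of 1000 packets means roughly 51 ms of queue when the packets are small but over a second when they are large, while a byte limit holds the worst case at roughly 51 ms either way.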

However, this step forward revealed two additional problems.

1. Whenever the data rate changes, this value needs to be manually changed (multi-link paths lose a link, noise degrades max throughput on a link, etc).

2. High volume flows (i.e. bulk downloads) can starve other flows (DNS lookups, VoIP, gaming, etc). This happens because space in the queue is on a first-come-first-served basis, so the high-volume traffic fills the queue (at which point it starts to be dropped), but all other traffic that tries to arrive is also dropped.

It turns out that these light flows tend to have a larger effect on the user experience than heavier flows, because things tend to be serialized behind the lighter flows (DNS lookup before doing a large download, retrieving a small HTML page to find what additional resources need to be fetched to display a page), or the user experience is directly affected by light flows (gaming lag, VoIP drops, etc).

Active Queue Management addresses these problems by adapting the amount of data that is buffered to match the data transmission capacity, and prevents high volume flows from starving low-volume flows without the need to implement QoS classifications.

<insert link about how you can't trust QoS tags that are made by other organizations, ways that it can be abused, etc>

This is possible because AQM algorithms don't have to drop the new packet that arrives; the algorithm can decide to drop a packet from one of the heavy flows rather than from one of the lightweight flows.

<insert references to currently favored AQM options here, PIE, fq_codel, cake, ???. Also links to failed approaches>

Turning on AQM on every bottleneck link makes the Internet usable for everyone, no matter what sort of application they are using.

<insert link on how to deal with equipment you can't configure by throttling bandwidth before the bottleneck and/or doing ingress shaping of traffic>

While AQM makes the network usable, there is still additional room for improvement.
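
The point above about not having to drop the arriving packet can be shown with a toy queue. This is only a sketch of the idea and is not how PIE, fq_codel, or cake are actually implemented; packets are represented by nothing more than a flow name, and both function names are made up for the example.

```python
from collections import Counter

def enqueue_tail_drop(queue: list, limit: int, flow: str):
    """First-come-first-served: a full queue rejects whatever arrives next.

    Returns the flow name of the dropped packet, or None if nothing was dropped.
    """
    if len(queue) >= limit:
        return flow                      # the new arrival is the victim
    queue.append(flow)
    return None

def enqueue_drop_heaviest(queue: list, limit: int, flow: str):
    """Toy flow-aware policy: accept the arrival, then, if the queue is over
    its limit, drop the oldest packet of the flow holding the most packets."""
    queue.append(flow)
    if len(queue) <= limit:
        return None
    heaviest = Counter(queue).most_common(1)[0][0]
    queue.remove(heaviest)               # removes the first (oldest) match
    return heaviest

# A bulk download has filled a 4-packet queue when a DNS lookup arrives.
fifo = ["bulk"] * 4
aware = ["bulk"] * 4
print(enqueue_tail_drop(fifo, 4, "dns"))       # dns: the light flow is starved
print(enqueue_drop_heaviest(aware, 4, "dns"))  # bulk: the heavy flow pays
print(aware)                                   # ['bulk', 'bulk', 'bulk', 'dns']
```

In the first case the DNS packet never gets into the queue; in the second it is queued and the bulk flow loses one packet, which is the behaviour I am arguing for.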

While dropping packets does result in the TCP senders slowing down, and eventually stabilizing at around the right speed to keep the link fully utilized, the only way that senders have been able to detect problems is to discover that they have not received an ack for the traffic within the allowed time. This causes a "bubble" in the flow, as the dropped packet must be retransmitted (and sometimes a significant amount of data after the dropped packet that did make it to the destination, but could not be acked because of the missing packet).

This "bubble" in the data flow can be greatly compressed by configuring the AQM algorithm to send an ECN packet to the sender when it drops a packet in a flow. The sender can then adapt faster, slowing down its new data, and re-sending the dropped packet without having to wait for the timeout.

This has two major effects: by allowing the sender to retransmit the packet sooner, the delay on the dropped data is not as long; and because the replacement data can arrive before the timeout of the following packets, they may not need to be re-sent.

By configuring the AQM algorithm to send the ECN notification to the sender only when the packet is being dropped, the effect of failure of the ECN packet to get through to the sender (the notification packet runs into congestion and gets dropped, some network device blocks it, etc) is that the ECN enabled case devolves to match the non-ECN case, in that the sender will still detect the dropped packet via the timeout waiting for the ack, as if ECN was not enabled.

<insert link to possible problems that can happen here, including the potential for an app to 'game' things if packets are marked at a different level than when they are dropped.>

So there is a very strong recommendation to enable Active Queue Management: while the different algorithms have different advantages and levels of testing, even the 'worst' of the set results in a night-and-day improvement for usability compared to unmanaged buffers.
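
The fallback argument above can be written down as a toy model. It models the scheme as I described it here (a notification sent at the point of drop), which is an assumption of this sketch and not a description of any deployed ECN implementation. The function name, the 50 ms round trip, the 1000 ms timeout, and the assumption that a notification reaches the sender in about one round trip are all made up for illustration.

```python
def detection_delay_ms(rtt_ms: float, rto_ms: float,
                       ecn_enabled: bool, notification_delivered: bool) -> float:
    """How long the sender takes to learn that a packet was dropped.

    Assumption: a delivered notification reaches the sender in about one
    round trip; otherwise the sender has to wait for the ack timeout.
    """
    if ecn_enabled and notification_delivered:
        return rtt_ms
    return rto_ms

RTT_MS, RTO_MS = 50.0, 1000.0   # hypothetical values

no_ecn = detection_delay_ms(RTT_MS, RTO_MS, ecn_enabled=False, notification_delivered=False)
ecn_ok = detection_delay_ms(RTT_MS, RTO_MS, ecn_enabled=True, notification_delivered=True)
ecn_lost = detection_delay_ms(RTT_MS, RTO_MS, ecn_enabled=True, notification_delivered=False)

print(no_ecn, ecn_ok, ecn_lost)   # 1000.0 50.0 1000.0
```

The case where the notification is lost comes out the same as the case with no ECN at all, so under these assumptions enabling it can shorten the "bubble" but cannot make it longer.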

Enabling ECN at the same point as dropping packets, as part of enabling any AQM algorithm, results in a noticeable improvement over the base algorithm without ECN. When compared to the baseline, the improvement added by ECN is tiny compared to the improvement from enabling AQM.

Is it fair to say that the plain AQM vs AQM+ECN variation is on the same order of difference as the differences between the different AQM algorithms?

Future research items (which others here may already have done, and would not be part of my 'elevator pitch'):

I believe that currently ECN triggers the exact same slowdown that a missed packet does, and it may be appropriate to have the sender do a less drastic slowdown.

It would be very interesting to provide some way for the application sending the traffic to detect dropped packets and ECN responses. For example, a streaming media source (especially an interactive one like video conferencing) could adjust the bitrate that it's sending.

David Lang