Re: [aqm] IETF88 Fri 08Nov13 - 12:30 Regency B

"Akhtar, Shahid (Shahid)" <shahid.akhtar@alcatel-lucent.com> Thu, 07 November 2013 20:44 UTC

From: "Akhtar, Shahid (Shahid)" <shahid.akhtar@alcatel-lucent.com>
To: "Fred Baker (fred)" <fred@cisco.com>
Date: Thu, 07 Nov 2013 20:44:28 +0000
Message-ID: <6CE3D838-64CB-42C2-B6CB-B07424C8EFB2@alcatel-lucent.com>
References: <012C3117EDDB3C4781FD802A8C27DD4F25E6A85C@SACEXCMBX02-PRD.hq.netapp.com> <C0611F522A6FA9498A044C36073E49657E5FF524@US70UWXCHMBA01.zam.alcatel-lucent.com>, <5B72ED36-A189-4C01-80DB-F6D2F247CDDF@cisco.com>
In-Reply-To: <5B72ED36-A189-4C01-80DB-F6D2F247CDDF@cisco.com>
Cc: Gorry Fairhurst <gorry@erg.abdn.ac.uk>, Richard Scheffenegger <rs@netapp.com>, Wesley Eddy <wes@mti-systems.com>, "aqm@ietf.org" <aqm@ietf.org>, "Naeem Khademi (naeemk@ifi.uio.no)" <naeemk@ifi.uio.no>
Subject: Re: [aqm] IETF88 Fri 08Nov13 - 12:30 Regency B

Fred, All,

I am traveling today, so I should respond by end of day.

- Shahid.

Sent from my iPhone

On Nov 7, 2013, at 10:45 AM, "Fred Baker (fred)" <fred@cisco.com> wrote:

> 
> On Nov 7, 2013, at 8:59 AM, "Akhtar, Shahid (Shahid)" <shahid.akhtar@alcatel-lucent.com> wrote:
> 
>> Hi All,
>> 
>> I had some comments on Fred's document. I have added them as tracked changes in a Word document so they are easy to see. I used the 02 version.
>> 
>> Thanks.
> 
> 
> Permit me to put your comments in email, along with my own views. Also adding my co-author and the other working group chair on the CC line; if he is like me, he receives far too much email, and mail that is explicitly to or copies him bubbles higher in the column.
> 
>>> 4.  Conclusions and Recommendations
>>>  [snip] 
>>>   3.  The algorithms that the IETF recommends SHOULD NOT require
>>>       operational (especially manual) configuration or tuning.
>> 
>> Some tuning may be required or implicitly assumed for virtually all AQMs – please see my comment later.
> 
> That's an opinion. One of the objectives of Van and Kathy's work, and separately of Rong Pan et al.'s work, is to design an algorithm that may have different initial conditions drawn from a table given the interface it finds itself on, but requires no manual tuning. The great failure of RED, recommended in RFC 2309, is not that it doesn't work when properly configured; it's that real humans don't have the time to properly tune it differently for each of the thousands of link endpoints in their networks. There is no point in changing away from RED if that is also true of the replacement.
> 
>>>   7.  Research, engineering, and measurement efforts are needed
>>>       regarding the design of mechanisms to deal with flows that are
>>>       unresponsive to congestion notification or are responsive, but
>>>       are more aggressive than present TCP.
>> 
>>      Do we want to make a suggestion on how to configure buffer sizes with each type of AQM here (e.g. 2xBDP), or simply state that research should be conducted on the best buffer sizes to use with AQM?
> 
> I'm not sure that buffer sizes are specific to AQM algorithms; I'd entertain evidence otherwise. Buffer *thresholds* ("at what point do we start dropping/marking traffic?") may differ between algorithms. Buffer size ("how many bytes/packets do we allow into the queue in the worst case?") is a matter of the characteristics of burst behavior in a given network and the applications it supports. If I have, say, a Map/Reduce application that simultaneously asks thousands of systems a question, the queues in the intervening switches will need to be able to briefly absorb thousands of response packets. The key word here is "briefly". When Van or Kathy talk about "good queue" and "bad queue", they are saying that burst behavior may call for deep queues, but we really want the steady state to achieve 100% utilization with a statistically empty queue if we can possibly achieve that.
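As a rough illustration of the size-versus-threshold distinction above, the sketch below compares a 2xBDP rule of thumb with the worst-case burst from a fan-in query. Every number here is an assumption for illustration, not from the thread: a 10 Gbps link, a 100 microsecond RTT, 2000 responders, one 1500-byte response each, and all responses arriving before the queue drains. The function names are hypothetical too.

```python
def bdp_bytes(link_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product in bytes for a link rate (bit/s) and RTT (s)."""
    return link_bps * rtt_s / 8


def burst_bytes(responders: int, bytes_per_response: int = 1500) -> int:
    """Worst-case bytes queued if every responder's packet arrives at once.

    Assumes one packet per responder and no draining during the burst.
    """
    return responders * bytes_per_response


def burst_fits(buffer_bytes: float, responders: int,
               bytes_per_response: int = 1500) -> bool:
    """True when the whole burst can be absorbed without tail drop."""
    return burst_bytes(responders, bytes_per_response) <= buffer_bytes


if __name__ == "__main__":
    # Assumed data-center numbers, chosen only to make the comparison concrete.
    buffer_2x_bdp = 2 * bdp_bytes(10e9, 100e-6)
    print(f"2xBDP buffer: {buffer_2x_bdp:.0f} bytes")
    print(f"burst from 2000 responders: {burst_bytes(2000)} bytes")
    print(f"burst fits: {burst_fits(buffer_2x_bdp, 2000)}")
```

Under those assumed numbers the burst is about 3 MB against a 250 kB 2xBDP buffer, which is the sense in which buffer size follows burst behavior rather than the choice of AQM algorithm.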
> 
>>> 4.3.  AQM algorithms deployed SHOULD NOT require operational tuning
>>> 
>>>   A number of algorithms have been proposed.  Many require some form of
>>>   tuning or initial condition.  This can make them difficult to use
>>>   operationally.  Hence, self-tuning algorithms are to be preferred.
>>>   The algorithms that the IETF recommends SHOULD NOT require
>>>   operational (especially manual) configuration or tuning.
>> 
>> It may be better to state that tuning should be minimized. For the second sentence: “The algorithms that the IETF recommends should minimize tuning or configuration changes for specific traffic or network conditions”
>> 
>> I would argue that all AQMs will likely require or assume some type of configuration/tuning.
>> 
>> For example, if we take CoDel:
>> 
>> ·       For small thin links, such as 1-10Mbps DSL, the 5ms target would increase packet loss significantly, and at 2Mbps a single MTU time may even exceed the 5ms target.
>> 
>> ·       If the average RTT of all flows going through a link is more than 500ms, e.g. for satellite, then the 100ms interval would prematurely drop packets before the sources have had a chance to reduce their sending rate. Or if the average RTT is very low – e.g. 10ms – such as for flows between data-center elements, then the 100ms interval may be too slow to signal congestion back to the sources, and significant packet loss may have occurred before such signaling.
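To put numbers on the two bullets above, here is a minimal sketch. The 5 ms target and 100 ms interval are the figures quoted in the bullets; the 1500-byte MTU and the function names are assumptions for illustration, not from the thread.

```python
MTU_BYTES = 1500          # assumed Ethernet-style MTU; not stated in the thread
CODEL_TARGET_S = 0.005    # 5 ms target quoted above
CODEL_INTERVAL_S = 0.100  # 100 ms interval quoted above


def serialization_time_s(link_bps: float, packet_bytes: int = MTU_BYTES) -> float:
    """Time to clock one packet onto a link of the given rate (bit/s)."""
    return packet_bytes * 8 / link_bps


def mtu_exceeds_target(link_bps: float) -> bool:
    """True when one MTU-sized packet alone takes longer than the target."""
    return serialization_time_s(link_bps) > CODEL_TARGET_S


def interval_in_rtts(rtt_s: float) -> float:
    """How many round trips fit into one interval for a given RTT (s)."""
    return CODEL_INTERVAL_S / rtt_s


if __name__ == "__main__":
    for mbps in (1, 2, 10):
        t_ms = serialization_time_s(mbps * 1e6) * 1e3
        print(f"{mbps:>2} Mbps: one MTU takes {t_ms:.1f} ms, "
              f"exceeds target: {mtu_exceeds_target(mbps * 1e6)}")
    for rtt_ms in (10, 500):
        print(f"RTT {rtt_ms} ms: interval spans "
              f"{interval_in_rtts(rtt_ms / 1e3):.1f} round trips")
```

With a 1500-byte MTU, one packet takes 6 ms at 2 Mbps, which is already above the 5 ms target. The 100 ms interval is a fifth of a 500 ms round trip and ten round trips at 10 ms.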
> 
> 
> What you describe is what I referred to as "initial conditions derived from the links the algorithm finds itself on". We may be in violent agreement there. If so, the wording I might suggest would be "SHOULD NOT require operational (especially manual) configuration or tuning apart from automated determination of initial conditions" or some such thing.
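One way to picture "automated determination of initial conditions" is a lookup plus a floor, sketched below. This is not CoDel's or PIE's actual procedure. The interface classes, the RTT values in the table (taken from the 10 ms and 500 ms examples above), the 1500-byte MTU, and the rule that keeps the target at 5% of the interval (the ratio of the 5 ms and 100 ms defaults) but never below one MTU time are all hypothetical choices for illustration.

```python
from dataclasses import dataclass

MTU_BYTES = 1500             # assumed MTU; not stated in the thread
DEFAULT_INTERVAL_S = 0.100   # 100 ms default interval quoted in the thread
TARGET_FRACTION = 0.05       # keeps the 5 ms : 100 ms ratio of the defaults

# Hypothetical table: interval to assume for a given kind of interface.
ASSUMED_INTERVAL_S = {
    "datacenter": 0.010,  # 10 ms RTT example from the thread
    "satellite": 0.500,   # 500 ms RTT example from the thread
}


@dataclass(frozen=True)
class InitialConditions:
    target_s: float
    interval_s: float


def initial_conditions(link_bps: float,
                       interface_class: str = "default") -> InitialConditions:
    """Pick starting parameters from the link itself, with no operator input."""
    interval_s = ASSUMED_INTERVAL_S.get(interface_class, DEFAULT_INTERVAL_S)
    mtu_time_s = MTU_BYTES * 8 / link_bps
    # Never set the target below the time needed to send a single packet.
    target_s = max(TARGET_FRACTION * interval_s, mtu_time_s)
    return InitialConditions(target_s=target_s, interval_s=interval_s)


if __name__ == "__main__":
    print(initial_conditions(2e6))                 # thin DSL-like link
    print(initial_conditions(10e9, "datacenter"))  # fast, short-RTT link
    print(initial_conditions(10e6, "satellite"))   # long-RTT link
```

The point of the sketch is only that the inputs (link rate, interface class) are things the device can discover for itself, so nothing here needs per-link manual tuning.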
> 
>>> 4.7.  The need for further research
>>> 
>>>   The second recommendation of [RFC2309] called for further research in
>>>   the interaction between network queues and host applications, and the
>>>   means of signaling between them.  This research has occurred, and we
>>>   as a community have learned a lot.  However, we are not done.
>>> 
>>>   We have learned that the problems of congestion, latency and buffer-
>>>   sizing have not gone away, and are becoming more important to many
>>>   users.  A number of self-tuning AQM algorithms have been found that
>>>   offer significant advantages for deployed networks.  There is also
>>>   renewed interest in deploying AQM and the potential of ECN.
>>> 
>>>   An obvious example of further research in 2013 is the need to
>>>   consider the use of Map/Reduce applications in data centers; do we
>>>   need to extend our taxonomy of TCP/SCTP sessions to include not only
>>>   "mice" and "elephants", but "lemmings"?  "Lemmings" are flash crowds
>>>   of "mice" that the network inadvertently tries to signal to as if
>>>   they were elephant flows, resulting in head of line blocking in data
>>>   center applications.
>> 
>> Such research should also focus on improving end user QoE from AQMs rather than network related metrics only. Often a significant change in a network metric may only make a minimal change in end-user QoE and thus the value of such change may be minimal.
> 
> 
> There is also a question of what user is under discussion. If you take a look at http://www.ietf.org/proceedings/88/slides/slides-88-aqm-0.pptx and specifically the third slide, you will see (and tomorrow afternoon I will discuss) a capture I took of pings from my home network to another site overnight. In the evening, we watched a movie (home network), and in the morning I had a video conference (office network). I'll tell you right now that both worked fine, and managing the delay to zero or allowing it to be one hundred ms would not have materially affected the QoE of either. However, my ping is a competing application, and it saw a sustained increase in delay, variation in delay, and the possibility of loss as a result of the queuing. The value of AQM is in part to the application being throttled, but in large part to competing applications, and the QoE of both must be considered.
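For anyone wanting to look at a similar ping capture from the competing application's side, the three quantities mentioned above (delay, variation in delay, loss) can be summarized as in the sketch below. The function name, the use of None for a lost probe, and the sample values are made up for illustration; they are not the data from the slides.

```python
from statistics import mean, pstdev
from typing import Optional, Sequence


def summarize_pings(rtts_ms: Sequence[Optional[float]]) -> dict:
    """Summarize ping round-trip times in ms; None marks a lost probe."""
    received = [r for r in rtts_ms if r is not None]
    loss = 1 - len(received) / len(rtts_ms) if rtts_ms else 0.0
    if not received:
        return {"mean_ms": None, "variation_ms": None,
                "max_ms": None, "loss_fraction": loss}
    return {
        "mean_ms": mean(received),
        "variation_ms": pstdev(received),  # population standard deviation
        "max_ms": max(received),
        "loss_fraction": loss,
    }


if __name__ == "__main__":
    # Illustrative samples only: an idle stretch, then a stretch with a
    # standing queue and one lost probe.
    idle = [20.0, 21.0, 20.5, 20.0]
    busy = [95.0, 120.0, None, 110.0]
    print("idle:", summarize_pings(idle))
    print("busy:", summarize_pings(busy))
```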
> 
>>>   Examples of other required research include:
>>> 
>>>   o  Research into new AQM and scheduling algorithms.
>>> 
>>>   o  Research into the use of and deployment of ECN alongside AQM.
>>> 
>>>   o  Tools for enabling AQM (and ECN) deployment and measuring the
>>>      performance.
>>> 
>>>   o  Methods for mitigating the impact of non-conformant and malicious
>>>      flows.
>> 
>> Methods or configurations that leverage deployed AQMs such as RED/WRED to reduce delays and lockout for typical traffic which require minimal effort or tuning from the operator.
> 
> Not a complete sentence, but I think I understand what you're getting at. You would like to have research determine how to easily configure existing systems using the tools at hand. I'm all for it in the near term.