Re: [aqm] Gen-art LC review of draft-ietf-aqm-recommendation-08

"Fred Baker (fred)" <fred@cisco.com> Fri, 09 January 2015 21:30 UTC

From: "Fred Baker (fred)" <fred@cisco.com>
To: "gorry@erg.abdn.ac.uk" <gorry@erg.abdn.ac.uk>
Cc: "draft-ietf-aqm-recommendation.all@tools.ietf.org" <draft-ietf-aqm-recommendation.all@tools.ietf.org>, Elwyn Davies <elwynd@dial.pipex.com>, General area reviewing team <gen-art@ietf.org>, "aqm@ietf.org" <aqm@ietf.org>
Date: Fri, 09 Jan 2015 21:30:04 +0000
Subject: Re: [aqm] Gen-art LC review of draft-ietf-aqm-recommendation-08
Message-ID: <199CFC64-D18D-4D54-8C8D-5ADA9AAEB2C3@cisco.com>
In-Reply-To: <9195362aada3785a2e3a014ca028a196.squirrel@spey.erg.abdn.ac.uk>
Archived-At: <http://mailarchive.ietf.org/arch/msg/aqm/b5M4FwBF9zEstLNQC4k1f8L3-Xg>

Your suggestions are all fine with me. When it comes to you not knowing what to change, I don’t know either. That ball’s in Elwyn’s court.

> On Jan 8, 2015, at 1:40 AM, gorry@erg.abdn.ac.uk wrote:
> 
> I have made some suggested changes in-line below, please see if these
> may resolve these issues.
> 
> Gorry
> 
>> 
>>> On Jan 7, 2015, at 3:40 PM, Elwyn Davies <elwynd@dial.pipex.com> wrote:
>>> 
>>> (Copied to aqm mailing list as suggested by WG chair).
>>> Hi.
>>> 
>>> Thanks for your responses.  Just a reminder... I am not (these days,
>>> anyway) an expert in router queue management, so my comments should not
>>> be seen as a deep critique of the individual items, but as things that
>>> come to mind as matters of general control engineering, and as areas
>>> where I feel the language needs clarification - that's what gen-art is
>>> for.
>>> 
>>> As a matter of interest it might be useful to explain a bit what scale
>>> of routing engine you are thinking about in this paper.  This is because
>>> I got a feeling from your responses to the buffer bloat question that
>>> you are primarily thinking big iron here.  The buffer bloat phenomenon
>>> has tended to be in smaller boxes where the AQM stuff may or may not be
>>> applicable.
>> 
>> If Dave Taht replies, he'll mention OpenWRT. Personally, I am thinking
>> about ISP interconnect points and Access equipment such as BRAS and CMTS,
>> which are indeed big iron, but also small CPEs such as might implement
>> Homenet technologies and OpenWRT.
>> 
> GF: I'm not sure this helps in the BCP - but it has to be considered when
> you define algorithms for specific methods - hence the recommendation for
> wide applicability. I tend to think the details need to go into any AQM
> evaluation guidelines or the specs themselves.
> ---
> 
>>> I don't quite know what your target is here - or if you are thinking
>>> over the whole range of sizes.  The responses below clearly indicate
>>> that you have some examples in mind (Codel, for example which I know
>>> nothing about except (now) that it is an AQM WG product) and I don't
>>> know what scale of equipment these are really relevant to.
>>> 
>>> Some more responses in line.
>>> 
>>> Regards,
>>> Elwyn
>>> 
>>> On 05/01/15 20:32, Fred Baker (fred) wrote:
>>>> 
>>>>> On Jan 5, 2015, at 1:13 AM, gorry@erg.abdn.ac.uk wrote:
>>>>> 
>>>>> Fred, I've applied the minor edits.
>>>>> 
>>>>> I have questions to you on the comments below (see GF:) before I
>>>>> proceed.
>>>>> 
>>>>> Gorry
>>>> 
>>>> Adding Elwyn, as the discussion of his comments should include him -
>>>> he might be able to clarify his concerns. I started last night to
>>>> write a note, which I will now discard and instead comment here.
>>>> 
>>>>>> I am the assigned Gen-ART reviewer for this draft. For background
>>>>>> on Gen-ART, please see the FAQ at
>>>>>> 
>>>>>> <http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>.
>>>>>> 
>>>>>> Please resolve these comments along with any other Last Call
>>>>>> comments you may receive.
>>>>>> 
>>>>>> Document: draft-ietf-aqm-recommendation-08.txt Reviewer: Elwyn
>>>>>> Davies Review Date: 2014/12/19 IETF LC End Date: 2014/12/24 IESG
>>>>>> Telechat date: (if known) -
>>>>>> 
>>>>>> Summary:  Almost ready for BCP.
>>>>>> 
>>>>>> Possibly missing issues:
>>>>>> 
>>>>>> Buffer bloat:  The suggestions/discussions are pretty much all
>>>>>> about keeping buffer size sufficiently large to avoid burst
>>>>>> dropping.  It seems to me that it might be good to mention the
>>>>>> possibility that one can over provision queues, and this needs to
>>>>>> be avoided as well as under provisioning.
>>>>>> 
>>>>> GF: I am not sure - to me this depends on the use case.
>>>> 
>>>> To me, this is lily-gilding. To pick one example, the Cisco ASR 8X10G
>>>> line card comes standard from the factory with 200 ms of queue per
>>>> 10G interface. If we were to implement Codel on it, Codel would try
>>>> desperately to keep the average induced latency less than 5 ms. If
>>>> it tried to make it 100 microseconds, we would run into the issues
>>>> the draft talks about - we're trying to maximize rate while
>>>> minimizing mean latency, and due to TCP's dynamics, we would no
>>>> longer maximize rate. If 5 ms is a reasonable number (and for
>>>> intra-continental terrestrial delays I would think it is), and we set
>>>> that variable to 10, 50, or 100 ms, the only harm would be that we
>>>> had some probability of a higher mean induced latency than was really
>>>> necessary - AQM would be a little less effective. In the worst case
>>>> (suppose we set Codel's limit to 200 ms), it would revert to tail
>>>> drop, which is what we already have.
>>>> 
>>>> There are two reasonable responses to this. One would be to note that
>>>> in high RTT cases, even if auto-tuning mostly works, manual tuning may
>>>> deliver better results or settle on correct values more quickly (on a
>>>> 650 ms RTT satcom link, I'd start by changing Codel's 100 ms trigger
>>>> to something in the neighborhood of 650 ms). The other is to simply
>>>> say that there is no direct harm in increasing the limits, and there
>>>> may be value in doing so in some use cases. But I would also tend to
>>>> think that anyone who actually operates a network already has a pretty
>>>> good handle on that fact. So I don't see the value in saying it -
>>>> which is mostly why it's not there already.
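
For illustration, here is a minimal Python sketch of the Codel behaviour
discussed above. The 5 ms target and 100 ms interval are Codel's published
defaults; the structure is a simplification for readers who don't know the
algorithm, not the reference implementation (real Codel adds a control law
that spaces successive drops):

    TARGET = 0.005    # 5 ms: acceptable standing-queue delay
    INTERVAL = 0.100  # 100 ms: delay must stay above TARGET this long

    class CodelSketch:
        def __init__(self):
            self.first_above_time = None  # when delay first crossed TARGET

        def should_drop(self, sojourn_time, now):
            """sojourn_time: how long this packet waited in the queue."""
            if sojourn_time < TARGET:
                self.first_above_time = None   # queue drained; reset
                return False
            if self.first_above_time is None:
                self.first_above_time = now + INTERVAL
                return False
            # Delay has stayed above TARGET for a full INTERVAL: drop
            # (or ECN-mark) to signal the sender to back off.
            return now >= self.first_above_time
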
>> 
>>> My take on this would be "make as few assumptions about your audience as
>>> possible, and write them down".  It's a generally interesting topic and
>>> would interest people who are not deeply skilled in the art - as well as
>>> potentially pulling in some new researchers!
>> 
>> I'm still not entirely sure what you'd like to have said. Is this a one
>> sentence "setting limits looser than necessary is neither harmful nor
>> helpful", or a treatise on the topic?
>> 
> GF:
> The text block I think this relates to is:
>     "It is expected that the principles and guidance
>      are also applicable to a wide range of environments, but may require
>      tuning for specific types of link/network (e.g. to accommodate the
>      traffic patterns found in data centres, the challenges of wireless
>      infrastructure, or the higher delay encountered on satellite Internet
>      links). "
> 
> - We could, for instance, add one or two more explicit examples here,
> based on the text above... do we need examples as well?
> ----
>>>>>> Interaction between boxes using different or the same algorithms:
>>>>>> Buffer bloat seems to be generally about situations where chains
>>>>>> of boxes all have too much buffer.  One thing that is not
>>>>>> currently mentioned is the possibility that if different AQM
>>>>>> schemes are implemented in various boxes through which a flow
>>>>>> passes, then there could be inappropriate interaction between the
>>>>>> different algorithms.  The old RFC suggested RED and nothing else,
>>>>>> so one only had to make sure that multiple RED boxes in
>>>>>> series didn't do anything bad.  With potentially different
>>>>>> algorithms in series, one had better be sure that the mechanisms
>>>>>> don't interact in a bad way when chained together - another
>>>>>> research topic, I think.
>>>>> 
>>>>> GF: I think this could be added as an area for continued research
>>>>> mentioned in section 4.7. At least I know of some poor
>>>>> interactions between PIE and CoDel on particular paths - where both
>>>>> algorithms are triggered. However, I doubt this is worth much
>>>>> discussion in this document - thoughts?
>>>>> 
>>>>> Suggest: "The Internet presents a wide variety of paths where
>>>>> traffic can experience combinations of mechanisms that can
>>>>> potentially interact to influence the performance of applications.
>>>>> Research therefore needs to consider the interactions between
>>>>> different AQM algorithms, patterns of interaction in network
>>>>> traffic and other network mechanisms to ensure that multiple
>>>>> mechanisms do not inadvertently interact to impact performance."
>>>> 
>>>> Mentioning it as a possible research area makes sense. Your proposed
>>>> text is fine, from my perspective.
>>>> 
>>> Yes. I think something like this would be good.   The buffer bloat
>>> example is probably an extreme case of things not having AQM at all and
>>> interacting badly.  It would maybe be worth mentioning that any AQM
>>> mechanism has also got to work in series with boxes that don't have any
>>> active AQM - just tail drop. Ultimately, I would say this is just a
>>> matter of control engineering principles:  You are potentially making a
>>> network in which various control algorithms are implemented on different
>>> legs/nodes and the combination of transfer functions could possibly be
>>> unstable.  Has anybody applied any of the raft of control theoretic
>>> methods to these algorithms?  I have no idea!
>> 
>> Well, PIE basically came out of control theory (a basic equation that
>> describes a phase-locked loop), and I believe that Van and Kathy will say
>> something similar about Codel. But that's not a question for this paper,
>> it's a question for the various algorithmic papers.
>> 
> GF: I wonder if this is something that needs to go into the specific
> algorithm specs, rather than the BCP on the topic?
> ----
>>>> I start by questioning the underlying assumption, though, which is
>>>> that bufferbloat is about paths in which there are multiple
>>>> simultaneous bottlenecks. Yes, that occurs (think about paths that
>>>> include both Cogent and a busy BRAS or CMTS, or more generally, if
>>>> any link has some probability of congesting, my sophomore
>>>> statistics course maintained that any pair of links is simultaneously
>>>> congested with the product of the two probabilities), but I'd
>>>> be hard-pressed to make a statistically compelling argument out of
>>>> it. The research and practice I have seen has been about a single
>>>> bottleneck.
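
As a worked version of that statistics argument (the congestion
probabilities here are invented purely for illustration):

    # If two independent links are each congested with some probability,
    # both are congested at once with the product of the probabilities.
    p_link_a = 0.05               # link A congested 5% of the time
    p_link_b = 0.02               # link B congested 2% of the time
    p_both = p_link_a * p_link_b  # 0.001, i.e. 0.1% of the time
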
>>> Please don't fixate on buffer bloat!
>> 
>> ?
>> 
>> AQM is absolutely fixated on buffer bloat. We have called it a lot of
>> things over the years, none of them very pleasant, but the fundamental
>> issue in RFC 2309 and V+K's RED work was maximizing throughput while
>> minimizing queue occupancy.
>> 
>>>>>> Minor issues: s3, para after end of bullet 3:
>>>>>>> The projected increase in the fraction of total Internet
>>>>>>> traffic for more aggressive flows in classes 2 and 3 could pose
>>>>>>> a threat to the performance of the future Internet.  There is
>>>>>>> therefore an urgent need for measurements of current conditions
>>>>>>> and for further research into the ways of managing such flows.
>>>>>>> This raises many difficult issues in finding methods with an
>>>>>>> acceptable overhead cost that can identify and isolate
>>>>>>> unresponsive flows or flows that are less responsive than TCP.
>>>>>> 
>>>>>> Question: Is there actually any published research into how one
>>>>>> would identify class 2 or class 3 traffic in a router/middle box?
>>>>>> If so it would be worth noting - the text's call for "further
>>>>>> research" seems to indicate there is something out there.
>>>>>> 
>>>>> GF: I think the text is OK.
>>>> 
>>>> Agreed. Elwyn's objection appears to be to the use of the word
>>>> "further"; if we don't know of a paper, he'd like us to call for
>>>> "research". The papers that come quickly to my mind are various
>>>> papers on non-responsive flows, such as
>>>> http://www.icir.org/floyd/papers/collapse.may99.pdf or
>>>> http://www2.research.att.com/~jiawang/sstp08-camera/SSTP08_Pan.pdf.
>>>> We already have a pretty extensive bibliography...
>>> 
>>> Right either remove/alter "further" if there isn't anything already out
>>> there or put in some reference(s).
>> 
>> OK, Gorry. You have two papers there to refer to. There are more, but two
>> should cover it.
>> 
> GF: I think the papers provide background, but I'm not sure they capture
> the possibilities well - I suspect there is much more out there in
> products, patents and labs... The words "further research", I think, need
> to be read as "encouraging" research.
> 
> I suggest it may be best simply to remove /further/ in
> this specific sentence.
> ----
>>>>>> s4.2, next to last para: Is it worth saying also that the
>>>>>> randomness should avoid targeting a single flow within a
>>>>>> reasonable period, to give a degree of fairness?
>> 
>> The text from the spec is:
>> 
>>     Network devices SHOULD use an AQM algorithm to determine the packets
>>     that are marked or discarded due to congestion.  Procedures for
>>     dropping or marking packets within the network need to avoid
>>     increasing synchronization events, and hence randomness SHOULD be
>>     introduced in the algorithms that generate these congestion signals
>>     to the endpoints.
>> 
>>>>> GF: Thoughts?
>>>> 
>>>> I worry. The reasons for the randomness are (1) to tend to hit
>>>> different sessions, and (2) when the same session is hit, to minimize
>>>> the probability of multiple hits in the same RTT. It might be worth
>>>> saying as much. However, to *stipulate* that algorithms should limit
>>>> the hit rate on a given flow invites a discussion of stateful
>>>> inspection algorithms. If someone wants to do such a thing, I'm not
>>>> going to try to stop them (you could describe fq_* in those terms),
>>>> but I don't want to put the idea into their heads (see later comment
>>>> on privacy). Also, that is frankly more of a concern with Reno than
>>>> with NewReno, and with NewReno than with anything that uses SACK.
>>>> SACK will (usually) retransmit all dropped segments in the subsequent
>>>> RTT, while NewReno will retransmit the Nth dropped packet in the Nth
>>>> following RTT, and Reno might take that many RTO timeouts.
>>> 
>>> You have thought about what I said.  Put in what you think it needs.
>> 
>> Well, I didn't think it needed a lot of saying. :-) How about this?
>> 
>> Network devices SHOULD use an AQM algorithm to measure local congestion
>> and determine which packets to mark or drop to manage congestion. In
>> general, dropping or marking multiple packets from the same session in
>> the same RTT is ineffective, and can have negative consequences. Also, dropping
>> or marking packets from multiple sessions simultaneously has the effect of
>> synchronizing them, meaning that subsequent peaks and troughs in traffic
>> load are exacerbated. Hence, AQM algorithms should randomize dropping and
>> marking in time, to desynchronize sessions and improve overall algorithmic
>> effectiveness.
>> 
> I agree on drop, but not mark, because multiple CE marks in the same RTT
> could be useful with ECN - e.g. DCTCP - so I'd rather not prejudge what
> may happen in this space as the WG starts to work on new ECN methods.
> 
> I suggest:
> 
> "Network devices SHOULD use an AQM algorithm to measure local congestion
> and determine which packets to mark or drop to manage congestion.
> 
> In general, dropping multiple packets from the same session in the same
> RTT is ineffective, and can reduce throughput. Also, dropping or marking
> packets from multiple sessions simultaneously can have the effect of
> synchronizing them, resulting in increased peaks and troughs in the
> subsequent traffic load. Hence, AQM algorithms SHOULD randomize dropping
> and marking in time, to desynchronize sessions and improve overall
> algorithmic effectiveness."
> 
> - Is this useful?
> 
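
To picture the suggested text, here is a sketch of a congestion signal
that is randomized in time; the thresholds are invented and no particular
published algorithm is implied:

    import random

    def should_signal(queue_delay, target=0.005, ceiling=0.050):
        """Randomized drop/mark decision: probability ramps with delay."""
        if queue_delay <= target:
            return False
        # The probability rises from 0 at `target` to 1 at `ceiling`,
        # so sessions are hit at scattered moments rather than all at
        # once - the desynchronization the text above asks for.
        p = min(1.0, (queue_delay - target) / (ceiling - target))
        return random.random() < p
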
> ----
>>>>>> s4.2.1, next to last para:
>>>>>>> An AQM algorithm that supports ECN needs to define the
>>>>>>> threshold and algorithm for ECN-marking.  This threshold MAY
>>>>>>> differ from that used for dropping packets that are not marked
>>>>>>> as ECN-capable, and SHOULD be configurable.
>>>>>>> 
>>>>>> Is this suggestion really compatible with recommendation 3 and
>>>>>> s4.3 (no tuning)?
>>>>>> 
>>>>> GF: I think making a recommendation here is beyond the "BCP"
>>>>> experience, although I suspect that a lower marking threshold is
>>>>> generally good. Should we add it also to the research agenda as an
>>>>> item at the end of para 3 in S4.7.?
>>> 
>>> I think you may have misunderstood what I am saying here.  Rec 3 and
>>> s4.3 say things should work without tuning.  Doesn't having to set these
>>> thresholds/algorithms constitute tuning?  If so then it makes it
>>> difficult to see these ECN schemes as meeting the constraints.  If you
>>> disagree then explain how it isn't - or suggest  that there should be
>>> research to see how to make ECN zero config as well.
>> 
>> Well, I think you may have misunderstood the statement in the draft. The
>> big problem with RED, mandated in RFC 2309, was that it couldn't be
>> deployed without case by case tuning. We're trying to minimize that.
>> 
>> Recommendation 3 is:
>> 
>>   3.  The algorithms that the IETF recommends SHOULD NOT require
>>       operational (especially manual) configuration or tuning.
>> 
>> The title of section 4.3, which is the section explaining recommendation
>> 3, is "AQM algorithms deployed SHOULD NOT require operational tuning".
>> That's not "MUST NOT"; if there is a case, there is a case. It goes on to
>> say:
>> 
>>   o  SHOULD NOT require tuning of initial or configuration parameters.
>>      An algorithm needs to provide a default behaviour that auto-tunes
>>      to a reasonable performance for typical network operational
>>      conditions.  This is expected to ease deployment and operation.
>>      Initial conditions, such as the interface rate and MTU size or
>>      other values derived from these, MAY be required by an AQM
>>      algorithm.
>> 
>>   o  MAY support further manual tuning that could improve performance
>>      in a specific deployed network.
>> 
>> We're looking for "reasonable performance in the general case", and
>> allowing for the use of knobs to adjust that in edge cases.
>> 
>> Let me give you a case in point. Codel and PIE both work quite well
>> without tuning in intra-continental applications, which is to say use
>> cases in which RTT is on the order of tens of milliseconds. The larger the
>> RTT is, the longer it takes to adjust, but it gets there.
>> 
>> One problem we specifically saw with both algorithms as delays got larger
>> - especially satcom - is most easily explained using Codel. Codel starts a
>> season of marking/dropping when an interface has been continually
>> transmitting for 100 ms, meaning that the queue never emptied in 100 ms.
>> At that point, any packet that has been delayed for more than 5 ms has
>> some probability of being marked or dropped. TCP, in slow start, works
>> pretty hard to keep the buffer full - as full as it can make it. So now
>> I'm moving a large file (iperf, but you get the idea), and manage to fill
>> the queue with 50 ms of data and hold it there for 50 ms, with the
>> probable effect of dropping a packet near the tail of that burst. Now,
>> assume that I am on a geosynchronous satellite, so one way delay is on the
>> order of 325 ms. Codel, dropping a packet 1/3 of the way into getting that
>> going, had a horrible time even filling the link. PIE has the same issue,
>> but the underlying mechanism is different. If I'm on that kind of link,
>> I'm going to either turn AQM off or look for a way to tune it.
>> 
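
The back-of-envelope arithmetic behind that satcom example, with all
figures taken from the paragraph above (this is not a model of either
algorithm):

    one_way_delay = 0.325        # geosynchronous hop, in seconds
    rtt = 2 * one_way_delay      # ~0.65 s round trip
    codel_interval = 0.100       # Codel's default trigger, in seconds
    intervals_per_rtt = rtt / codel_interval  # 6.5: Codel can start
    # dropping long before slow start has completed even one round
    # trip - hence the instinct to raise the trigger toward ~650 ms.
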
> GF: I don't know what to change.
> ----
>>>> I can see adding it to the research agenda; the comment comes from
>>>> Bob Briscoe's research.
>>>> 
>>>> That said, any algorithm using any mechanism by definition needs to
>>>> specify any variables it uses - Codel, for example, tries to keep a
>>>> queue at 5 ms or less, and cuts in after a queue fails to empty for a
>>>> period of 100 ms. I don't see a good argument for saying "but an
>>>> ECN-based algorithm doesn't need to define its thresholds or
>>>> algorithms". Also, as I recall, the MAY in the text came from the
>>>> fact that Bob seemed to think there was value in it (which BTW I
>>>> agree with). To my mind, SHOULD and MUST are strong words, but absent
>>>> such an assertion, an implementation MAY do just about anything that
>>>> comes to the implementor's mind. So saying an implementation MAY <do
>>>> something> is mostly a suggestion that an implementor SHOULD think
>>>> about it. Are we to say that an implementor, given Bob's research,
>>>> should NOT think about giving folks the option?
>>>> 
>>>> I also don't think Elwyn's argument quite follows. When I say that an
>>>> algorithm should auto-tune, I'm not saying that it should not have
>>>> knobs; I'm saying that the default values of those knobs should be
>>>> adequate for the vast majority of use cases. I'm also not saying that
>>>> there should be exactly one initial default; I could easily imagine
>>>> an implementation noting the bit rate of an interface and the ping
>>>> RTT to a peer and pulling its initial configuration out of a table.
>> 
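
A hypothetical sketch of that table-lookup idea; every value below is
invented for illustration, and nothing in the draft specifies such a
table:

    # Pick initial AQM parameters from a measured path RTT.
    INITIAL_PARAMS = [
        # (max_rtt_s, target_s, interval_s)
        (0.050, 0.005, 0.100),   # intra-continental terrestrial
        (0.250, 0.015, 0.300),   # inter-continental
        (1.000, 0.050, 0.700),   # satcom-class paths
    ]

    def initial_config(measured_rtt):
        for max_rtt, target, interval in INITIAL_PARAMS:
            if measured_rtt <= max_rtt:
                return target, interval
        return INITIAL_PARAMS[-1][1], INITIAL_PARAMS[-1][2]
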
>>> That would be at least partially acceptable as a mode of operation.  But
>>> you might have a "warm-up" issue - would it work OK while the algorithm
>>> was working out what the RTT actually was?  And would the algorithms
>>> adapt autonomously (i.e., auto-tune) to close in on optimum values after
>>> picking initial values from the table?
>> 
>> Again, we have quite a bit of testing experience there, and the places the
>> "warm-up" issue comes in are also the cases where RTT is large. For
>> reasons 100% unrelated to this (it was a comment on ISOC's so-called
>> Measurement Project), yesterday I sent 10 pings, using IPv4 and IPv6 if
>> possible, to the top 1000 sites in the Alexa list. Why 301 of them didn't
>> respond, I don't know; I suspect it has something to do with the name in
>> the Alexa list, like dont-go-here.com. But:
>> 
>> 699 reachable by IPv4
>> median average RTT(IPv4): 0.064000
>> average loss (IPv4): 0.486409
>> average minimum RTT(IPv4): 111.923004
>> average RTT(IPv4): 116.361664
>> average maximum  RTT(IPv4): 124.621014
>> average standard deviation(IPv4): 4.043735
>> 
>> 118 reachable by IPv6
>> median average RTT(IPv6): 28.748000
>> average loss (IPv6): 0.000000
>> average minimum RTT(IPv6): 67.267008
>> average RTT(IPv6): 69.580873
>> average maximum  RTT(IPv6): 74.213856
>> average standard deviation(IPv6): 2.133907
>> 
>> On a 100 ms timescale, the algorithms we're discussing get themselves
>> sorted out.
>> 
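
For concreteness, a sketch of how figures like those might be summarised
from per-host ping results; the field names are assumptions, not the
script actually used:

    import statistics

    def summarise(results):
        """results: list of dicts with keys avg, min, max, loss, stdev."""
        return {
            "median average RTT":  statistics.median(r["avg"] for r in results),
            "average loss":        statistics.mean(r["loss"] for r in results),
            "average minimum RTT": statistics.mean(r["min"] for r in results),
            "average RTT":         statistics.mean(r["avg"] for r in results),
            "average maximum RTT": statistics.mean(r["max"] for r in results),
        }
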
>>>>>> s7:  There is an arguable privacy concern that if schemes are
>>>>>> able to identify class 2 or class 3 flows, then a core device can
>>>>>> extract privacy related info from the identified flows.
>>>>>> 
>>>>> GF: I don't see how traffic profiles expose privacy concerns; sure,
>>>>> users and apps can be characterised by patterns of interaction -
>>>>> but this isn't what is being talked about here.
>>>> 
>>>> Agreed. If the reference is to RFC 6973, I don't see a violation of
>>>> https://tools.ietf.org/html/rfc6973#section-7. I would if we appeared
>>>> to be inviting stateful inspection algorithms. To give an example of
>>>> how difficult sessions are managed, RFC 6057 uses the CTS message in
>>>> round-robin fashion to push back on top-talker users in order to
>>>> enable the service provider to give consistent service to all of his
>>>> subscribers when a few are behaving in a manner that might prevent
>>>> him from doing so. Note that the "session", in that case, is not a
>>>> single TCP session, but a bittorrent-or-whatever server engaged in
>>>> sessions to tens or hundreds of peers. The fact that a few users
>>>> receive some pushback doesn't reveal the identities of those users.
>>>> I'd need to hear the substance behind Elwyn's concern before I could
>>>> write anything.
>>> 
>>> My reaction was that if your algorithm identifies flows then you have
>>> [big chunk of text that I think was there by mistake dropped out]
>>> potentially helped a bad actor to pick off such flows or get to know who
>>> is communicating in a situation that currently it would be very
>>> difficult to know as the queueing is basically flow agnostic.  OK this
>>> fairly way out but we have seen some pretty serious stuff apparently
>>> being done around core routers according to Snowden et al.
>> 
>> Again, the algorithms I'm aware of are not sensitive to the user or
>> his/her address; they are sensitive to his/her behavior. But again, unless
>> we're talking about fq_*, we're not identifying flows, and with fq_* we're
>> only identifying them in the sense of WFQ. If, to pick something out of
>> the air, PIE is dropping every 100th packet with some probability, and the
>> traffic has 1000 "mouse" flows competing with a single "elephant", I have
>> a 1000:1 probability of hitting the elephant. That's not because I don't
>> like the user or have singled him out in some way; it's because he's
>> sending the vast majority of the traffic. So I don't see the issue you're
>> driving at.
>> 
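
The proportionality argument in one worked example (the traffic shares
are invented):

    # A blind random drop hits a flow in proportion to its share of the
    # traffic, not its identity - no per-user state is involved.
    elephant_share = 0.9          # one bulk flow sending 90% of packets
    mice_share = 0.1              # 1000 short flows sharing the rest
    p_hit_elephant = elephant_share        # 0.9
    p_hit_one_mouse = mice_share / 1000    # 0.0001: ~9000x less likely
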
> GF: I don't know what to change (if any).
> ---
>>>>> s4.7, para 3:
>>>>>> the use of Map/Reduce applications in data centers
>>>>> I think this needs a reference or a brief explanation.
>>>>> GF: Fred do you know a reference or can suggest extra text?
>>>> 
>>>> The concern has to do with incast, which is a pretty active research
>>>> area (http://lmgtfy.com/?q=research+incast). The paragraph asks a
>>>> question, which is whether the common taxonomy of network flows (mice
>>>> vs elephants) needs to be extended to include references to herds of
>>>> mice traveling together, with the result that congestion control
>>>> algorithms designed under the assumption that a heavy data flow
>>>> contains an elephant merely introduce head-of-line blocking in short
>>>> flows. The word "lemmings" is mine.
>>>> 
>>>> I know of at least four papers (Microsoft Research, CAIA, Tsinghua,
>>>> and KAIST) submitted to various journals in 2014 on the topic. It's
>>>> also, at least in part, the basis for the DCLC RG. The only ones we
>>>> could reference, among those, would relate to DCTCP, as the rest have
>>>> not yet been published.
>>>> 
>>>> Again, I'd like to understand the underlying issue. I doubt that it
>>>> is that Elwyn doesn't like the question as such. Is it that he's
>>>> looking for the word "incast" to replace "map/reduce"?
>>> 
>>> I was just looking for somebody to define the jargon - As far as I am
>>> concerned at this moment "incast" would be just as "bad" since it would
>>> produce an equally blank stare followed by a grab for Google.
>> 
>> If a researcher had that reaction, his or her research wasn't particularly
>> relevant to data center operations. People working with Hadoop or other
>> such applications are familiar with this in detail.
>> 
>> Imagine that I walked into Times Square on the evening of 31 December, and
>> using a bull-horn asked everyone present to simultaneously lift their
>> bull-horns and tell me how they were feeling. Imagine that they did. You
>> now know what incast is and why it's a problem. That's what map/reduce
>> applications do - they simultaneously open or use sessions to thousands of
>> neighboring computers, expecting a reply from each.
>> 
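
A toy sketch of that many-to-one pattern; the names and worker count are
invented, and a real map/reduce job would use RPCs rather than threads:

    import concurrent.futures

    def query_worker(worker):
        # Stands in for an RPC; the point is only that every reply is
        # generated at the same moment and converges on one link.
        return f"partial result from {worker}"

    def scatter_gather(workers):
        with concurrent.futures.ThreadPoolExecutor(max_workers=64) as pool:
            return list(pool.map(query_worker, workers))

    # scatter_gather([f"worker-{i}" for i in range(1000)]) yields 1000
    # near-simultaneous replies aimed at the aggregator's access link.
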
>> But fine. You mentioned Google, so I asked Google about "data center
>> incast". I first asked just about "incast". Merriam-Webster told me that
>> it was in their unabridged dictionary but not the free dictionary. But
>> asking about data center incast, I got pages and pages of papers on the
>> topic.
>> 
>> Gorry, one that might be worth pointing to would be
>> http://www.academia.edu/2160335/A_Survey_on_TCP_Incast_in_Data_Center_Networks
>> 
> I suggest:
> 
> "        <t>Traffic patterns can depend on the network deployment
>        scenario, and Internet research therefore needs to consider
>        the implications of a diverse range of application interactions.
>        At the time of writing (in 2015), an obvious example of further
>        research is the need to consider the many-to-one communication
>        patterns found in data centers, known as incast <xref
>        target="REN12"></xref>, (e.g. produced by
>        Map/Reduce applications). Research also needs to consider the need
>        to extend our taxonomy of
>        transport sessions to include not only "mice" and "elephants", but
>        "lemmings"? Where "Lemmings" are flash crowds of "mice" that the
> network
>        inadvertently tries to signal to as if they were elephant flows,
>        resulting in head of line blocking in data center applications.</t>"
> 
> ----
>>>>> --- The edits below have been incorporated in the XML for v-09
>>>>> ---
>>>>>> Nits/editorial comments: General: s/e.g./e.g.,/, s/i.e./i.e.,/
>>>>>> 
>>>>>> s1.2, para 2(?) - top of p4: s/and often necessary/and is often
>>>>>> necessary/
>>>>>> 
>>>>>> s1.2, para 3: s/a > class of technologies that/a class of
>>>>>> technologies that/
>>>>>> 
>>>>>> s2, first bullet 3: s/Large burst of packets/Large bursts of
>>>>>> packets/
>>>>>> 
>>>>>> s2, last para: Probably need to expand POP, IMAP and RDP; maybe
>>>>>> provide refs??
>>>>>> 
>>>>>> s2.1, last para: s/open a large numbers of short TCP flows/may
>>>>>> open a large number of short duration TCP flows/
>>>>>> 
>>>>>> s4, last para: s/experience occasional issues that need
>>>>>> moderation./can experience occasional issues that warrant
>>>>>> mitigation./
>>>>>> 
>>>>>> s4.2, para 6, last sentence: s/similarly react/react similarly/
>>>>>> 
>>>>>> s4.2.1, para 1: s/using AQM to decider when/using AQM to decide
>>>>>> when/
>>>>>> 
>>>>>> s4.7, para 3:
>>>>>>> In 2013,
>>>>>> "At the time of writing" ?
>>>>>> 
>>>>>> s4.7, para 3:
>>>>>>> the use of Map/Reduce applications in data centers
>>>>>> I think this needs a reference or a brief explanation.
>>