Re: [aqm] Obsoleting RFC 2309

"Fred Baker (fred)" <fred@cisco.com> Wed, 02 July 2014 19:36 UTC

From: "Fred Baker (fred)" <fred@cisco.com>
To: "aqm@ietf.org" <aqm@ietf.org>
Archived-At: http://mailarchive.ietf.org/arch/msg/aqm/vrrMnX36vywCq9DNBdhokt6w7BA
Cc: Wesley Eddy <wes@mti-systems.com>, John Leslie <john@jlc.net>
Subject: Re: [aqm] Obsoleting RFC 2309

On Jul 1, 2014, at 5:24 PM, Fred Baker (fred) <fred@cisco.com> wrote:

> 
> On Jul 1, 2014, at 2:27 PM, Wesley Eddy <wes@mti-systems.com> wrote:
>> John Leslie noticed that some of the things Bob Briscoe had
>> mentioned stem from trying to work from RFC 2309 as the starting
>> point.  We have been planning to Obsolete and replace 2309 with
>> this document.  John suggested instead to let it live on, and
>> have this new one only Update it, and has suggested specific
>> changes that could be edited in, if this were the case.
>> 
>> I think we need to make a conscious on-list decision about this,
>> and decide to either confirm that Obsoleting 2309 is correct, or
>> to change course.
>> 
>> Others can amplify or correct these, but I think the points for
>> each would be:
>> 
>> Obsoleting 2309
>> - 2309 was an IRTF document from a closed RG, and we now can make
>> a stronger statement as an IETF group with a BCP
>> - 2309 is a bit RED-centric, and we now think that people should
>> be looking at things other than RED
>> 
>> Not-Obsoleting 2309 (e.g. Updating 2309)
>> - 2309 is a snapshot in history of the E2E RG's thinking
>> - 2309 is mostly oriented towards AQM as a mitigation for congestion
>> collapse, whereas now we're more interested in reducing latency
>> 
>> Please share any thoughts you have on this, and what should be done.
> 
> I’ll give you my view, which I just gave you privately.
> 
> The changes that have been requested include at least:
>    - remove the word “RED” from the document. The operators find RED difficult to use and as a result don’t turn it on. They would like alternative algorithm(s) that require at most minimal parameterization, ideally none at all.
>    - add ECN
>    - add scheduling, which 2309 explicitly didn’t address
>    - update references
> 
> We also have discussed additional recommendations beyond “everyone deploy RED” and “we need more research”.
> 
> If you count affected lines in the document, just removing the word “RED” affects 154 of the 955 lines in the document. Then we go into the rest of the changes. I’m not sure in what way that can be described as an “update”.

Gorry, Wesley, Richard, John, and I have been having a fairly lively conversation in private mail. Let me pull out of it a few comments I have made. John can make his points, and others can make theirs. Note that the following are not, properly speaking, forwarded email; they are taken from email exchanges, but edited to make them make sense in the present context.

--------------------------------------------------------------------------------
As I said yesterday, I am scratching my head on what it means to "obsolete" or "update" a document, and how that might relate to this note.

To my small mind, if I have a specification that updates another specification, I have to read the older specification, and then I have to read the newer one and do what it says in the part of the older that it updates. If one specification has obsoleted another, the older specification is only of historical interest.

When I said that in private email, John replied "That is correct -- iff you are setting out to implement the older spec."

The only reason I would have to not implement the older spec is if it were obsolete. If it was merely updated, I am by definition implementing the older spec, with some changes.

Let me give you an example. If I decide to implement TCP, I start with

https://tools.ietf.org/html/rfc0793
0793 Transmission Control Protocol. J. Postel. September 1981.
    (Format: TXT=172710 bytes) (Obsoletes RFC0761) (Updated by RFC1122,
    RFC3168, RFC6093, RFC6528) (Also STD0007) (Status: INTERNET STANDARD)

and also read 1122, 3168, 6093, and 6528. Those clean up some basic stuff, add ECN, change Urgent, and add a variant on a SYN Cookie. If I don't do those, I haven't got a correct implementation of TCP. But I was setting out to implement the protocol defined in RFC 793, TCP. Those documents updated 793.

However, if I decide to implement 

https://tools.ietf.org/html/rfc5681
5681 TCP Congestion Control. M. Allman, V. Paxson, E. Blanton.
    September 2009. (Format: TXT=44339 bytes) (Obsoletes RFC2581)
    (Status: DRAFT STANDARD)

I don't have to read RFC 2581. Everything that was in 2581 that I need to know will be found in 5681. I'm only implementing 5681, not 2581. I might read it for interest's sake, or as a historian, but as an implementor it is unnecessary.
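
A rough sketch of that distinction, for illustration only: the relationship data is copied from the two RFC index entries quoted above, while the table and function names (UPDATED_BY, OBSOLETES, reading_list, historical_only) are hypothetical and not part of any RFC tooling.

```python
# Relationship data taken from the index entries for RFC 793 and RFC 5681
# quoted above. The names below are made up for this illustration.
UPDATED_BY = {
    793: [1122, 3168, 6093, 6528],
    5681: [],
}
OBSOLETES = {
    793: [761],
    5681: [2581],
}


def reading_list(rfc):
    """RFCs an implementor has to read: the base spec plus its updates.

    Documents that the base spec obsoletes are deliberately left out;
    they are of historical interest only.
    """
    return [rfc] + UPDATED_BY.get(rfc, [])


def historical_only(rfc):
    """RFCs that the given spec replaced outright."""
    return OBSOLETES.get(rfc, [])
```

Under these assumptions, reading_list(793) includes all four updating documents, whereas reading_list(5681) is just 5681 itself, with 2581 showing up only in historical_only(5681).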
--------------------------------------------------------------------------------
If I understand him correctly, one of the things John is pushing back on is obsoleting RED as an algorithm; his view, if I understand it, is that if you can figure out what parameters to give RED, it works just fine. On that point, I would generally agree; in my experience, RED has two important parameters, which are min-threshold and max-threshold. The mean latency through a queue will generally be governed by and approximate min-threshold, so one wants to set it just deep enough to accomplish one's latency goals, and max-threshold is derived from the underlying hardware. 
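
For readers who have not looked at RED recently, the role of the two thresholds can be sketched roughly as follows. This is a simplified illustration of the basic RED drop curve, not a complete implementation: it omits the count-based spacing of drops, and max_p and weight are additional parameters with illustrative default values that the discussion above does not mention.

```python
def update_average(avg_queue, current_queue, weight=0.002):
    """Exponentially weighted moving average of the queue depth.

    `weight` is an assumed, illustrative value.
    """
    return (1 - weight) * avg_queue + weight * current_queue


def red_drop_probability(avg_queue, min_th, max_th, max_p=0.1):
    """Basic RED marking/dropping probability for a given average depth.

    Below min_th nothing is dropped, at or above max_th everything is,
    and in between the probability rises linearly up to max_p.
    `max_p` is an assumed, illustrative value.
    """
    if min_th >= max_th:
        raise ValueError("min_th must be smaller than max_th")
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return max_p * (avg_queue - min_th) / (max_th - min_th)
```

Because dropping only begins once the average depth exceeds min_th, the queue tends to hover near that depth, which is why min-threshold ends up approximating the mean latency.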

However, the operational commentary on RED in tsvarea, tsvwg, and over time has been that "just deep enough to accomplish one's latency goals" is fuzzy enough to be difficult to use. CoDel was developed, in part, to make the latency goal an explicit algorithmic factor; if part of the delay in a transmission system is, for example, channel acquisition delay (think about the behavior of busy conference WiFi systems), it would be nice to have that factored into the calculation as well as actual queue depth (which is corollary to but not necessarily indicative of latency). Other algorithms, of which PIE is an example, work with queue depth as a corollary to latency, and adjust their understanding of the relationship dynamically.
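
To illustrate what "making the latency goal explicit" can look like, here is a minimal sketch of a sojourn-time check in the spirit of CoDel. It is not the full algorithm: the control law that spaces successive drops is left out, the class and method names are hypothetical, and the 5 ms target and 100 ms interval are assumed values used only for illustration.

```python
class SojournMonitor:
    """Reports when per-packet delay has stayed above a target for a full interval.

    Times are in seconds. The enqueue timestamp is taken when the packet
    arrives, so any delay before dequeue (for example channel acquisition)
    counts toward the measured sojourn time, not just queue depth.
    """

    def __init__(self, target=0.005, interval=0.100):
        self.target = target          # acceptable standing delay (assumed value)
        self.interval = interval      # how long delay must persist (assumed value)
        self.first_above = None       # deadline after which we start signaling

    def should_signal(self, enqueue_time, now):
        sojourn = now - enqueue_time
        if sojourn < self.target:
            # Delay is back under the target: forget any pending deadline.
            self.first_above = None
            return False
        if self.first_above is None:
            # First packet seen above target: start the clock.
            self.first_above = now + self.interval
            return False
        # Signal (drop or ECN-mark) only if delay stayed high for a full interval.
        return now >= self.first_above
```

The operator-facing parameter here is a delay, not a queue depth, which is the difference from RED's min-threshold that the paragraph above describes.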

Gorry and I have separately commented, in that private conversation, that obsoleting RFC 2309 doesn't obsolete RED as an algorithm, nor does it say that RED is a bad algorithm. It replaces the recommendation of 2309, which is that 

   o    RECOMMENDATION 1:

        Internet routers should implement some active queue management
        mechanism to manage queue lengths, reduce end-to-end latency,
        reduce packet dropping, and avoid lock-out phenomena within the
        Internet.

        The default mechanism for managing queue lengths to meet these
        goals in FIFO queues is Random Early Detection (RED) [RED93].
        Unless a developer has reasons to provide another equivalent
        mechanism, we recommend that RED be used.
        
Sixteen years after the publication of RFC 2309, it turns out that we have issues, but the issues are not the management of queue lengths, reduction of packet dropping, and avoidance of lock-out phenomena within the Internet. RED, by the way, never did reduce packet loss; it drops the same number of packets, but attempts to desynchronize them and achieve a certain goal using them. What we *are* trying to do is reduce queuing delay and by extension that component of end to end latency.

And it turns out that there are now better algorithms than RED.

I would go one step further here, and whether this comment belongs in the draft or not is not important to me. There are two reasons one might standardize something. The most common one, and the one that motivates most IETF work, is interoperability of separate implementations that have to communicate. OSPF and IS-IS, for example, need to be specified down to the gnat's eyelash, including the "alternate Tuesday rules" that make different implementations make the same choice when more than one choice could reasonably and validly be made, to guarantee interoperability. That consideration doesn't apply here; TCP will interpret signals from the network including ECN and packet loss even if the network doesn't realize that they are signals. The only requirement that could be described using the term "interoperability" is that the TCP/SCTP/whatever receiving the signals should as a result do the right thing. The other is so that everyone agrees on what an algorithm does, why it does it, and where it should be implemented. BCP 38, for example, can be implemented in several ways, but because it is standardized as a recommendation, we know what we are trying to achieve when we implement it.

What we are obsoleting, at minimum, is the recommendation that RED be the default algorithm. What we instead recommend is that each implementation implement some form of AQM to achieve certain purposes. That statement alone could be an update to 2309 if 2309 made its recommendation regarding RED in one place. However, it is throughout the document. Hence, a change to 2309's statement regarding RED is a pervasive change affecting every section of it.
--------------------------------------------------------------------------------
The other thing I think John is trying to preserve - and John, I'm looking for your correction here - is 2309's report on research. There has been ongoing research since 2309 was written, but a large portion of the relevant research predates and is reported in 2309. So we have been asked for an updated bibliography in this document, and the updates haven't been as large as one might expect from an active research field. I think John would like to continue to point to 2309's description of the motivating and underlying research.

On that, I go two ways. First, as a historical review of the research the original recommendation was based on, 2309 is unsurpassed. Second, I think we do need to motivate the updated recommendations we make, which include for example the implementation of ECN. For that reason, we need to point to at least some research. We have been asked, by the proponents of fq_codel, to also mention scheduling. 2309 left that out and said why it left that out. draft-baker-aqm-sfq-implementation contains my further reflections on the interaction between scheduling, which has to do with the distribution of service across competing sessions, and AQM, which has to do with latency; the two don't necessarily get along as well with each other as one might expect. In this document, we have tried to say that we don't have a problem with scheduling implementations, but we think AQM might be applied differently to different traffic classes in the same queuing system. draft-geib-tsvwg-diffserv-intercon, for example, contemplates four traffic classes, one of which is engineered to RFCs 3246/3247, one of which allows for a shallow queue without AQM for UDP-based applications, one of which allows for a deeper queue with AQM for "preferred" elastic traffic, and one of which is simply the default class, which one might expect AQM in but would bear the brunt of traffic loss should it occur. 

To my mind, we're going quite a bit further than "just remove RED"; we are doing things 2309 said it didn't want to do, and adding capabilities that 2309 didn't consider.
--------------------------------------------------------------------------------

In any event, RFC 2309 remains in the historical record, which is to say the RFC series. It's a great document. But when we send someone to figure out what to implement in an AQM implementation, I don't think we are going to send them there. This document, plus the set of algorithms that we wind up recommending, are a complete replacement. So I don't see it as an update. I see it as making 2309 "historical" or "obsoleted by" this document.

--------------------------------------------------------------------------------

Which brings me back to John's proposed text. I'm frankly curious what the working group's thought on that is. If the working group wants to replace the existing sections 1-3 with that or an updated version of that, so be it. I'm willing to see John added as a co-author or co-editor and the text dropped in. If the working group prefers the existing text, I'm OK with that. If there are glaring holes, they need to be filled in.

What I'm not OK with is a continuation of the current dinking with the document. I don't think we have dramatically improved the document since last November, although we have done a lot of work on it to respond to working group comments. It feels to me like we, as a working group, have lost our way. From this point on, *I* think we need to ask ourselves whether any given change we make materially improves the document or merely dinks with the text. And we should limit changes to material improvements.