Re: [tcpm] Review of draft-ietf-tcpm-alternativebackoff-ecn-02
Naeem Khademi <naeemk@ifi.uio.no> Wed, 15 November 2017 11:54 UTC
From: Naeem Khademi <naeemk@ifi.uio.no>
To: "Bless, Roland (TM)" <roland.bless@kit.edu>
CC: "tcpm@ietf.org Extensions" <tcpm@ietf.org>, Michael Welzl <michawe@ifi.uio.no>, Gorry Fairhurst <gorry@erg.abdn.ac.uk>, grenville armitage <garmitage@swin.edu.au>
Date: Wed, 15 Nov 2017 11:53:57 +0000
Message-ID: <7447FBC9-6B81-4A97-AB45-C57555B30559@ifi.uio.no>
References: <bd5142c3-6ea9-f703-4a57-78ccb3679574@kit.edu>
In-Reply-To: <bd5142c3-6ea9-f703-4a57-78ccb3679574@kit.edu>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/OaZzOlcOUgsQGYVSMWVdOUuRzNA>

Hi Roland,

Thanks a lot for the comments. Almost all of your comments are now addressed in -03 (submitted) and -04 (un-submitted, but attached to the email I sent in response to L. Stewart’s comments). However, here are per-item responses. Please see inline:

On Oct 23, 2017, at 6:12 PM, Bless, Roland (TM) <roland.bless@kit.edu> wrote:

> Hi,
>
> as promised at the last IETF meeting, here is my (lengthy) review of draft-ietf-tcpm-alternativebackoff-ecn-02.
>
> In summary:
> 1) more precise terminology, e.g., better distinction between buffer and queue, also use same terminology as in RFC 5681 (largely done in -02 already).
> 2) state more clearly that the concrete recommendations are tailored to specific congestion controls and be more open to other congestion control variants
> 3) the abstract is too long and section 4 contains a bit of redundancy w.r.t. sections 2 and 3.

I have provided answers to the above items in the “in more detail” part below:

> in more detail
>
> 1) I would prefer to use _buffer_ for the maximum memory space that is allocated for potentially enqueued packets and _queue_ for the amount of actually queued packets, i.e., current buffer occupancy. Therefore, IMHO it makes sense to speak of shallow buffers and short queues, but not of "shallow queues". In particular, an AQM tries to keep the (longer-term) queue short in a buffer while accepting transient bursts -- the behavior therefore also differs from a "shallow buffer".

Agreed in general; changed all instances of “shallow buffer” to “short queue” both in the abstract and the main body, except for “shallow AQM marking threshold”, which remains the same as it refers to the “shallow threshold”.

> 2) the main point of this draft is: it makes sense to react differently to CE-ECN marked packets than to packet loss. One benefit is to achieve higher utilization by adjusting the backoff to be less. The recommendation for two backoff factors is specific to two congestion controls, CUBIC and New Reno.
> For other congestion controls it may also make sense to adapt differently, but the draft doesn't provide any recommendations for them. In general, there can be lots of different congestion controls that do not need this kind of modification to keep the utilization high.

This seems to have been already captured in Section 4.3 (below):

    beta_{ecn} depends on how the response of a TCP connection to shallow AQM marking thresholds is optimised. beta_{loss} reflects the preferred response of each congestion control algorithm when faced with exhaustion of buffers (of unknown depth) signalled by packet loss. Consequently, for any given TCP congestion control algorithm the choice of beta_{ecn} is likely to be algorithm-specific, rather than a constant multiple of the algorithm's existing beta_{loss}.

So I’m not sure if we need to add anything beyond what’s discussed there without risking being redundant. If you think otherwise, please suggest text (that differs from the above) and a suitable (sub-)section. I have also added this text to Section 4.3:

    The recommended beta_{ecn} value in this document is only applicable for Standard TCP congestion control.

> 3) The abstract should be corrected according to 1 and shortened, such as:
>
>    Recent Active Queue Management (AQM) mechanisms allow for burst tolerance while enforcing short queues to minimise the time that packets spend enqueued at a bottleneck. This can cause noticeable performance degradation for TCP connections traversing such a bottleneck, especially if they are only a few or their bandwidth-delay-product is large. An Explicit Congestion Notification (ECN) signal indicates that an AQM mechanism is used at the bottleneck, and therefore the bottleneck network queue is likely to be short. This document therefore proposes an update to the TCP sender-side ECN reaction in congestion avoidance to reduce the congestion window by a smaller amount than the congestion control algorithm's reaction to loss.

Simply used the above text suggestion, while keeping the abbreviation definition of cwnd.

> ------------------
> Walk through:
>
> Section 2.
> ==========
>
>    Research has demonstrated the benefits of reducing network delays due to excessive buffering [BUFFERBLOAT]; this has led to the creation of new AQM mechanisms like PIE [RFC8033] and CoDel [CODEL2012] [I-D.CoDel], which avoid causing the bloated queues that are common with a simple tail-drop behaviour (also known as a First-In First-Out, FIFO, queue).
>
> The first sentence is confusingly put: "reducing network delays due to excessive buffering", better rephrase.

See below.

> Moreover, I'd like to see a more precise description of the problem here: the main problem is that existing loss-based congestion controls completely fill the available bottleneck buffer capacity. So it's primarily _not_ the tail-drop behavior causing bloated queues, but the congestion control.

Changed to:

    Research has demonstrated the benefits of reducing network delays that are caused by interaction of loss-based TCP congestion control and excessive buffering [BUFFERBLOAT].

> There exist two approaches to reduce the queues: use a different congestion control (modify end points) or enforce short queues in routers by using AQMs (modify intermediate systems). So a delay-based congestion control can use a tail-drop FIFO queue and still avoid excessive queuing delays, i.e., not even requiring an AQM to control the queue.

We (authors) think that it’s best to leave the discussion of delay-based CCs outside of this document. Although we initially talked about this in -03, we have now removed the text that mentions them (in the -04 draft). We would like to avoid detouring into specific mention of "i.e. delay-based" approaches and the dismissive "The... suffers from... out of scope...". There's a wide literature on techniques that are based on delay. However, the mention of "delay"-based CC distracts from our I-D.
Delay-based algorithms aren't out of scope for our document due to coexistence problems (as we initially wrote in -03); they are out of scope because they're completely irrelevant to our proposal by definition.

>    These AQM mechanisms instantiate short queues that are designed to tolerate packet bursts.
>
> More precisely: These AQM mechanisms aim to keep a sustained queue short while tolerating transient (short-term) packet bursts.

Fixed.

>    However, congestion control mechanisms cannot always utilise a bottleneck link well where there are short queues.
>
> => However, currently used loss-based congestion control mechanisms

Fixed.

>    to compensate for TCP halving the "cwnd" and "ssthresh" variables in response to a lost packet [RFC5681].
>
> see 1), cwnd is set to FlightSize/2, not cwnd/2 (RFC5681 is quite specific about this).

This language (using “halving”) is common throughout RFC3168 (perhaps wrongly). Therefore changed “halving” to “reducing”. Since the text already cites RFC5681, it’s clear how the reduction is done. Saying halving the “FlightSize” would have been wrong, as TCP doesn’t change the flight size variable (it’s measured/calculated), so this seems to be an easy way out. Fixed and reads as:

    For example, a TCP sender must be able to store at least an end-to-end bandwidth-delay product (BDP) worth of data at the bottleneck buffer if it is to maintain full path utilisation in the face of loss-induced reduction of cwnd [RFC5681], which effectively doubles the amount of data that can be in flight, the maximum round-trip time (RTT) experience, and the path's effective RTT using the network path.

>    This requires the bottleneck queue to be able to store at least an end-to-end bandwidth-delay
>
> queue => buffer

Done!

>    product (BDP) of data, which effectively doubles both the amount of data that can be in flight and the round-trip time (RTT) experience using the network path.
>
> it effectively doubles the RTT only if the buffer is completely filled, usually the queue is varying over time.

Added “maximum” to the “round-trip time (RTT)”.

>    ABE improves the performance when routers use shallow buffered AQM mechanisms.
>
> See 1), e.g., "when routers use AQM controlled buffers that allow for short queues only."

Fixed.

> Section 3.
> ==========
>
>    This specification describes an update to the congestion control algorithm of an ECN-capable TCP transport protocol.
>
> See 2.) This statement is very generic, whereas the recommendation is quite specific to CUBIC and NewReno. It may be useful for other congestion controls as well if they also require a more moderate response/backoff in order to keep the utilization high. Their backoff modification may, however, be different. Moreover, there exist other congestion controls that don't suffer from underutilization if they react to a congestion signal.

Actually the recommendation is purely for NewReno, which also stands as the IETF-standard “TCP congestion control”. We mention that we have tested with CUBIC as well, and we provide a value at which CUBIC works well, but the RECOMMENDATION given in the I-D (i.e. beta_{ecn}=0.8) only concerns standard TCP.

>    It RECOMMENDS that a TCP sender multiplies the cwnd by 0.8 and reduces the slow start threshold (ssthresh) in congestion avoidance following reception of a TCP segment that sets the ECN-Echo flag (defined in [RFC3168]).
>
> See previous comment: here you should be explicit about the particular congestion controls to which the recommended behavior and parameter can be applied.

See above.

> Moreover, "cwnd= max (FlightSize * beta_{ecn}, 2 * SMSS)", which is a bit different from cwnd= cwnd * beta_{ecn} (this is what the text suggests).

Now reads as:

    It RECOMMENDS that a TCP sender multiplies the slow start threshold (ssthresh) by 0.8 times of the FlightSize (with its minimum value set to 2 * SMSS) and reduces the cwnd in congestion avoidance following reception of a TCP segment that sets the ECN-Echo flag (defined in [RFC3168]).

> Section 4.
> ==========
>
>    performance gains in lightly-multiplexed scenarios, without losing
>
> "lightly-multiplexed scenarios" means presumably that only a few flows traverse the considered bottleneck, but how many are "a few" then? three or nine or twenty? Later on it is defined as "lightly-multiplexed case (few concurrent connections)", better mention this at first use already.

Done.

>    loss is detected (regarded as a notification of congestion), Standard TCP halves the cwnd and ssthresh [RFC5681], which causes the TCP
>    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> See 1), roughly speaking yes, but not quite correct. Moreover, packet loss can be detected in various ways, e.g., by using the retransmission timer or not; the congestion control response may differ then, too.

Complying with the description text of RFC5681, it now reads as:

    When packet loss is inferred using the retransmission timer and the given packet has not yet been resent by way of the retransmission timer (regarded as a notification of congestion), Standard TCP sets the ssthresh to the maximum of half of the FlightSize and 2*SMSS [RFC5681], which causes the TCP congestion control to go back to allowing only a BDP of packets in flight -- just sufficient to maintain 100% utilisation of the bottleneck on the network path.

>    delay target in routers and use congestion notifications to constrain the queuing delays experienced by packets, rather than in response to
>
> if AQMs set CE, they hope for an appropriate action, however, they'll limit the queueing delays by actively dropping packets.

The text uses the term “congestion notifications”, which can be loss (implicit congestion notification) or explicit (ECN). If the sender is unresponsive, then dropping packets is a “protection mechanism”, but the underlying assumption is that whatever traffic is traversing the path is congestion controlled at the end points.

>    that were not necessarily configured to emulate a shallow queue
>
> see 1), short queue vs.
> shallow buffer

Changed to "emulate a bottleneck with a short queue".

>    However, it interacts badly for a lightly-multiplexed case (few concurrent connections) over a path with a large BDP. Conventional TCP backoff in such cases leads to gaps in packet transmission and under-utilisation of the path.
>
> Maybe combine these two sentences: However, in a lightly-multiplexed case (few concurrent connections) over a path with a large BDP, conventional TCP backoff leads to gaps in packet transmission and under-utilisation of the path.

Done.

>    hence the CE-mark likely came from a bottleneck with a shallow queue.
>
> controlled and short queue

Changed to "controlled short queue".

>    Reacting differently to an ECN CE-mark than to packet loss can then yield the benefit of a reduced back-off, as with CUBIC [I-D.CUBIC], when queues are short, yet it can avoid generating excessive delay when queues are long.
>
> I'm not sure that I understood the gist of this statement, better rephrase and split up into two sentences?

Now reads as:

    Reacting differently to an ECN-signalled congestion than to an inferred packet loss can then yield the benefit of a reduced back-off when queues are short. Using ECN can also be advantageous for several other reasons [RFC8087].

>    For non-ECN-enabled TCP connections,
>
> Not fully clear what this means. Are the end-systems ECN capable, but the routers in between do not mark? Or does it mean that at least one end-system isn't ECN capable?

An ECN-enabled *connection* is a connection in which both end points have successfully negotiated ECN. A non-ECN-enabled connection is the opposite of that.

>    ssthresh_(t+1) = max (FlightSize_t * beta_{loss}, 2 * SMSS)
>
> RFC 5681 doesn't use any notation with "t". If you are using t, you should specify what it means. My suggestion is to avoid introducing it and to use the same terminology as in RFC 5681.

Done.
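
For illustration only, the ssthresh rule discussed above can be sketched in a few lines of Python. This is a minimal sketch and not text or code from the draft: the function name `ssthresh_after_congestion` and the constants `BETA_LOSS` and `BETA_ECN` are hypothetical names chosen here. The value 0.8 is the beta_{ecn} recommended in this thread for standard TCP, and 0.5 corresponds to the FlightSize/2 response to loss in RFC 5681. Rounding down to whole bytes is an assumption of the sketch.

```python
# Hypothetical helper names; only the formula max(FlightSize * beta, 2 * SMSS)
# and the values 0.5 / 0.8 come from the discussion in this thread.

BETA_LOSS = 0.5  # response to inferred packet loss (FlightSize / 2, RFC 5681)
BETA_ECN = 0.8   # response to ECN-signalled congestion recommended for standard TCP


def ssthresh_after_congestion(flight_size: int, smss: int, beta: float) -> int:
    """Return the new ssthresh in bytes: max(FlightSize * beta, 2 * SMSS).

    Truncation to whole bytes is an assumption of this sketch.
    """
    return max(int(flight_size * beta), 2 * smss)


smss = 1460                  # sender maximum segment size in bytes (example value)
flight_size = 100 * smss     # 100 full-sized segments outstanding

on_loss = ssthresh_after_congestion(flight_size, smss, BETA_LOSS)
on_ecn = ssthresh_after_congestion(flight_size, smss, BETA_ECN)

# With very little data in flight, the 2 * SMSS floor applies instead.
floor_case = ssthresh_after_congestion(2 * smss, smss, BETA_LOSS)

print(on_loss, on_ecn, floor_case)
```

With these example numbers the sender keeps 80 segments' worth of data in flight after an ECN signal instead of 50 after a loss, which is the smaller backoff the draft argues for when the bottleneck queue is short.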

> I think that the beginning of section 4.3 belongs more to section 3 while the rest fits the section title (discussion of the ABE multiplier).

Which sentence exactly would you prefer to be moved to Section 3?

> Section 5.
> ===========
>
>    5. Status of the Update
>
> I don't understand the purpose of this section, or the section title is weird at least. Is it meant to describe required changes? Then use this as section title.

It’s the “status of the update to the congestion control” that is being proposed in this document. It addresses the “Requirement for the update to the congestion control”, but for the sake of brevity I have now changed this to “ABE requirements”.

>    congestion-control algorithms, it does not require any change to the
>
> everywhere else it is "congestion control" without dash.

Fixed.

>    The currently published ECN specification requires that the congestion control response to a CE-marked packet is the same as the response to a dropped packet [RFC3168]. The specification is currently being updated to allow for specifications that do not follow this rule [I-D.ECN-exp]. The present specification defines such an experiment and has thus been assigned an Experimental status before being proposed as a Standards-Track update.
>
> This is largely a repetition from the introduction.

Unless repetition is bad, it is okay. Introductions can often be treated as places that summarise key messages contained in the body of a document, so repetition is to be expected.

>    Because this advantage applies only to ECN-marked packets and not to loss indications, the new method cannot lead to congestion collapse.
>
> I'm not sure that I can follow here. There are several forms of congestion collapse and the classical one causes unnecessary retransmissions by a timer mismatch. Maybe you can elaborate a bit more here.

It now reads as:

    Because this advantage applies only to ECN-marked packets and not to packet loss indications, in the worst-case (e.g., an ABE-compliant TCP sender using beta_{ecn} = 1.0) the ECN-capable bottleneck will still fall back to dropping packets, and the result is no different than if the TCP sender was using traditional loss-based congestion control.

> Section 8.
> ==========
>
>    http://heim.ifi.uio.no/naeemk/research/ABE/ This code was used to
>
> Full stop missing here (presumably to avoid problems with the URL).

Fixed.

> Maybe put the (most important) changes into an appendix? I'm not sure how long this URL will be valid after the RFC has been published.

We are not yet at WGLC, but will fix this in later revisions.

> Regards,
> Roland

Cheers,
Naeem