[TERNLI] History of signalling changes in congestion up the layers
Bob Briscoe <rbriscoe@jungle.bt.co.uk> Wed, 02 August 2006 18:52 UTC
Date: Wed, 02 Aug 2006 19:52:03 +0100
To: ternli@ietf.org
From: Bob Briscoe <rbriscoe@jungle.bt.co.uk>
List-Id: Transport-Enhancing Refinements to the Network Layer Interface <ternli.ietf.org>
Folks,

I missed the ad hoc TERNLI BoF, but I've just read the jabber session, and noticed there was a desire to dig up history on this being done before. The text below might be useful. It:
  i) gives a history of ICMP source quench (SQ) falling in and out of favour;
  ii) gives a list of problems found with SQ.

It's an extract from one of my "reviews longer than the original paper" written to help the author of a conference submission to understand the problems there would be introducing their proposal for a SQ-like interaction model (router - source). It was written in Aug 2000, so it's history about history now.

BTW, in their case, it was for multicast, so it made a bit more sense than sending congestion notification to all receivers then having to suppress all but one of the responses, but still it had a multi-bottleneck implosion problem, and DoS vulnerability...

/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\

The earliest ref I can find on source quench/explicit congestion notification is the RFC famous for introducing the Nagle algorithm, but also including discussion on use of Source Quench for congestion avoidance or recovery (about 2/3 of the way through Nagle's RFC896, Jan 84 [15]). This led to the use of ICMP Source Quench as the mandatory IETF approach to congestion control for a short while (Section 2.2.3 of Postel's router requirements RFC1009 in Jun 87 [4]). Even at that time, RFC1009 allowed active queue management for congestion avoidance rather than recovery. However, it was already admitted in that RFC that SQ wasn't the ideal solution and research was continuing.

The arguments against use of Source Quench that led to the change of gateway requirements are summarised in RFC1254 (section 3.1, Aug 1991 [14]), which gives an excellent set of further references. The arguments seem to have been more a result of an unfortunate sequence of events.
Essentially, there were so many different algorithms for sending source quench that it wasn't clear what a source should assume was happening when it got one - congestion onset, or a router that had actually run out of buffer. Packet drop, on the other hand, was a clearer indication that resources had run out.

By Nov 1994 source quench was a definite SHOULD NOT in the draft router requirements RFC (Almquist's RFC1716 [1]). The alternative approach to congestion control was deliberately vague due to ongoing research, but always involved drop of packets in some form - outlined in Section 5.3.6, referring to papers of this time such as [13, 6, 16, 9]. SQ was described as a weak mechanism, perhaps because of the above arguments, but also perhaps because it was generally only signalled statistically to avoid congestion avalanche. Use of explicit congestion notification first appeared in [10] in the DEC DNA protocol.

Other more solid reasons against the use of SQ do exist:

SQ violates layering (congestion control is assumed to be end to end in the Internet Architecture), so it can interact badly with IPsec encryption and tunneling. If a router decides to send an SQ message in response to a packet, it is meant to include the original packet's IP header plus the first 64 bits of its payload as the ICMP payload. When this arrives at the sending IP stack, the stack has to work out which socket to pass it to. For most protocol types it can do so from the source port of the original packet, which it finds in the original headers that have been returned to it. With IPsec it can use the SPI field [3]. But if IPsec was in tunnel mode, the tunnel ingress hasn't got enough information to find out the original source in order to forward on (backward on?) the SQ message.

ICMP packet creation is normally (always) implemented on the ingress interface of a router. Congestion is detected on the egress, where it is too late, and too expensive wrt.
critical processing time, to trigger the creation of an ICMP packet, without hugely re-working the implementation architecture of most routers.

SQ creates more data (although admittedly in the other direction) during congestion, which is generally to be avoided.

SQ may get lost, and there's no end-middle reliable channel between source and bottleneck to recover losses.

SQ messages can be sent to hosts as if they came from an on-path router, thus creating a DoS vulnerability.

References

[1] P. Almquist and F. Kastenholz. Towards requirements for IP routers. Request for comments 1716, Internet Engineering Task Force, URL: rfc1716.txt, November 1994. (Obsoleted by RFC1812) (Status: informational).
[2] Hari Balakrishnan, Hariharan Rahul, and Srinivasan Seshan. An integrated congestion management architecture for Internet hosts. Proc. ACM SIGCOMM'99, Computer Communication Review, 29(4), October 1999.
[3] L. Berger and T. O'Malley. RSVP extensions for IPSEC data flows. Request for comments 2207, Internet Engineering Task Force, URL: rfc2207.txt, September 1997.
[4] R. T. Braden and J. Postel. Requirements for internet gateways. Request for comments 1009, Internet Engineering Task Force, URL: rfc1009.txt, June 1987. (Obsoleted by RFC1812) (Status: historic).
[5] A. Dracinschi and Serge Fdida. Congestion avoidance for unicast and multicast traffic. In Proc. 1st IEEE Conference on Universal Multiservice Networks (ECUMN 2000), Colmar, France, pages 360-368, URL: http://www-rp.lip6.fr/publications/files/anca/e_ecumn_linux.ps.gz, October 2000.
[6] G. Finn. A connectionless congestion control algorithm. ACM SIGCOMM Computer Communication Review, 19(5), October 1989.
[7] M. Fuchs, C. Diot, T. Turletti, and M. Homan. A naming approach for ALF design. In Proc. HIPPARCH'98 workshop, London, URL: ftp://ftp.sprintlabs.com/diot/naming-hipparch.ps.gz, June 1998.
[8] Robert Graham. FAQ: Firewall forensics. Web page URL: http://www.robertgraham.com/pubs/firewall-seen.html#2.4, February 2001.
[9] Van Jacobson. Congestion avoidance and control. Proc. ACM SIGCOMM'88, Computer Communication Review, 18(4):314-329, 1988.
[10] R. Jain, K. Ramakrishnan, and D. Chiu. Congestion avoidance in computer networks with a connectionless network layer. Technical report DEC-TR-506, Digital Equipment Corporation, 1987.
[11] Tae Eun Kim, Raghupathy Sivakumar, Kang-Won Lee, and Vaduvur Bharghavan. Multicast service differentiation in core-stateless networks. In Proc. 1st International COST264 Workshop on Networked Group Communication (NGC'99), volume 1736. Springer LNCS, November 1999.
[12] Dong Lin and Robert Morris. Dynamics of random early detection. Proc. ACM SIGCOMM'97, Computer Communication Review, 27(4), October 1997.
[13] A. Mankin, G. Hollingsworth, G. Reichlen, K. Thompson, R. Wilder, and R. Zahavi. Evaluation of Internet performance FY89. Technical report MTR-89W00216, MITRE Corporation, February 1990.
[14] A. Mankin and K. Ramakrishnan. Gateway congestion control survey. Request for comments 1254, Internet Engineering Task Force, URL: rfc1254.txt, August 1991. (Status: informational).
[15] J. Nagle. Congestion control in IP/TCP internetworks. Request for comments 896, Internet Engineering Task Force, URL: rfc896.txt, January 1984. (Status: unknown).
[16] J. Nagle. On packet switches with infinite storage. Request for comments 970, Internet Engineering Task Force, URL: rfc970.txt, December 1985. (Status: unknown).

____________________________________________________________________________
Bob Briscoe, <bob.briscoe@bt.com>
Networks Research Centre, BT Research
B54/77 Adastral Park, Martlesham Heath, Ipswich, IP5 3RE, UK. +44 1473 645196