[TERNLI] History of signalling changes in congestion up the layers

Bob Briscoe <rbriscoe@jungle.bt.co.uk> Wed, 02 August 2006 18:52 UTC

List-Id: Transport-Enhancing Refinements to the Network Layer Interface <ternli.ietf.org>

Folks,

I missed the ad hoc TERNLI BoF, but I've just read the jabber session, and 
noticed there was a desire to dig up history on this being done before. The 
text below might be useful. It:
i) gives a history of ICMP source quench (SQ) falling in and out of favour;
ii) gives a list of problems found with SQ.

It's an extract from one of my "reviews longer than the original paper", 
written to help the author of a conference submission understand the 
problems there would be in introducing their proposal for an SQ-like 
interaction model (router to source). It was written in Aug 2000, so it's 
history about history now.

BTW, in their case, it was for multicast, so it made a bit more sense than 
sending congestion notification to all receivers then having to suppress 
all but one of the responses, but still it had a multi-bottleneck implosion 
problem, and DoS vulnerability...


/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
The earliest ref I can find on source quench/explicit congestion 
notification is the RFC famous for introducing the Nagle algorithm, which 
also includes a discussion of the use of Source Quench for congestion 
avoidance or recovery (about two-thirds of the way through Nagle’s RFC896, 
Jan 1984 [15]). This 
led to the use of ICMP Source Quench as the mandatory IETF approach to 
congestion control for a short while (Section 2.2.3 of Postel’s router 
requirements RFC1009 in Jun 87 [4]). Even at that time, RFC 1009 allowed 
active queue management for congestion avoidance rather than recovery. 
However, it was already admitted in that RFC that SQ wasn’t the ideal 
solution and research was continuing. The arguments against use of Source 
Quench that led to the change of gateway requirements are summarised in 
RFC1254 (section 3.1, Aug 1991 [14]), which gives an excellent set of 
further references. The arguments seem to have arisen more from an 
unfortunate sequence of events than from the mechanism itself. Essentially, 
there were so many different algorithms for sending source quench that it 
wasn’t clear what a source should assume was happening when it got one: the 
onset of congestion, or a router that had actually run out of buffer. 
Packet drop, on the other hand, was a 
clearer indication that resources had run out. By Nov 1994 source quench 
was a definite ’SHOULD NOT’ in the draft router requirements RFC 
(Almquist’s RFC1716 [1]). The alternative approach to congestion control 
was deliberately vague due to ongoing research, but always involved drop of 
packets in some form - outlined in Section 5.3.6, referring to papers of 
this time such as [13, 6, 16, 9]. SQ was described as a weak mechanism, 
perhaps because of the above arguments, but also perhaps because it was 
generally only signalled statistically to avoid congestion avalanche. Use 
of explicit congestion notification first appeared in [10] in the DEC DNA 
protocol.

Other more solid reasons against the use of SQ do exist:
• SQ violates layering (congestion control is assumed to be end to end in 
the Internet Architecture), so it can interact badly with IPsec encryption 
and tunneling. If a router decides to send an SQ message in response to a 
packet, it is meant to include the IP header plus the first 64 bits of the 
original packet's payload as the ICMP payload. When this arrives at the 
sending IP stack, the stack has to work out which socket to pass it to. For 
most protocol types it can find the port the original packet came from in 
the returned header. With IPsec it can use the SPI field [3]. But if IPsec 
was in tunnel mode, the tunnel ingress hasn't got enough information to 
find out the original source in order to forward on (backward on?) the 
SQ message.
• ICMP packet creation is normally (always) implemented on the ingress 
interface of a router. Congestion is detected on the egress, where it is 
too late, and too expensive wrt. critical processing time, to trigger the 
creation of an ICMP packet, without hugely re-working the implementation 
architecture of most routers.
• SQ creates more data (although admittedly in the other direction) during 
congestion, which is generally to be avoided.
• SQ may get lost, and there’s no ’end-middle’ reliable channel between 
source and bottleneck to recover losses.
• SQ messages can be sent to hosts as if they came from an on-path router, 
thus creating a DoS vulnerability.
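
To make the first bullet concrete, here is a minimal Python sketch of the demultiplexing step a sending host would have to perform on an incoming SQ message. It assumes IPv4 and the RFC 792 layout (8 bytes of ICMP header, then the quoted IP header plus 64 bits of payload). The function and helper names (parse_source_quench, build_ipv4_header, make_sq) and the sample addresses, ports and SPI are all made up for illustration; this is not taken from any real stack.

```python
import struct

ICMP_SOURCE_QUENCH = 4
PROTO_TCP, PROTO_UDP, PROTO_ESP = 6, 17, 50


def build_ipv4_header(proto, src, dst):
    """Build a minimal 20-byte IPv4 header (checksum left at zero; illustration only)."""
    return struct.pack("!BBHHHBBH4s4s", 0x45, 0, 28, 0, 0, 64, proto, 0,
                       bytes(src), bytes(dst))


def make_sq(quoted):
    """Wrap a quoted packet fragment in an ICMP Source Quench header (type 4, code 0)."""
    return struct.pack("!BBHI", ICMP_SOURCE_QUENCH, 0, 0, 0) + quoted


def parse_source_quench(icmp):
    """Return (protocol, demux_key) recovered from the fragment quoted in an SQ message."""
    icmp_type, code, _checksum, _unused = struct.unpack("!BBHI", icmp[:8])
    if icmp_type != ICMP_SOURCE_QUENCH or code != 0:
        raise ValueError("not an ICMP Source Quench message")
    quoted = icmp[8:]
    ihl = (quoted[0] & 0x0F) * 4      # quoted IP header length in bytes
    proto = quoted[9]                 # protocol field of the quoted IP header
    first64 = quoted[ihl:ihl + 8]     # the only payload bits the router returns
    if proto in (PROTO_TCP, PROTO_UDP):
        # TCP and UDP both start with source port, destination port.
        src_port, dst_port = struct.unpack("!HH", first64[:4])
        return proto, ("ports", src_port, dst_port)
    if proto == PROTO_ESP:
        # ESP starts with the SPI; everything after the sequence number is encrypted.
        (spi,) = struct.unpack("!I", first64[:4])
        return proto, ("spi", spi)
    return proto, ("unknown",)


# Hypothetical sample packets: a UDP datagram and an ESP packet.
udp_pkt = (build_ipv4_header(PROTO_UDP, [192, 0, 2, 1], [198, 51, 100, 7])
           + struct.pack("!HHHH", 5004, 53, 8, 0))
esp_pkt = (build_ipv4_header(PROTO_ESP, [192, 0, 2, 1], [198, 51, 100, 7])
           + struct.pack("!II", 0x1234ABCD, 1))

udp_result = parse_source_quench(make_sq(udp_pkt))
esp_result = parse_source_quench(make_sq(esp_pkt))
print(udp_result)
print(esp_result)
```

In the ESP case the 64 quoted bits hold only the SPI and sequence number, so in tunnel mode nothing in the message identifies the inner source address. That is the gap the first bullet describes.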

References

[1] P. Almquist and F. Kastenholz. Towards requirements for IP routers. 
Request for comments 1716, Internet
Engineering Task Force, URL: rfc1716.txt, November 1994. (Obsoleted by 
RFC1812) (Status: informational).
[2] Hari Balakrishnan, Hariharan Rahul, and Srinivasan Seshan. An 
integrated congestion management architecture
for Internet hosts. Proc. ACM SIGCOMM’99, Computer Communication Review, 
29(4), October
1999.
[3] L. Berger and T. O’Malley. RSVP extensions for IPSEC data flows. 
Request for comments 2207, Internet
Engineering Task Force, URL: rfc2207.txt, September 1997.
[4] R. T. Braden and J. Postel. Requirements for internet gateways. Request 
for comments 1009, Internet
Engineering Task Force, URL: rfc1009.txt, June 1987. (Obsoleted by RFC1812) 
(Status: historic).
[5] A. Dracinschi and Serge Fdida. Congestion avoidance for unicast and 
multicast traffic. In Proc. 1st IEEE
Conference on Universal Multiservice Networks (ECUMN 2000), Colmar, France, 
pages 360–368, URL:
http://www-rp.lip6.fr/publications/files/anca/e_ecumn_linux.ps.gz, October 
2000.
[6] G. Finn. A connectionless congestion control algorithm. ACM SIGCOMM 
Computer Communication Review,
19(5), October 1989.
[7] M. Fuchs, C. Diot, T. Turletti, and M. Hoffman. A naming approach for ALF 
design. In Proc. HIPPARCH’98
workshop, London, URL: ftp://ftp.sprintlabs.com/diot/naming-hipparch.ps.gz, 
June 1998.
[8] Robert Graham. FAQ: Firewall forensics. Web page URL: 
http://www.robertgraham.com/pubs/firewall-seen.html#2.4, February 2001.
[9] Van Jacobson. Congestion avoidance and control. Proc. ACM SIGCOMM’88, 
Computer Communication
Review, 18(4):314–329, 1988.
[10] R. Jain, K. Ramakrishnan, and D. Chiu. Congestion avoidance in 
computer networks with a connectionless
network layer. Technical report DEC-TR-506, Digital Equipment Corporation, 
1987.
[11] Tae Eun Kim, Raghupathy Sivakumar, Kang-Won Lee, and Vaduvur 
Bharghavan. Multicast service differentiation
in core-stateless networks. In Proc. 1st International COST264 Workshop on 
Networked Group
Communication (NGC’99), volume 1736. Springer LNCS, November 1999.
[12] Dong Lin and Robert Morris. Dynamics of random early detection. Proc. 
ACM SIGCOMM’97, Computer
Communication Review, 27(4), October 1997.
[13] A. Mankin, G. Hollingsworth, G. Reichlen, K. Thompson, R. Wilder, and 
R. Zahavi. Evaluation of Internet
performance — FY89. Technical report MTR-89W00216, MITRE Corporation, 
February 1990.
[14] A. Mankin and K. Ramakrishnan. Gateway congestion control survey. 
Request for comments 1254, Internet
Engineering Task Force, URL: rfc1254.txt, August 1991. (Status: informational).
[15] J. Nagle. Congestion control in IP/TCP internetworks. Request for 
comments 896, Internet Engineering
Task Force, URL: rfc896.txt, January 1984. (Status: unknown).
[16] J. Nagle. On packet switches with infinite storage. Request for 
comments 970, Internet Engineering Task
Force, URL: rfc970.txt, December 1985. (Status: unknown).



____________________________________________________________________________
Bob Briscoe, <bob.briscoe@bt.com>      Networks Research Centre, BT Research
B54/77 Adastral Park,Martlesham Heath,Ipswich,IP5 3RE,UK.    +44 1473 645196