[tcpm] Re: draft-ietf-tcpm-ecnsyn-03.txt backwards compatibility

Bob Briscoe <rbriscoe@jungle.bt.co.uk> Fri, 30 November 2007 00:59 UTC

Date: Fri, 30 Nov 2007 00:59:15 +0000
To: Sally Floyd <sallyfloyd@mac.com>
From: Bob Briscoe <rbriscoe@jungle.bt.co.uk>
Cc: Aleksandar Kuzmanovic <akuzma@northwestern.edu>, "K. K. Ramakrishnan" <kkrama@research.att.com>, Amit Mondal <a-mondal@northwestern.edu>, tcpm@ietf.org

Sally,

[tcpm list added to distr]

At 21:16 28/11/2007, Sally Floyd wrote:
>Bob -
>
>>Hi, we (Toby & I) generally support this I-D, but one big concern on 
>>backward compatibility that Toby just pointed out. Sorry about this 
>>coming during last call, but that always focuses the mind.
>
>Sure.  It is also fine to take this to the list - I believe that it has 
>already been discussed in the working group, but perhaps not recently.

I tried to search the list before mailing you, but didn't find previous 
discussion on backward compatibility. Any clues on keywords to find the 
discussion?


>>"  Backwards compatibility:
>>...
>>    In order for TCP node B to send a SYN/ACK packet as ECN-Capable, node
>>    B must have received an ECN-setup SYN packet from node A.  However,
>>    it is possible that node A supports ECN, but either ignores the CE
>>    codepoint on received SYN/ACK packets, or ignores SYN/ACK packets
>>    with the ECT or CE codepoint set.  If the TCP sender ignores the CE
>>    codepoint on received SYN/ACK packets, this would mean that the TCP
>>    connection would not respond to this congestion indication.  However,
>>    this seems to us an acceptable cost to pay in the incremental
>>    deployment of ECN-Capability for TCP's SYN/ACK packets.  It would
>>    mean that the sender of the SYN/ACK packet would not reduce the
>>    initial congestion window from two, three, or four segments down to
>>    one segment, as it should.  However, the TCP sender would still
>>    respond correctly to any subsequent CE indications on data packets
>>    later on in the connection.
>
>If the server with the new ECN implementation uses ECN-Capable SYN/ACK
>packets, and the clients use ECN without ECN-Capable SYN/ACK packets, then
>the TCP server's window will open to at most 4380 bytes (RFC 3390).

Sorry, yes, 4380 bytes. I've corrected my repetitions of this mistake in 
the rest of this posting so as not to confuse those joining the thread.
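
For those joining the thread: the 4380-byte figure is the upper bound on the initial window in RFC 3390, min(4*MSS, max(2*MSS, 4380 bytes)). A minimal sketch in Python of how that works out to "two, three, or four segments"; the function name and the MSS values are illustrative choices of mine, not taken from the draft:

```python
def initial_window_bytes(mss: int) -> int:
    """Upper bound on the TCP initial window (RFC 3390), in bytes."""
    return min(4 * mss, max(2 * mss, 4380))


# Example MSS values only; the number of whole segments depends on the MSS.
for mss in (536, 1460, 2190):
    iw = initial_window_bytes(mss)
    print(f"MSS={mss}: initial window <= {iw} bytes ({iw // mss} segments)")
```

With a typical Ethernet MSS of 1460 bytes this gives the 3 segments (4380 bytes) assumed in the scenario further down.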

>So if all of the data transfers are at most 4380 bytes, then you are 
>right, there won't be effective end-to-end congestion control.
>
>In the worst case, the buffer at the congested link will overflow, and 
>some of the future arriving SYN/ACK packets will be dropped rather than 
>ECN-marked. So the worst case would be to revert to a Drop-Tail world, 
>losing the benefit of ECN.

Yes, I'd considered that. But the worst case is worse. Arriving packets 
from /other/ ECN-capable transfers sharing the link will also be dropped, 
not just arriving SYN/ACK packets from the server(s) that aren't responding 
to ECN notifications.

It's worse still. It's like the difference between Good-ECN and Bad-ECN 
under mild congestion in your note at 
<http://www.aciri.org/floyd/ecn/ecn_congestion.txt>, quoting:
"...The performance differences between "Good ECN" and "Bad ECN" TCP can be
similarly dramatic.  In an environment of mild congestion, the "Bad
ECN" TCP will receive much more bandwidth than the "Good ECN" TCP. ..."
It is likely that the short flows won't drive a link into serious 
congestion because the response of Good-ECNs will keep it in a state of 
mild congestion, which gives maximum benefit to the defective ECN+ flows.

This is because an ECN-capable router that is driven to drop will also be 
ECN-marking a high proportion of the ECN-capable packets it forwards. The 
rate of ECN-capable TCPs sharing the link will be driven down, while the 
short flows that erroneously claim to be ECN-capable will only respond to 
drops, not to ECN marks.

>So, as you say, the question is whether it is necessary to add a TCP 
>option or flag for the TCP initiator to say "I understand ECN-capable 
>SYN/ACK packets.". This would be simple to add, and I don't much care one 
>way or another, but I don't actually think it is necessary.  Why?  (1) 
>Because the worst case is to default to Drop-Tail in any case.

True, the possibility of servers saying they are ECN-capable but then not 
responding to ECN notifications is not the end of the world. But the 
pathological effects on other ECN flows shouldn't be dismissed lightly - 
see above.

>(2) Because I assume that clients will upgrade to ECN implementations that 
>understand ECN-Capable SYN/ACK packets as fast as servers will upgrade to 
>them, and as fast as routers will upgrade to deploying ECN.

On what evidence is this assumption based? It seems shaky to me.

>(3)  In the worst case, a server experiencing serious problems could turn 
>off ECN-Capability for SYN/ACK packets.

It wouldn't necessarily know it was causing nasty problems to others.


>But I am happy for it to be raised on the list, and if I am out-voted, 
>then we can add a TCP option or flag.  That would not be a big deal.  The 
>reason not to do it is that it would require either the added bytes for a TCP 
>option or the added cost of using one of the remaining TCP flags, and the 
>added overhead of checking the TCP option or flag on arriving SYN packets, 
>in perpetuity, for what seems to me to be a transient and not very serious 
>problem.  (And it would require changes to procedures that servers use for 
>not having to keep state for SYN packets, if the server wanted to use 
>ECN-Capability on SYN/ACK packets.  But I assume that would be for someone 
>else to take care of.)

Yes, I don't like the idea of a TCP flag for this at all. But this 
shouldn't stop us exploring all the potentially nasty scenarios that may 
arise if we don't have one.


>>"
>>Scenario: A server has deployed ECN+, but many clients of the server have 
>>deployed 3168 ECN but not ECN+. The traffic from the server may 
>>experience congestion, so a few percent of SYN-ACKs will get CE marked. 
>>However, the clients don't set ECN in their ACK of the SYN-ACK. On the 
>>next round, let's say the server opens its initial window to 3 segments 
>>(~4kB) on every connection. With the 3168 ECN clients, it will do this 
>>irrespective of whether congestion was experienced. The lack of 
>>congestion response within a RTT causes the congestion to worsen, so 
>>future data transfers see more CE or get driven into loss. If most 
>>connections continue for more than three data segments, as you say there 
>>will be a congestion response, but it won't be ideal.
>>
>>However, if most responses are satisfied within 3 segments, there will be 
>>absolutely no congestion response on most of the connections with legacy 
>>clients.
>>
>>According to, "File Size Distribution on UNIX Systems—Then and Now", by 
>>A.Tanenbaum (http://www.cs.vu.nl/~ast/publications/osr-jan-2006.pdf), the 
>>distribution of Web file sizes implies (by interpolation) that about 71% 
>>of Web transfers were less than 4380B in 2006:
>><4096B 70.64%
>><8192B 79.69%

I made another error; these stats are about the percentage of files in the 
file-system of a Web server, not the percentage of files served by the Web 
server. However, I guess this is indicative of the likely size of files served.
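
The "about 71%" figure comes from interpolating between the two quoted points of the file-size distribution. A small sketch of that arithmetic, assuming simple linear interpolation (the real distribution between 4096B and 8192B is not known from these two points, and the function name is just for illustration):

```python
def interpolate_cdf(x, x0, p0, x1, p1):
    """Linearly interpolate a cumulative percentage between two known points."""
    return p0 + (x - x0) * (p1 - p0) / (x1 - x0)


# Quoted points: 70.64% of files are < 4096B and 79.69% are < 8192B.
share = interpolate_cdf(4380, 4096, 70.64, 8192, 79.69)
print(f"approx. {share:.2f}% of files are smaller than 4380 bytes")
```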


>Just a nit - what matters most is not the fraction of *transfers* that are 
>short, but the fraction of *packets* that are from short transfers.

Indeed. That's not such a nit, because the large files in the long-tail 
represent a disproportionate contribution.

Pimple on your nit: what matters most is not the fraction of packets, but 
the fraction of bytes in short transfers. Because congestion is generally 
caused by bytes.
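
To illustrate how far apart the three measures can be, here is a rough sketch. The transfer sizes are entirely made up for the example (they are not measurements), and the MSS of 1460 bytes is an assumption:

```python
import math

MSS = 1460          # assumed segment size in bytes
SHORT_LIMIT = 4380  # a transfer that fits in the initial window

# Hypothetical transfer sizes in bytes: many short ones, a few large ones.
sizes = [500, 1200, 3000, 4000, 4000, 4300, 2000, 50_000, 200_000, 1_000_000]


def packets(size):
    """Number of full-or-partial segments needed to carry `size` bytes."""
    return math.ceil(size / MSS)


short = [s for s in sizes if s <= SHORT_LIMIT]

frac_transfers = len(short) / len(sizes)
frac_packets = sum(packets(s) for s in short) / sum(packets(s) for s in sizes)
frac_bytes = sum(short) / sum(sizes)

print(f"short transfers: {frac_transfers:.1%} of transfers, "
      f"{frac_packets:.1%} of packets, {frac_bytes:.1%} of bytes")
```

In this invented mix, 70% of the transfers are short, yet they account for only a few percent of the packets and even less of the bytes. Real traffic mixes would have to be measured.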


>>Even worse, it may be a server farm where the admin has upgraded all the 
>>servers to the latest version (e.g. of Linux), while most of the clients 
>>are using a release of Windows without ECN on SYN-ACKs.
>>
>>We didn't want to post this damaging scenario to the list in case you've 
>>taken this into account. But if this is a truly dangerous scenario, ECN+ 
>>will have to have a capability negotiation flag in the TCP options of the 
>>first SYN. This seems rather a small increment to use the scarce TCP 
>>option flags for, so you may have better ideas.
>>
>>You may also want to discuss how/whether ECN+ should be used on a T/TCP 
>>server (RFC1644), given a similar issue.
>
>I don't know much about T/TCP.  Could you say more?

I briefly looked into T/TCP many years ago. It carries data and the FIN in 
the SYN & SYN/ACK for data exchanges that are expected to be short (e.g. an 
RPC or a database query). It was intended for where UDP couldn't be used 
because repeating a query if no response was received would confuse the 
server (e.g. transactional databases). Geoff Huston did a good summary:
<http://www.cisco.com/web/about/ac123/ac147/ac174/ac195/about_cisco_ipj_archive_article09186a00800c83f8.html>

It has experimental status and no-one has taken it through further 
standardisation. This doesn't necessarily mean it isn't used, though. According 
to this characterisation study (conducted 10/03-01/04), 0.01% of packets at 
two of 11 measurement points used T/TCP (the rest rounded to 0.00%). Nearly 
as much as ECN ;) However, T/TCP may be more prevalent within enterprises.

Given ECN and ECN+ give most gain for short flows, I figured if anyone had 
gone to the bother of implementing T/TCP, they would most likely want to 
implement ECN+.

Also, ECN T/TCP clients talking to ECN+ T/TCP servers would raise similar 
issues to those in the main body of this mail if the response spanned 
more than one segment.

Cheers


Bob


____________________________________________________________________________
Bob Briscoe, <bob.briscoe@bt.com>      Networks Research Centre, BT Research
B54/77 Adastral Park,Martlesham Heath,Ipswich,IP5 3RE,UK.    +44 1473 645196  



