[tcpm] More TCP option space on SYNs
Bob Briscoe <bob.briscoe@bt.com> Sat, 31 May 2014 18:20 UTC
Date: Sat, 31 May 2014 19:19:53 +0100
To: Joe Touch <touch@isi.edu>
From: Bob Briscoe <bob.briscoe@bt.com>
In-Reply-To: <5388EB6F.4010405@isi.edu>
References: <20140425221257.12559.43206.idtracker@ietfa.amsl.com> <2586_1398464386_535ADF82_2586_915_1_535ADF56.9050106@isi.edu> <CF8D8E25-E435-4199-8FD6-3F7066447292@iki.fi> <5363AF84.8090701@mti-systems.com> <5363B397.8090009@isi.edu> <CAO249yeyr5q21-=e6p5azwULOh1_jUsniZ6YPcDYd69av8MMYw@mail.gmail.com> <DCC98F94-EA74-4AAA-94AE-E399A405AF13@isi.edu> <655C07320163294895BBADA28372AF5D2CFE36@FR712WXCHMBA15.zeu.alcatel-lucent.com> <20140503122950.GM44329@verdi> <655C07320163294895BBADA28372AF5D2D009E@FR712WXCHMBA15.zeu.alcatel-lucent.com> <201405221710.s4MHAY4S002037@bagheera.jungle.bt.co.uk> <537E3ACD.5000308@isi.edu> <1AD79820-22C1-4500-84D1-1383F264D68C@weston.borman.com> <201405231213.s4NCDa5P005525@bagheera.jungle.bt.co.uk> <537F8202.4020907@isi.edu> <201405281715.s4SHFMm0014634@bagheera.jungle.bt.co.uk> <538623B9.2060209@isi.edu> <201405301642.s4UGgcvY030471@bagheera.jungle.bt.co.uk> <5388EB6F.4010405@isi.edu>
Archived-At: http://mailarchive.ietf.org/arch/msg/tcpm/-QxOlgqe00Vr-1QvhZR4bYdB434
Cc: "tcpm@ietf.org" <tcpm@ietf.org>
Subject: [tcpm] More TCP option space on SYNs
Joe,

Thx for consolidating this thread. I've given it a new subject line.

1) You've silently made an important alteration to the proposed protocol. You've put the extra-options directly in the TCP option space of the C-SYN, not within the payload. This creates problems:
  a) it limits additional options to another 40B, beyond which we will need a third SYN, then a fourth.
  b) it perpetuates the deployment problem that every newly defined TCP option will have: when deployed in year y, there will be a new crop of middleboxes that only forward options defined up to year y-1.

I had deliberately squirrelled the options away in the app layer to:
  a) provide expansion space for options on a SYN, limited only by the max segment size
  b) reduce the chances that middleboxes will alter the extra options, given there is a higher bar to altering the payload.
  c) allow for future structured ways to make extra options invisible and/or immutable by middleboxes.

Yes, your altered proposal is cleaner. However, don't imagine I didn't think of this. I did and I deliberately didn't do it this way. We have a choice: clean and vulnerable vs. messy but robust.

I'm not wedded to using port 80 and http headers, but this is perhaps the most pragmatic approach. It will be really unorthodox to define such a protocol I know. We would have to say something like
  "The dst port of the C-SYN MUST be 80, and the payload MUST start with the constant magic_token, where
   magic_token = 'PUT / HTTP/1.1<CRLF>Connection : DSO<CRLF><CRLF>' "
I'm sorry if even thinking about this makes you feel dirty :|

Other suggestions for inner protocols are welcome, including tunnelled protocols, as long as middleboxes widely forward them, given their dst port.

2) The main problem with your notation is it doesn't say /where/ the info is placed. I've added notation as follows:
  TCP(base header [TCP options [APP(header[payload])]])
And for the record I've made the if-else logic clearer.
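
To make the C-SYN rule quoted in point 1 concrete, here is a minimal Python sketch of how a receiver might recognise such a segment. It only checks the two conditions stated above (dst port 80 and the magic_token prefix). The function name split_c_syn_payload is hypothetical, and how conn_id and extra_opt would be encoded after the token has not been specified in this thread, so the sketch just hands back the remaining bytes untouched.

```python
# Constant taken from the proposed wording above; <CRLF> is written as \r\n.
MAGIC_TOKEN = b"PUT / HTTP/1.1\r\nConnection : DSO\r\n\r\n"
C_SYN_DST_PORT = 80


def split_c_syn_payload(dst_port, payload):
    """Return the bytes that follow magic_token if the segment looks like
    a C-SYN under the proposed rule, otherwise None.

    Hypothetical helper: the layout of conn_id and extra_opt after the
    token is an open question, so nothing beyond the prefix is parsed.
    """
    if dst_port != C_SYN_DST_PORT:
        return None
    if not payload.startswith(MAGIC_TOKEN):
        return None
    return payload[len(MAGIC_TOKEN):]
```

A payload that starts with any other request line, or a SYN to any other port, is simply treated as ordinary data, which is the behaviour a legacy server would show anyway.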
Where I've made more than clarifying edits inline, I've described them and tagged them with [BB].

At 21:34 30/05/2014, Joe Touch wrote:
>Hi, Bob,
>
>Let's get back to the core, in a simpler fashion, so other can follow it.
>
>I stand by my "there's no way to extend the space in the initial
>SYN", but you've convinced me there *might* be a way to provide
>extended space that can occur during the first phase of the TWHS. I
>think the dual-SYN approach still isn't viable, but I've outlined an
>alternative below that's similar but doesn't have the same baggage, IMO.
>
>Again, I'm still concerned by what midboxes might do to this...
>
>What do others think??
>
>Joe
>
>For quick review, here's what I understand:
>
>         dso = dual-syn option
>         dso-D = data
>         dso-C = control
>         conn_id = identifier to link the two SYNs together
>         extra_opt = options that didn't fit in legacy SYN
>         fit_opt = options that do fit in the legacy SYN

new client endpoint sends
    TCP(port A SYN [dso-D(conn_id) + fit_opt] )
    TCP(port B SYN [dso-C [APP(headers [conn_id + extra_opt] ) ] ] )
        [BB]: i/APP(headers...)/

if (legacy server endpoint) {
    sends back two connections:
        TCP(port A SYN-ACK [fit_opt] )
        TCP(port B SYN-ACK [??] )
>             (it's interpretation of extra_opt)

    new client endpoint responds:
        TCP(port A ACK) (established)
        TCP(port B RST)

>         Notes about legacy servers:
>         - they do twice the work on SYNs
>         - they might keep twice the state
>           (if not using cookies)
>         - they might clean state if the RST
>           is received, but that state might
>           persist indefinitely (until the next
>           connection, depending on timeouts, etc.)
>
>         -----

} elif (new server endpoint) {
    sends back one connection:
        TCP(port A SYN-ACK [edo + fit_opt + extra_opt] )
            [BB]: s/dso-d/edo/

    new client endpoint responds:
        TCP(port A ACK) (established)
>
>         Notes:
>         - can stall when dso-D SYN arrives
>           before dso-C SYN, up to some limit
>         - twice the work on SYNs (or more)
}

>Here's what I was assuming, though admittedly it's not documented (yet):
>
>         - no significant impact on TCP connection rate for
>           legacy servers
>
>         - no significant impact on TCP connection rate for
>           legacy clients
>
>         - impact dominated by processing the extended option space
>           for extended clients
>
>         - impact dominated by processing the extended option space
>           for extended servers
>
>         - compatible with typical TCP processing optimizations,
>           notably SYN cookies
>                 you did provide a potential way forward for these
>
>         - capable of successfully traversing typical NATs
>
>Your approach has the following properties:

The 3 bullets below are not useful ways to describe performance impact. They selectively describe whichever gives the most pessimistic picture out of:
  a) either the instantaneous performance change at the moment of connection
  b) or the worst-case long-run performance impact
They don't describe the average long-run performance impact, which is important for sizing machines.

Worse, the instantaneous performance impact is only significant when a machine's SYN processing time is large relative to the e2e delay, which would be a highly unusual scenario on public networks (even in scenarios such as intra-data-centre, it's hard to reduce e2e delay to approach SYN processing time, but you could for intra-machine connections).

>         - halves the server connection rate for updated servers
>           from legacy clients when this option is in use

Eh?
The long-run server connection rate will be fractionally decreased due to updated clients using extra options (which is your third case below), but the instantaneous server connection rate seen by a legacy client is unchanged, because it only sends one SYN.

>         - lowers (to some extent, if not halves) the client
>           connection rate of updated clients to all servers
>           when this option is in use
>
>         - halves (roughly) the server rate for all servers
>           when this option is in use

Nope. All long-run server rates are reduced by a factor of (1+e), i.e. to 1/(1+e) of their former value, where e is the fraction of connections using extra options.

>It also:
>
>         - doubles the number of SYNs in the network

Nope. The number of SYNs in the network is inflated by a fraction e (i.e. by a factor of 1+e), where e is the fraction of connections using extra options.

>         - susceptible to lack of fate-sharing problems, e.g.,
>           if the two SYNs experience different firewall configurations

Nope. It's fairer to say it's potentially susceptible to second-order fate-sharing problems like your firewall example (the first-order fate sharing problems have been addressed).

>         - reduces the space available for fit_opt due to the need
>           for the conn_id even in the fall-back D-SYN, which means
>           less option space in the SYNs for fall-back connections

Yup.

>           the conn_id which may need to be very large because it
>           needs to be unique per source port and source IP address
>           because that information is lost during NAT translation

Given many NATs will typically make the src IPs of both SYNs the same, I suggest a larger conn_id should be a fall-back option for the client, not a default.

Even if the src IPs of both SYNs are different once they reach the server, the high end bits will invariably be the same. So the max size of the contents of the DSO TCP option can be 6B, and the server can take the rest of conn_id from the higher bits of the src IP addr of each SYN. This is a variant of the idea in <draft-wing-nat-reveal-option>.
In fact, the server doesn't even need a small conn_id for clients that know they are not behind a NAT and that want more option space in the D-SYN - then the server could use the src port & src IP for the conn_id.

To summarise, these options could be distinguished by the length field of the dual-syn option.
  Length = 2B
    => conn_id = src netaddr + src port
  Length = 6B = 2B + 4B conn_id_short
    => conn_id = src netaddr + conn_id_short
  Length = 8B = 2B + 6B conn_id_long
    => conn_id = hsb(src netaddr) + conn_id_long
Where
  * hsb(src netaddr) is the netaddr with the lowest 16 bits truncated
  * src netaddr is the network address (IPv4, IPv6, or any other network protocol)

Given there have been numerous other attempts to reveal a connection ID that is preserved through middleboxes [RFC6967], rather than defining a dual-syn option that carries a conn_id, we might want to design a TCP connection ID option with a flag to say whether it is also part of a dual SYN pair or not.

To reduce latency, a host could use the default conn_id_short for all connections at first, then:
  - if it finds that DSO persistently doesn't work it falls back to the conn_id_long for all connections
  - it occasionally tests the short and zero-length (Length = 2B) options to see if it can use shorter DSO options.

>         - requires the ISNs to be related (see RFC6528 - if there's
>           a rule to generate it, there will be code to validate that
>           rule, and eventually a BCP to encourage that validation -
>           typically from the same RFC author)

Eh? The ISNs can and should be independent. To be robust against middleboxes that rewrite sequence numbers, we must not require ISNs to be related.

>I agree that you have proposed potentially viable ways to deal with
>the SYN cookie, and that RST state is not an issue.

A feature that I think it's fair to add:
  - Good chance of passing through app-layer middleboxes that forward unrecognised TCP options unchanged, but not those that discard them.
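
As a rough illustration of the three dual-syn option lengths summarised above, the Python sketch below maps a received option to a conn_id. It is only a sketch under stated assumptions: the function name derive_conn_id and the tuple it returns are hypothetical, and Length is taken to be the usual TCP option Length field (kind + length + contents), so the contents are Length - 2 bytes. hsb() is implemented literally as "drop the lowest 16 bits of the source address".

```python
import ipaddress


def derive_conn_id(option_len, option_body, src_addr, src_port):
    """Map a dual-syn option to a conn_id using the three length cases.

    option_len  -- value of the option's Length field (2, 6 or 8)
    option_body -- contents after the kind and length bytes
    src_addr    -- source address as seen by the server (IPv4 or IPv6 text)
    src_port    -- source port as seen by the server
    The return value is a hypothetical tuple chosen for this sketch.
    """
    if len(option_body) != option_len - 2:
        raise ValueError("option contents do not match the Length field")
    addr = int(ipaddress.ip_address(src_addr))
    if option_len == 2:
        # No explicit id: client believes it is not behind a NAT.
        return ("addr+port", addr, src_port)
    if option_len == 6:
        # 4B conn_id_short combined with the full source address.
        return ("addr+short", addr, bytes(option_body))
    if option_len == 8:
        # 6B conn_id_long combined with hsb(src netaddr),
        # i.e. the address with its lowest 16 bits truncated.
        return ("hsb+long", addr >> 16, bytes(option_body))
    raise ValueError("unsupported dual-syn option length")
```

With Length = 8B, two SYNs that arrive from 192.0.2.10 and 192.0.2.77 with the same conn_id_long map to the same conn_id, because the two addresses differ only in their lowest 16 bits. With Length = 6B the same pair would not match, which is why the long form is the fall-back when the two SYNs end up with different source addresses.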
>However, there are too many problems with this, IMO, to call it viable.

Once your over-pessimistic analyses of the performance impact are corrected, and my ideas to reduce the size of the conn_id are taken into account, it's a different story. But it's up to the WG to decide whether this is worth taking further. Not just you or I.

>Here's another trick that might clean up the above a little:

<snip - I'll respond separately to your later updates on this ASO idea, with ACK=0>

Cheers

Bob

________________________________________________________________
Bob Briscoe, BT