Re: [tcpm] SYN extension using ACK=0 data packets

Bob Briscoe <bob.briscoe@bt.com> Sat, 31 May 2014 22:45 UTC

Date: Sat, 31 May 2014 23:45:34 +0100
To: Joe Touch <touch@isi.edu>
From: Bob Briscoe <bob.briscoe@bt.com>
Archived-At: http://mailarchive.ietf.org/arch/msg/tcpm/3sbMeJz0G6RzDzPfpgBi6WWkJIY
Cc: "tcpm@ietf.org" <tcpm@ietf.org>
Subject: Re: [tcpm] SYN extension using ACK=0 data packets

Joe,

Nothing further to add on this thread.
I do think you're seeing the ASO scheme thru rose-coloured spectacles.

Cheers


Bob

At 23:01 31/05/2014, Joe Touch wrote:
>Hi, Bob,
>
>On May 31, 2014, at 2:13 PM, Bob Briscoe <bob.briscoe@bt.com> wrote:
>
> > Joe,
> >
> > Hope it's ok to have changed the subject line, with tcpm still in cc.
> >
> > I'm afraid I'm not as excited about ACK=0 as you are. It's
> > certainly cleaner than anything we've come up with so far.
> >
> > However, I see the goal as finding a way to send a supplement to
> > a SYN that is invalid in some way to legacy TCP servers, but likely
> > to appear valid to many/most middleboxes. I suspect many
> > middleboxes and firewalls will discard the ACK=0 segment.
> >
> > When assessing the DSO scheme, you were adamant that anything
> > invalid to an endpoint would always eventually become invalid to
> > middleboxes. In the excitement of finding a nice clean way of doing
> > ASO, that critique seems to have suddenly become unimportant.
>
>I worry about checksums in particular - which get recalculated, so 
>using a different checksum means we're sure to fail through a 
>middlebox that doesn't recalculate it properly (if it's stored in a 
>new place), or that won't validate it (if it's in the current checksum field).
>
>I can't say that middleboxes don't check for ACK=0 packets, but it 
>seems a lot of work to do unless they're perceived to be an issue. I 
>doubt they look at the ACK at all unless the SYN or FIN is set.
>
> > Whatever, there may be a place for both solutions:
> > a) ASO (ACK=0) for paths where a NAT might be the worst middlebox
> > on the path
> > b) DSO for paths with stateful firewalls, TCP normalisers etc.
>
>DSO is better through stateful firewalls only when the FBP packet 
>precedes the SYN. Keep in mind that the FBP can also be sent 
>multiple times after the SYN anyway, with small (1ms) delays; only 
>the first will end up being used, and loss of all of them would 
>still be recoverable.
>
> > The cellular world is certainly more like (b).
>
>Cellular uses CGN (carrier-grade NATs), which could make DSO SYN 
>pairing more difficult.
>
> > Whether there are many parts of the public Internet left like (a)
> > is unclear. (b) seems to have become the norm for most paths.
>
>If we're counting numbers, there are a *lot* more hosts behind NATs. 
>The ones behind CGNs probably swamp all others combined.
>
> > More inline...
> >
> > At 01:45 31/05/2014, Joe Touch wrote:
> >> Hi, all,
> >>
> >> Some additional information below about "another trick" I
> >> proposed, inspired by Bob's dual-SYN mechanism, courtesy a few long
> >> discussions today with Ted Faber.
> >>
> >> I'll be glad to offer this as a potential solution in the doc.
> >>
> >> Joe
> >>
> >> On 5/30/2014 1:34 PM, Joe Touch wrote:
> >> ...
> >>> Here's another trick that might clean up the above a little:
> >>
> >> FWIW, I had explained it below as being based on sending
> >> out-of-window data; Ted pointed out that I had been assuming that
> >> the FBP ACK bit wasn't set - which means the sequence number might
> >> be more usefully matched to that of the SYN.
> >>
> >> See below...
> >>
> >>>     aso - after SYN option
> >>
> >>                length = 2 (just a flag)
> >>                length = 3 (or 4) indicating the length of the
> >>                        FBP expected
> >>
> >>>     FBP - front bumper packet (best I could do on names today)
> >>>         a packet
> >>                ISN = same as the associated SYN
> >>                ACK = cleared (i.e., a data packet NOT part of
> >>                synchronized connection - see why that's useful below)
> >
> > For the avoidance of doubt,
> > I assume these two packets have the same src port.
>
>Same IP addresses, same ports, *because* they're associated with the 
>same connection.
>
> > and I assume the FBP has SYN=0.
>
>Yes. All control bits are 0, including ACK.
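For illustration only, the FBP header described so far (same ports as the SYN, sequence number equal to the SYN's ISN, every control bit clear) could be sketched as below. The function name build_fbp_header is hypothetical, the aso and extra_opt contents are omitted, and the checksum is left as a zero placeholder that a real sender would have to fill in correctly.

```python
import struct


def build_fbp_header(src_port: int, dst_port: int, syn_isn: int,
                     window: int = 0) -> bytes:
    """Pack a bare 20-byte TCP header for a hypothetical FBP segment."""
    data_offset_words = 5                 # 20-byte header, no options in this sketch
    offset_byte = data_offset_words << 4  # data offset sits in the upper 4 bits
    flags = 0x00                          # URG, ACK, PSH, RST, SYN, FIN all clear
    ack_number = 0                        # carries no meaning while ACK=0
    checksum = 0                          # placeholder only (assumption of this sketch)
    urgent_pointer = 0
    return struct.pack(
        "!HHIIBBHHH",
        src_port, dst_port,
        syn_isn & 0xFFFFFFFF,             # same sequence number as the associated SYN
        ack_number, offset_byte, flags,
        window, checksum, urgent_pointer,
    )


header = build_fbp_header(40000, 80, 0x12345678)
print(len(header), header[13])            # header length and the flags byte
```

The flags byte is the one a middlebox or legacy endpoint would inspect; here it is zero by construction.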
>
> > Coincidentally, a couple of weeks ago, when I was looking for
> > places to find more bits in the TCP header, I noted that during an
> > established connection ACK=0 is never used, so I looked into
> > setting ACK=0 and overloading the ack_number field.
> >
> >
> >>>     new endpoint sends:
> >>>
> >>>         SYN + aso + fix_opt
> >>>         FBP + aso + extra_opt
> >>
> >>                extra_opt in the data field
> >>                total length < min MTU (576 for IPv4, 1280 for IPv6)
> >>                again ACK bit is zero
> >>
> >>>             legacy endpoint sends back one connection:
> >>>                 SYN-ACK + fix_opt
> >>>
> >>>                 if seg arrives before SYN,
> >>
> >>                        it is silently dropped, because
> >>                        the ACK bit is clear (this is
> >>                        explicit in RFC793)
> >>
> >>>                 if seg arrives after SYN,
> >>
> >>                        it is silently dropped, because
> >>                        the ACK bit is clear (this is also
> >>                        explicit in RFC793)
> >
> > Two important questions:
> > 1) are most/all legacy TCP implementations faithful to the spec?
>
>That's something I'll take a look at.
>
> > 2) if a TCP endpoint is meant to drop SYN=0, ACK=0, then many
> > middleboxes surely will.
>
>Middleboxes drop things whose checksum fails, or when a packet comes 
>in for a connection that hasn't been established. They don't go out 
>of their way to do a lot of other work AFAICT.
>
> > Both these will need experimental testing.
>
>Certainly.
>
> > Nit: I wouldn't exactly say RFC793 is clear. You have to follow 2
> > pages of quite imprecise descriptive logic, starting from p66. But,
> > yes it is eventually fairly unambiguous. Paraphrasing, it says:
> >
> > SEGMENT ARRIVES
> >        ...
> >        if state = SYN-SENT
> >                1. Check ACK bit
> >                        if ACK = 1
> >                                do stuff
> >                2. Check RST bit
> >                        if RST = 1
> >                                if ACK OK enter CLOSED state
> >                                elif no ACK, drop
> >                3. Check security & precedence
> >                        Do stuff
> >                4. Check SYN bit
> >                        Should only get here if ACK OK, or no ACK and no RST
> >                        if SYN = 1
> >                                do stuff
> >                5. if SYN = 0 and RST = 0
> >                        drop
>
>In CLOSED, it says to send a RST (like any other packet for a non-connection).
>
>In LISTEN, it says:
>
>         Any other control or text-bearing segment (not containing SYN)
>         must have an ACK and thus would be discarded by the ACK
>         processing.  An incoming RST segment could not be valid, since
>         it could not have been sent in response to anything sent by this
>         incarnation of the connection.  So you are unlikely to get here,
>(where 'here' is a segment with ACK=0 and no other control bits set)
>         but if you do, drop the segment, and return.
>
>In SYN-SENT, it says:
>
>       fifth, if neither of the SYN or RST bits is set then drop the
>       segment and return.
>
>In all other states, it says:
>
>       if the ACK bit is off drop the segment and return
>
>If that's not direct and explicit, I don't know what is.
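As a rough summary of the per-state behaviour quoted above, the following sketch maps each RFC 793 state to what a legacy endpoint is expected to do with a segment that has ACK=0 and no other control bits set. The function name is hypothetical, and the sketch deliberately ignores the sequence-number acceptability check that RFC 793 applies first in the synchronized states; it covers only the cases listed in this message.

```python
# States other than CLOSED, as named in RFC 793.
NON_CLOSED_STATES = {
    "LISTEN", "SYN-SENT", "SYN-RECEIVED", "ESTABLISHED",
    "FIN-WAIT-1", "FIN-WAIT-2", "CLOSE-WAIT", "CLOSING",
    "LAST-ACK", "TIME-WAIT",
}


def legacy_response_to_fbp(state: str) -> str:
    """Expected legacy reaction to a segment with every control bit clear."""
    if state == "CLOSED":
        # No connection exists: answered with a RST like any other stray packet.
        return "send RST"
    if state in NON_CLOSED_STATES:
        # LISTEN, SYN-SENT and all later states: the segment is dropped.
        return "drop"
    raise ValueError(f"unknown TCP state: {state}")


for s in ("CLOSED", "LISTEN", "SYN-SENT", "ESTABLISHED"):
    print(s, "->", legacy_response_to_fbp(s))
```

Whether deployed stacks actually follow this table is the open question that needs testing.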
>
> >>>             ----
> >>>
> >>>             new endpoint sends back one connection:
> >>>
> >>>                 SYN-ACK + options + ....
> >>>
> >>>             a) if FBP arrives before SYN,
> >>
> >>                        it can be silently dropped, but
> >>                        it's probably useful for new endpoints
> >>                        to hold onto these (without action)
> >>                        for a while; they can be silently
> >>                        discarded if there are too many
> >>                        (which will just result in a
> >>                        retransmission and an extra RTT)
> >>
> >>>             b) if FBP arrives with the SYN, they can be
> >>>             processed together
> >
> > By 'arrives with' I assume you mean 'within some time duration'.
>
>Yes - I was thinking of the case where the FBP is either cached (as 
>per (a)), or is in the segment queue at the time the SYN is being processed.
>
> >>>                 the SYN-ACK can include responses to
> >>>                 the extra_opts in addition to the
> >>>                 fix_opts, and says "FBP received"
> >>>
> >>>
> >>>             c) if FBP arrives after the SYN:
> >>>
> >>>                 SYN-ACK proceeds, but sends
> >>>                 back "wait for option response".
> >>>
> >>>                 at this point, the source re-sends FBP
> >>>                 until an ACK is received that indicates
> >>>                 "FBP received", or times-out as with
> >>>                 any connection that doesn't finish TWHS
> >
> > There's a dilemma whether the server:
> > * prioritises latency and goes ahead without the extra options,
> > * or prioritises completeness and blocks until the extra options arrive.
>
>There's no dilemma - if the SYN says "FBP is coming", then it means 
>that an upgraded server MUST wait for the FBP.
>
>I.e., the FBP is paired with the SYN-ACK, and the handshake doesn't 
>complete at the client - the client should NOT send the final ACK of 
>the TWHS - until the SYN-ACK is received *and* confirmation that the 
>FBP has been received. That can happen together inside the SYN-ACK 
>if known, or the SYN-ACK would say "FBP missing" and the client 
>would retransmit the FBP *until* the server confirms it (it would 
>send an option confirmation, not an ACK).
>
> > I would take the view that if the FBP is late there's a chance it
> > got snarled up in a middlebox, and it will never get through no
> > matter how many times it's retransmitted.
>
>The client can timeout, and can retry without the extended SYN space 
>in that case.
>
> > Rather than make the choice between latency and completeness at
> > protocol design time, we need a flag in either scheme (on the DSO
> > or ASO option) for the client to say which it wants. Then:
> > * the client can choose latency if it knows the extra_opts are
> > not critical.
> > * otherwise it can choose completeness, and if it doesn't work
> > after a retransmit or two, the client won't block for ever; it can
> > work out the compromise set of options that fit in a single SYN but
> > still 'work'.
>
>A flag that says "do what a legacy server would do if you have to 
>wait X ms" seems fine to me too.
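One way to picture the client-selected flag discussed here is the sketch below, which decides what an upgraded server puts in its SYN-ACK. The names Preference and server_synack_plan are hypothetical, as are the returned labels; neither draft defines such a flag, so this is an assumption-laden illustration rather than a proposal.

```python
from enum import Enum


class Preference(Enum):
    LATENCY = "latency"            # client says extra_opts are not critical
    COMPLETENESS = "completeness"  # client wants the server to wait for the FBP


def server_synack_plan(fbp_received: bool, preference: Preference) -> str:
    """Pick the SYN-ACK content for an upgraded server (illustrative labels)."""
    if fbp_received:
        # SYN and FBP are both in hand, so every option can be answered at once.
        return "respond to fix_opt and extra_opt"
    if preference is Preference.LATENCY:
        # Behave as a legacy server would: answer only what fitted in the SYN.
        return "respond to fix_opt only"
    # Completeness: tell the client the FBP is still missing so it resends it.
    return "ask client to resend FBP"


print(server_synack_plan(False, Preference.LATENCY))
print(server_synack_plan(False, Preference.COMPLETENESS))
```

The time-bounded variant ("wait X ms, then act like a legacy server") would add a timer in front of the LATENCY branch.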
>
> >>>             I'm still thinking as to whether the ACK number
> >>>             might indicate whether FBP has been received,
> >>
> >> There are a few ways to handle this, but IMO the best is to have:
> >>
> >>                SYN-ACK aso with length=2 means "waiting for FBP"
> >>
> >>                SYN-ACK aso with length=3 (or 4) could mean
> >>                "got the FBP", or might even indicate how
> >>                many bytes the FBP extension contained
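A minimal sketch of how a client might act on the aso length in the SYN-ACK, following the encoding suggested above (length 2 means the server is still waiting for the FBP; length 3 or 4 means it arrived). The function name is hypothetical, and treating a SYN-ACK with no aso option as a legacy server is an assumption of this sketch.

```python
from typing import Optional


def client_next_step(synack_aso_length: Optional[int]) -> str:
    """Decide the client's next move from the aso option in the SYN-ACK."""
    if synack_aso_length is None:
        # Assumption: no aso in the SYN-ACK means a legacy server answered.
        return "complete handshake without extra options"
    if synack_aso_length == 2:
        # Server is waiting for the FBP: withhold the final ACK and resend it.
        return "hold final ACK and resend FBP"
    if synack_aso_length in (3, 4):
        # Server confirmed the FBP, so the three-way handshake can finish.
        return "send final ACK"
    raise ValueError(f"unexpected aso length: {synack_aso_length}")


for length in (None, 2, 3, 4):
    print(length, "->", client_next_step(length))
```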
> >>
> >> Note - the SYN-ACK and all subsequent segments can assume that
> >> aso == edo, i.e., they can use the same EDO as spec'd in version 01
> >> of this draft.
> >>
> >>> This is cleaner as follows:
> >>>
> >>>     - no need for conn_id coordination
> >>>
> >>>     - no need for conn_id to consume option space for fall-back
> >>>
> >>>     - avoids double-load for legacy servers
> >
> > With a typical legacy server that reflects SYN cookies, less
> > activity is doubled:
> > * my scheme: sends out two SYN cookies
> > * your scheme: sends out one SYN cookie and drops one packet,
> > probably after a long chain of logic, because the FBP is an unusual packet.
>
>The activity involved in parsing a packet is small compared to that 
>of setting up a SYN cookie or creating TCB state (where SYN cookies 
>aren't used).
>
> >>>     - no problem with fate-sharing
> >
> > To be as consistently pessimistic as you were with my scheme, I
> > think you mean that you have created machinery to synthesise fate
> > sharing between a connection and an invalid segment, and it may not
> > work in cases that you may not have thought of.
>
>Your scheme uses two different SYNs on different ports - that's a 
>recipe for fates NOT being shared.
>
>Mine uses two segments on the same addresses with the same ports - 
>if they're not fate-shared, then neither will the rest of the TCP 
>connection be, and TCP won't work (if that fate matters, e.g., 
>through different NATs).
>
> >>>     - traverses a NAT just fine
> >
> > As above, surely TCP normalising middleboxes and incoming
> > stateful firewalls at the server end are likely to discard the FBP.
>
>I don't think so, but I agree we'll have to try. This is a 
>completely different situation, IMO, than expecting boxes that 
>*already* are known to validate TCP checksums to do otherwise. In 
>this case, we don't *know* the solution won't work from the start.
>
> >>> Upgraded servers still need to wait for the 'seg', but they could get
> >>> that retransmitted if necessary.
> >
> > See discussion of latency vs completeness dilemma above.
>
>Latency vs. completeness is true for all variants that use multiple 
>packets - that's one reason I don't like that aspect of any of these 
>approaches.
>
> >> And there's a small amount of additional processing to discard
> >> the FBP at legacy endpoints, but silently discarding one packet per
> >> connection doesn't seem like a huge effort to me.
> >
> > Despite this being an exceptional drop (see earlier), this is
> > still a reasonably fair statement.
>
>And I'm glad to concede that we don't know the cost on the code 
>cache, etc. of exercising a dusty, dark path of processing.
>
> >> Note - there's no special RST processing, based on Ted's
> >> observation about "data without ACK" being considered "data sent in
> >> the unsynchronized state" -- something legacy TCP explicitly silently discards.
> >
> > ...at least in theory.
>
>And in requirements. Yes, I'll see if I can find out what BSD and 
>Linux do - though I have more hope of BSD being correct than Linux.
>
>Joe

________________________________________________________________
Bob Briscoe,                                                  BT