Re: [tcpm] New Version Notification for draft-touch-tcpm-tcp-edo-01.txt

Bob Briscoe <bob.briscoe@bt.com> Wed, 28 May 2014 17:15 UTC

Date: Wed, 28 May 2014 18:13:24 +0100
To: Joe Touch <touch@isi.edu>
From: Bob Briscoe <bob.briscoe@bt.com>
In-Reply-To: <537F8202.4020907@isi.edu>
References: <20140425221257.12559.43206.idtracker@ietfa.amsl.com> <2586_1398464386_535ADF82_2586_915_1_535ADF56.9050106@isi.edu> <CF8D8E25-E435-4199-8FD6-3F7066447292@iki.fi> <5363AF84.8090701@mti-systems.com> <5363B397.8090009@isi.edu> <CAO249yeyr5q21-=e6p5azwULOh1_jUsniZ6YPcDYd69av8MMYw@mail.gmail.com> <DCC98F94-EA74-4AAA-94AE-E399A405AF13@isi.edu> <655C07320163294895BBADA28372AF5D2CFE36@FR712WXCHMBA15.zeu.alcatel-lucent.com> <20140503122950.GM44329@verdi> <655C07320163294895BBADA28372AF5D2D009E@FR712WXCHMBA15.zeu.alcatel-lucent.com> <201405221710.s4MHAY4S002037@bagheera.jungle.bt.co.uk> <537E3ACD.5000308@isi.edu> <1AD79820-22C1-4500-84D1-1383F264D68C@weston.borman.com> <201405231213.s4NCDa5P005525@bagheera.jungle.bt.co.uk> <537F8202.4020907@isi.edu>
Archived-At: http://mailarchive.ietf.org/arch/msg/tcpm/RuqQtgLAbNhspmx7wSx9f0azYYw
Cc: David Borman <dab@weston.borman.com>, "tcpm@ietf.org" <tcpm@ietf.org>
Subject: Re: [tcpm] New Version Notification for draft-touch-tcpm-tcp-edo-01.txt

Joe,

At 18:14 23/05/2014, Joe Touch wrote:


>On 5/23/2014 5:13 AM, Bob Briscoe wrote:
>>David,
>>
>>1) Parallel control channel
>>___________________________
>>Client A sends two SYNs back-to-back to an existing well-known port
>>(e.g. 80).
>
>You can send in whatever order you want; packets will be reordered, 
>lost, and sent along alternate paths.

Of course.

But as I suggested initially, we can standardise the protocol so that 
an upgraded host synthesises shared fate. E.g. a 2-octet TCP option on 
the D-type SYN that says "Hold for x [ms] to wait for the 
supplementary C-type SYN", where x is a lot less than the usual time 
you hold SYN connection state (e.g. x=2). If SYN C hasn't arrived by 
then, continue without it, and discard it if it arrives later.

And if the C-SYN arrives first, hold it for y [ms] waiting for the 
corresponding D-SYN, where for example y=2 as well.
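
To make the hold rule concrete, here is a minimal Python sketch of the 
pairing logic, not a definitive design. It assumes both SYNs carry some 
common pairing key (hypothetical; how to derive one is not settled here, 
given the two SYNs use different source ports). The class and method 
names are mine, purely for illustration.

```python
from dataclasses import dataclass, field

HOLD_MS = 2  # example value from the text (x = y = 2 ms)


@dataclass
class SynPairer:
    """Pairs D-type and C-type SYNs that share a (hypothetical) pairing key."""
    hold_ms: float = HOLD_MS
    # key -> (kind, arrival time in ms) for the half still waiting for its partner
    waiting: dict = field(default_factory=dict)
    # D-SYNs that already continued alone; a late C-SYN for them is discarded
    # (a real stack would have to age these entries out)
    gave_up: set = field(default_factory=set)

    def on_syn(self, key, kind, now_ms):
        """Handle an arriving SYN; kind is 'D' or 'C'. Returns the action taken."""
        if kind == "C" and key in self.gave_up:
            self.gave_up.discard(key)
            return "discard C (D already continued)"
        held = self.waiting.get(key)
        if held is not None and held[0] != kind:
            del self.waiting[key]
            return "open D with extra options from C"
        self.waiting[key] = (kind, now_ms)
        return f"hold {kind}"

    def on_tick(self, now_ms):
        """Expire held SYNs whose timer has run out. Returns (key, action) pairs."""
        actions = []
        for key, (kind, t0) in list(self.waiting.items()):
            if now_ms - t0 >= self.hold_ms:
                del self.waiting[key]
                if kind == "D":
                    self.gave_up.add(key)
                    actions.append((key, "open D without C"))
                else:
                    actions.append((key, "discard C"))
        return actions
```

Either SYN may arrive first; whichever comes first is held, and only a 
D-SYN ever proceeds to full connection state.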

>FWIW, do these use the same source port and ISN?
>         - if they do, it'll reset the connection
>         - if they don't, you're now limiting the number of
>         concurrent connections to roughly half:
>         http://www.isi.edu/touch/pubs/infocomm99/infocomm99-web/

Unless we can think of a way for the C-SYN to be discarded by a 
legacy TCP stack (but not a middlebox), I'm assuming we need to get 
the C-SYN up to the legacy app-layer before discard. So, I was 
assuming they use different source ports (otherwise, as you say, a 
legacy TCP stack could reset the D-SYN connection if it arrived second).

The ISN on the C-SYN is redundant - we might be able to think of 
another use for that field.

On an upgraded host, max concurrent connections would only be 
slightly impacted, because of the very small timeout on C-SYNs.

The semantics of the option-space-extension option would be that the 
C-type SYN is only held for a timeout, and only the D-type SYN creates 
full connection state (I'm leaving SYN cookie behaviour for another 
day - this is a straw man).

Admittedly, the number of concurrent connections a /legacy/ host can 
support could reduce by up to half (if all remote TCP clients are 
sending the new option). But they have the choice of upgrading to the 
new stack to stop wasting their memory.
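
As a rough sanity check on those two claims, a back-of-envelope sketch 
in Python. It assumes the worst case on a legacy host, where the C 
connection occupies a connection slot for as long as its D connection 
does, and it uses Little's law (arrival rate times hold time) for the 
upgraded host. The function names and example figures are illustrative 
only.

```python
def legacy_capacity(slots, frac_new_clients):
    """Data connections a legacy host can hold if each new-style client
    also occupies one slot with its C connection (worst-case assumption)."""
    return slots / (1 + frac_new_clients)


def upgraded_extra_state(syn_rate_per_s, hold_ms):
    """Average number of C-SYNs held at once on an upgraded host
    (Little's law: arrival rate x hold time)."""
    return syn_rate_per_s * hold_ms / 1000.0
```

E.g. legacy_capacity(10000, 1.0) gives 5000.0, the "up to half" case, 
while an upgraded host seeing 10,000 SYN pairs per second with a 2 ms 
hold carries about 20 held C-SYNs on average.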


>>* SYN D, establishes a regular data connection, with sufficient TCP
>>options to be workable but they still fit within the existing 40B option
>>limit.
>>* SYN C establishes another parallel connection to the same well-known
>>port that looks like regular data from the outside (it could even be an
>>extension to HTTP to ensure middleboxes will let it pass), but it talks
>>a new app-layer 'TCP control' protocol inside.
>
>What happens when they arrive out of order? What happens when you 
>get D but not yet C? How long do you wait for C?

See above.
The timeouts might be standardised, or they might be declared in the 
option as a hint (for the host to ignore if it is under stress).
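
A sketch of how the two variants might look on the wire, in Python. 
This is only an illustration under assumptions: the option kind 253 is 
a placeholder from the experimental range, not an assigned value; the 
one-octet hint in milliseconds is my invention for the example; and I 
take "ignore under stress" to mean "don't hold at all".

```python
import struct

OPT_KIND_HOLD = 253  # placeholder experimental kind, not an assigned value


def encode_hold_option(hint_ms=None):
    """2 octets (kind, length) if the timeout is standardised,
    3 octets if the sender declares a hold time in ms as a hint."""
    if hint_ms is None:
        return struct.pack("!BB", OPT_KIND_HOLD, 2)
    return struct.pack("!BBB", OPT_KIND_HOLD, 3, hint_ms)


def choose_hold_ms(option, standard_ms=2, under_stress=False):
    """Receiver side: decide how long to hold a SYN for its partner."""
    kind, length = option[0], option[1]
    if kind != OPT_KIND_HOLD or length not in (2, 3) or length != len(option):
        raise ValueError("not a well-formed hold option")
    if under_stress:
        return 0  # assumption: a stressed host ignores the hint and does not wait
    return option[2] if length == 3 else standard_ms
```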


>This is the problem with dual-stack approaches - new endpoints 
>penalize legacy endpoints if there's a stall, and undermine new 
>endpoints if they don't.

Have I satisfied you that this can be solved sufficiently? Bearing in 
mind the gain, it's reasonable to have to accept some pain.


>>If there is no support for the new app-layer protocol on port 80 the
>>control channel just shuts down with a suitable HTTP error, while SYN D
>>has opened a data connection with sufficient TCP options to be workable.
>>If the new app-layer TCP control protocol is supported on port 80, the
>>parallel control channel (C) adds unlimited additional control
>>flexibility to the data channel (D) with hardly any added latency.
>>
>>Establishing a similar control channel in the opposite direction would
>>be fairly trivial.
>>
>>There are few, if any, middlebox problems with the above approach.
>>However, there are certainly other problems, but no more insurmountable
>>than all the problems that have already been discussed with taking the
>>'easy' route of EDO:
>>* A secure binding would have to be added to bind channel C to a secret
>>known only to the originator of channel D, otherwise it would open up
>>data channels to spoof control channel attacks. This binding could be
>>built on a TCP-AO option in channel D.
>
>Yes, that's another problem.

Fairly straightforward to solve using standard techniques.
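
For instance, one standard technique would be a keyed MAC. The Python 
sketch below is not the TCP-AO-based binding itself, just an 
illustration. It assumes the two ends already share a secret (as 
TCP-AO's pre-shared keys would provide) and that some identifier of 
connection D survives the path; what goes into d_conn_id is an open 
question, since addresses and ports would not survive a NAT. All names 
here are hypothetical.

```python
import hashlib
import hmac


def binding_tag(secret, d_conn_id):
    """Tag carried on channel C, computed over an identifier of connection D.
    Both arguments are bytes; d_conn_id is a hypothetical identifier."""
    return hmac.new(secret, d_conn_id, hashlib.sha256).digest()


def c_is_bound_to_d(secret, d_conn_id, tag):
    """Receiver check: accept C only if its tag matches D's identifier."""
    return hmac.compare_digest(tag, binding_tag(secret, d_conn_id))
```

A spoofed control channel that doesn't know the secret cannot produce 
a matching tag for someone else's data channel.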


>>* Channel C would need some way to refer to the segments of channel D
>>that was robust against re-segmentation.
>
>Which means it won't work in the current Internet, because 
>resegmentation is also widespread (though evil, IMO).

Well, resegmentation isn't usually a problem on a SYN anyway.

We can't improve on the general pain caused by resegmentation. I was 
only talking about the extra pain that my strawman would suffer: 
a resegmenting function that sees data on my SYN but doesn't 
understand my strawman won't patch up the damage to my strawman 
protocol, whereas it would have patched up a regular single SYN with 
a payload by altering sequence numbers.

I think this is a corner case.


>>* The main problem is that the two channels don't share fate; a control
>>packet can be delayed relative to the point in the data stream at which
>>it is attempting to exert control, possibly for a RTT if it is lost and
>>has to be retransmitted. However, this is not insurmountable. The
>>control protocol could include a mode to "synthesise shared fate", by
>>making the data channel buffer data until an associated control segment
>>had arrived. This would duplicate the latency impact of a loss or delay
>>on either channel, but one can imagine mitigations that would consign
>>this latency impact to corner cases.
>>* It's a bit of a mess, but that comes with the territory when trying to
>>fix legacy protocol problems.
>>* The internal stack architecture seems to require a trombone back down
>>into the kernel from user-space, but that is not insurmountable - a shim
>>within the kernel on port 80 (for example) could redirect control
>>channel data across to the "TCP control channel module" in the kernel,
>>while passing non-control channel connections to user-space.
>>
>>2) Build on LOIC
>>______________________
>>Long option with invalid checksum <draft-yourtchenko-tcp-loic-00>
>
>Won't work through current NATs, which won't recalculate the 
>checksum properly.

I'm building on the general idea of using something invalid, not 
using the checksum idea specifically.



>>At 18:53 22/05/2014, John Leslie wrote:
>>>    That's too big of a change to ask folks to believe it safe.
>>
>>When I read an idea, I don't take it as set in stone and just find a
>>hole and dismiss it. I see it as a potential stepping stone to a
>>solution and think about how it could be done better. In fact, Andrew
>>Yourtchenko said that was the intention of his write-up of LOIC.
>>
>>I believe that an approach worth further thought would be a mixture of
>>the control channel idea and the invalid checksum idea. I'm thinking of:
>>* a pure control SYN (C) sent first, then a base SYN (D) sent
>>back-to-back, both to the same port.
>
>Again, please don't assume back-to-back means anything.

See above.
Actually, even though order isn't guaranteed, it is important to get 
them in the right order to optimise performance (much as TCP doesn't 
assume perfect order, but it goes smoothest with reasonable ordering).

>>* SYN C would contain something invalid to cause a legacy TCP stack or
>>legacy app to discard it (and hopefully less probability that a
>>middlebox would), e.g. a payload that is invalid for the application
>>protocol on the port.
>
>But so will a NAT, etc.

No. You can design a payload that has headers that a NAT will ignore 
but an end-host will have to process. E.g. HTTP connection control 
headers, newly defined for this protocol, that NATs would ignore and 
that trigger no other HTTP behaviour, so a legacy host does nothing 
at the app layer.

>>* there would be additional TCP options in the payload of SYN C to be
>>added to the TCP options that arrived separately on the base SYN
>>* The control SYN could be bound cryptographically to the base SYN (as
>>already described).
>>* It could use the shim-like control stack arrangement described earlier.
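
On the point above about adding the options in the SYN C payload to 
those on the base SYN, a minimal Python sketch of the combining step. 
It assumes the payload reuses the ordinary TCP option encoding (kind, 
length, data, with the single-octet End of Option List and 
No-Operation kinds) and that the extra options are simply appended; 
both are assumptions for illustration, not part of any draft.

```python
def parse_options(buf):
    """Split a TCP options byte string into (kind, data) pairs."""
    opts, i = [], 0
    while i < len(buf):
        kind = buf[i]
        if kind == 0:          # End of Option List
            break
        if kind == 1:          # No-Operation padding, single octet
            i += 1
            continue
        if i + 1 >= len(buf):
            raise ValueError("truncated option")
        length = buf[i + 1]    # length counts the kind and length octets too
        if length < 2 or i + length > len(buf):
            raise ValueError("bad option length")
        opts.append((kind, bytes(buf[i + 2:i + length])))
        i += length
    return opts


def combined_options(d_header_opts, c_payload_opts):
    """Options from SYN D's header followed by those carried in SYN C's payload."""
    return parse_options(d_header_opts) + parse_options(c_payload_opts)
```

The payload half is not bound by the 40B header limit, which is the 
whole point of carrying it there.
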
>>
>>By focusing solely on extending the SYN, this would avoid the ongoing
>>shared fate problems that a separate control channel suffers throughout
>>the connection. There would still be shared fate problems with 2 SYNs
>>(e.g. the two SYNs get re-ordered), but the protocol would have to be
>>designed to be robust to that (naively, SYN D could include a new TCP
>>option that told a new stack to wait a few ticks for a SYN C, but that
>>would be vulnerable to meddleboxes). Not insurmountable.
>
>AFAICT, it is.

Still?


Bob



>*with a nod to Raiders 3.

________________________________________________________________
Bob Briscoe,                                                  BT