Re: [tcpm] New Version Notification for draft-touch-tcpm-tcp-edo-01.txt

Joe Touch <touch@isi.edu> Fri, 30 May 2014 20:36 UTC

Message-ID: <5388EB6F.4010405@isi.edu>
Date: Fri, 30 May 2014 13:34:55 -0700
From: Joe Touch <touch@isi.edu>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Thunderbird/24.5.0
MIME-Version: 1.0
To: Bob Briscoe <bob.briscoe@bt.com>
References: <20140425221257.12559.43206.idtracker@ietfa.amsl.com> <2586_1398464386_535ADF82_2586_915_1_535ADF56.9050106@isi.edu> <CF8D8E25-E435-4199-8FD6-3F7066447292@iki.fi> <5363AF84.8090701@mti-systems.com> <5363B397.8090009@isi.edu> <CAO249yeyr5q21-=e6p5azwULOh1_jUsniZ6YPcDYd69av8MMYw@mail.gmail.com> <DCC98F94-EA74-4AAA-94AE-E399A405AF13@isi.edu> <655C07320163294895BBADA28372AF5D2CFE36@FR712WXCHMBA15.zeu.alcatel-lucent.com> <20140503122950.GM44329@verdi> <655C07320163294895BBADA28372AF5D2D009E@FR712WXCHMBA15.zeu.alcatel-lucent.com> <201405221710.s4MHAY4S002037@bagheera.jungle.bt.co.uk> <537E3ACD.5000308@isi.edu> <1AD79820-22C1-4500-84D1-1383F264D68C@weston.borman.com> <201405231213.s4NCDa5P005525@bagheera.jungle.bt.co.uk> <537F8202.4020907@isi.edu> <201405281715.s4SHFMm0014634@bagheera.jungle.bt.co.uk> <538623B9.2060209@isi.edu> <201405301642.s4UGgcvY030471@bagheera.jungle.bt.co.uk>
In-Reply-To: <201405301642.s4UGgcvY030471@bagheera.jungle.bt.co.uk>
Archived-At: http://mailarchive.ietf.org/arch/msg/tcpm/TGI_LALLl5oRBNx-zkUJxOQolmw
Cc: David Borman <dab@weston.borman.com>, "tcpm@ietf.org" <tcpm@ietf.org>
Subject: Re: [tcpm] New Version Notification for draft-touch-tcpm-tcp-edo-01.txt

Hi, Bob,

Let's get back to the core, in a simpler fashion, so others can follow it.

I stand by my "there's no way to extend the space in the initial SYN", 
but you've convinced me there *might* be a way to provide extended space 
that can occur during the first phase of the TWHS. I think the dual-SYN 
approach still isn't viable, but I've outlined an alternative below 
that's similar but doesn't have the same baggage, IMO.

Again, I'm still concerned by what midboxes might do to this...

What do others think??

Joe

For quick review, here's what I understand:

		dso = dual-syn option
			dso-D = data
			dso-C = control
		conn_id = identifier to link the two SYNs together
		extra_opt = options that didn't fit in legacy SYN
		fit_opt = options that do fit in the legacy SYN

	new endpoint sends
		port A SYN + dso-D + conn_id + fit_opt
		port B SYN + dso-C + conn_id + extra_opt

			legacy endpoint sends back two connections:
				port A SYN-ACK + fit_opt
				port B SYN-ACK + ??
				(its interpretation of extra_opt)
			new endpoint responds:
				port A ACK (established)
				port B RST

			Notes about legacy servers:
				- they do twice the work on SYNs
				- they might keep twice the state
				(if not using cookies)
				- they might clean state if the RST
				is received, but that state might
				persist indefinitely (until the next
				connection, depending on timeouts, etc.)

			-----

			new endpoint sends back one connection:
				port A SYN-ACK + dso-D + ....
			
			Notes:
				- can stall when dso-D SYN arrives
				before dso-C SYN, up to some limit
				- twice the work on SYNs (or more)
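
To make the stall concrete, here is a minimal sketch of how an upgraded server might pair the two SYNs. It is an illustration only: the `Syn` and `DualSynServer` names, the "D"/"C" encoding, keying on conn_id alone, and the `max_pending` limit are all assumptions, not anything specified in the proposal.

```python
from dataclasses import dataclass, field


@dataclass
class Syn:
    port: int
    kind: str            # "D" (dso-D) or "C" (dso-C); hypothetical encoding
    conn_id: int         # links the two SYNs together
    options: tuple = ()  # fit_opt on the D-SYN, extra_opt on the C-SYN


@dataclass
class DualSynServer:
    max_pending: int = 1024                      # assumed cap on stalled SYNs
    pending: dict = field(default_factory=dict)  # conn_id -> first SYN seen

    def on_syn(self, syn):
        """Return the merged options once both SYNs are in, else None."""
        other = self.pending.pop(syn.conn_id, None)
        if other is None or other.kind == syn.kind:
            # First half of the pair (or a retransmission): stall it,
            # unless the table of stalled SYNs is already full.
            if len(self.pending) < self.max_pending:
                self.pending[syn.conn_id] = syn
            return None
        # Both halves present: fit_opt first, then extra_opt.
        data, ctrl = (syn, other) if syn.kind == "D" else (other, syn)
        return data.options + ctrl.options
```

Every SYN costs a table lookup, and each half that arrives first holds state until its partner shows up, which is where the "twice the work (or more)" comes from.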

Here's what I was assuming, though admittedly it's not documented (yet):

	- no significant impact on TCP connection rate for
	legacy servers

	- no significant impact on TCP connection rate for
	legacy clients

	- impact dominated by processing the extended option space
	for extended clients

	- impact dominated by processing the extended option space
	for extended servers

	- compatible with typical TCP processing optimizations,
	notably SYN cookies
		you did provide a potential way forward for these

	- capable of successfully traversing typical NATs

Your approach has the following properties:

	- halves the server connection rate for updated servers
	from legacy clients when this option is in use

	- lowers (to some extent, if not halves) the client
	connection rate of updated clients to all servers
	when this option is in use

	- halves (roughly) the server rate for all servers
	when this option is in use

It also:

	- doubles the number of SYNs in the network

	- is susceptible to problems from a lack of fate-sharing, e.g.,
	if the two SYNs experience different firewall configurations

	- reduces the space available for fit_opt due to the need
	for the conn_id even in the fall-back D-SYN, which means
	less option space in the SYNs for fall-back connections

	the conn_id may also need to be very large: it has to be
	unique per source port and source IP address, because that
	information is lost during NAT translation

	- requires the ISNs to be related (see RFC6528 - if there's
	a rule to generate it, there will be code to validate that
	rule, and eventually a BCP to encourage that validation -
	typically from the same RFC author)
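
The option-space cost of the conn_id can be put in numbers. The sketch below assumes the conn_id travels in a single TCP option with the usual one-byte kind and one-byte length, and the conn_id sizes tried are purely illustrative; the proposal does not fix either. The 40-byte limit follows from the 4-bit data offset field (at most 60 header bytes, 20 of them fixed).

```python
MAX_TCP_HEADER = 60      # 4-bit data offset counts 32-bit words: 15 * 4
FIXED_TCP_HEADER = 20    # header without options
OPTION_SPACE = MAX_TCP_HEADER - FIXED_TCP_HEADER   # 40 bytes in any segment

# Sizes of options commonly carried in a SYN (kind + length + value).
COMMON_SYN_OPTS = {"mss": 4, "wscale": 3, "sack_permitted": 2, "timestamps": 10}


def remaining_for_fit_opt(conn_id_bytes, tlv_overhead=2):
    """Option bytes left in the D-SYN after the conn_id option (assumed TLV)."""
    return OPTION_SPACE - (tlv_overhead + conn_id_bytes)


def spare_after_common_opts(conn_id_bytes):
    """Bytes left once the common SYN options above are also included."""
    return remaining_for_fit_opt(conn_id_bytes) - sum(COMMON_SYN_OPTS.values())


if __name__ == "__main__":
    for size in (4, 8, 16):   # illustrative conn_id sizes, not from the draft
        print(size, remaining_for_fit_opt(size), spare_after_common_opts(size))
```

Under these assumptions a 16-byte conn_id leaves 22 bytes for fit_opt, and only 3 once the common options are in, before any padding.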

I agree that you have proposed potentially viable ways to deal with the 
SYN cookie, and that RST state is not an issue.

However, there are too many problems with this, IMO, to call it viable.

>> Ultimately, it's roughly equivalent to "try both and shut down the one
>> you don't want" -- the dual-stack approach we already know about.
>
> That's a bit of a wild dismissal of a scheme that so far seems to have
> pretty useful properties.

I said it was roughly equivalent - if you call that "wild dismissal", so 
be it.

Here's another trick that might clean up the above a little:

	aso - after SYN option

	FBP - front bumper packet (best I could do on names today)
		a packet with a sequence number BEFORE the ISN of
		the SYN

	new endpoint sends:

		SYN + aso + fit_opt
		FBP + aso + extra_opt + seqno-outsidewindow

			legacy endpoint sends back one connection:
				SYN-ACK + fit_opt

				if FBP arrives before SYN, legacy
				will send a RST - which the new endpoint
				can ignore

				if FBP arrives after SYN, it'll be
				discarded as out of window

			----

			new endpoint sends back one connection:

				SYN-ACK + options + ....

			a) if FBP arrives before SYN, it will generate
			a RST - which the new endpoint will ignore

			b) if FBP arrives with the SYN, they can be
			processed together

				the SYN-ACK can include responses to
				the extra_opts in addition to the
				fit_opts, and says "FBP received"

			c) if FBP arrives after the SYN:

				SYN-ACK proceeds, but sends
				back "wait for option response".

				at this point, the source re-sends FBP
				until an ACK is received that indicates
				"FBP received", or times-out as with
				any connection that doesn't finish TWHS

			I'm still thinking as to whether the ACK number
			might indicate whether FBP has been received,
			e.g., send back ACK = ISN -1 until FBP,
			at which point it sends back ISN. The connection
			proceeds when ISN is ACK'd.

This is cleaner as follows:

	- no need for conn_id coordination

	- no need for conn_id to consume option space for fall-back

	- avoids double-load for legacy servers

	- no problem with fate-sharing

	- traverses a NAT just fine

Upgraded servers still need to wait for the FBP, but they could get 
that retransmitted if necessary.

It causes additional processing for legacy endpoints, but not 
necessarily in a way that decreases the rate, because it's not trying to 
start a second connection; the RSTs sent back are lightweight, as is the 
additional ACK.

----