[tae] Notes from Thursday morning "proto-BOF" at IETF 73

Bryan Ford <baford@mpi-sws.org> Tue, 25 November 2008 22:29 UTC

To kick off this mailing list, I wanted to write up and post a few  
notes from the discussion some of us had Thursday morning at 9AM while  
illegally squatting in the IESG breakout room.  I can't honestly call  
these "minutes" since they don't come close to being complete, and as  
far as I know no one was taking detailed minutes, but if anyone has  
more or complementary notes from this or other related discussions,  
please do post them.

A few generally relevant pieces of background reading, for the record:
- J. Rosenberg, "UDP and TCP as the New Waist of the Internet Hourglass"
	TSVAREA presentation in IETF 71
	http://tools.ietf.org/html/draft-rosenberg-internet-waist-hourglass-00
- B. Ford and J. Iyengar, "Breaking Up the Transport Logjam"
	TSVAREA presentation in IETF 73
	http://www.bford.info/pub/net/logjam-abs.html
- D. Thaler, "Evolution of the IP Model"
	Plenary presentation in IETF 73
	http://tools.ietf.org/html/draft-iab-ip-model-evolution-00

My own "agenda items" for discussion, only some of which we got around  
to:
- UDP encapsulation of transports: TCP, SCTP, DCCP, and new transports.
- How an "Endpoint Layer" could/should evolve beyond UDP
- Negotiation of new transport functionality.  Two sub-issues:
	1. A "new" negotiation mechanism to negotiate among "new" transports
	2. A "meta-negotiation" mechanism to negotiate between, say,
	   plain old TCP and the new mechanism supporting many alternatives.
- How the proposed "Flow Regulation Layer" might work, and might evolve
- Relationship of the proposed Endpoint/Flow/Semantic layering scheme to:
	- Location/identity split (e.g., how HIP, shim6, et al. fit in)
	- Making transports & applications address-oblivious
	  (see David Thaler's plenary talk, and Christian Vogt's talk in RRG
	  on a "Hostname-Oriented Network Protocol Stack")

Matt Mathis brought up the pragmatic issue of transitioning to a new  
transport negotiation mechanism like the "Meta-SYN", while retaining  
the ability to fall back transparently on "plain old" TCP for backward  
compatibility.  He suggested one cool idea for solving this problem,  
as follows.  The initiator sends two packets back-to-back: the first  
is the "Meta-SYN" containing the SYNs of all the alternative  
transports, including one for TCP; the second is a "raw" copy of the  
plain TCP SYN.  Legacy responders will just black-hole or otherwise  
reject the Meta-SYN and use the following TCP SYN.  Assuming the two  
packets don't get reordered, though, a "new" responder will see the  
Meta-SYN first, and use the most preferred SYN in the bundle to  
initiate the desired transport connection (or perhaps return a  
challenge for the appropriate transport).  At the same time (while  
processing the Meta-SYN), the responder also sets up a temporary PCB  
matching the bundled TCP SYN, marked "inactive" or "immediate
time-wait", to prevent the following plain TCP SYN from creating a real
TCP connection.
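
A minimal sketch of the responder side of this scheme, assuming the two packets arrive in order. Every name here (Syn, MetaSyn, NewResponder, the "inactive" set standing in for the temporary PCB) is hypothetical; the discussion did not define a wire format or an API.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Syn:
    transport: str   # e.g. "tcp", "sctp", "sst"
    flow: tuple      # (src_ip, src_port, dst_ip, dst_port)

@dataclass(frozen=True)
class MetaSyn:
    syns: tuple      # one bundled SYN per alternative transport

@dataclass
class NewResponder:
    preference: list                                   # most preferred first
    connections: dict = field(default_factory=dict)    # flow -> transport
    inactive_tcp: set = field(default_factory=set)     # placeholder PCBs

    def on_meta_syn(self, meta):
        offered = {s.transport: s for s in meta.syns}
        for transport in self.preference:
            if transport in offered:
                chosen = offered[transport]
                self.connections[chosen.flow] = transport
                break
        else:
            return None  # nothing in common; let the raw TCP SYN proceed
        # Remember the bundled TCP SYN so its raw copy is not acted on.
        if "tcp" in offered:
            self.inactive_tcp.add(offered["tcp"].flow)
        return chosen.transport

    def on_tcp_syn(self, syn):
        if syn.flow in self.inactive_tcp:
            return "ignored"  # already handled via the Meta-SYN
        self.connections[syn.flow] = "tcp"
        return "tcp"

class LegacyResponder:
    def __init__(self):
        self.connections = {}

    def on_meta_syn(self, meta):
        return None  # unknown packet type: black-holed or rejected

    def on_tcp_syn(self, syn):
        self.connections[syn.flow] = "tcp"
        return "tcp"

flow = ("10.0.0.1", 40000, "10.0.0.2", 80)
meta = MetaSyn((Syn("sctp", flow), Syn("tcp", flow)))
raw = Syn("tcp", flow)

new = NewResponder(preference=["sctp", "tcp"])
print(new.on_meta_syn(meta), new.on_tcp_syn(raw))

old = LegacyResponder()
print(old.on_meta_syn(meta), old.on_tcp_syn(raw))
```

The sketch does not model reordering: if the raw TCP SYN overtakes the Meta-SYN, this responder would simply open a plain TCP connection.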

Even before this discussion, Tim Shepard brought up the issue of  
finding enough free bits in a TCP SYN packet to negotiate _any_ new  
option or feature, given the problem that the TCP option space is  
already almost full just given the features that have become pretty  
much "mandatory" with modern TCP stacks: e.g., SACK, PAWS timestamp,  
etc.  He suggested we could recover 80 bits to use "somehow" by  
reusing: (a) the 32-bit ack number field, whose contents are undefined
in a SYN packet (i.e., when the ACK flag is not set); (b) the 16-bit
urgent pointer, whose contents are undefined when the URG flag is not
set; and (c) the 32-bit PAWS timestamp echo field, which is defined as  
"supposed to be zero" (but hopefully nobody checks it :) ) before  
there's a timestamp to echo.

Tim summarized this idea in the meeting, and we then discussed various  
ways this idea might be combined with Matt's Meta-SYN negotiation  
idea: e.g., Tim's 80 magic bits in the TCP SYN might contain (the  
first 80 bits of) the SHA-1 hash of a magic statement saying "this
connection <possibly fill in session ID> wants to support the new
Meta-SYN protocol", or even just a hash of the Meta-SYN packet itself or
some nonce in the Meta-SYN, allowing the responder to definitively  
associate the "fallback" TCP SYN with the (already processed) Meta-SYN  
and avoid creating a redundant TCP connection, without actually having  
to create a temporary TIME-WAIT TCB as in Matt's original idea.
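
As a rough illustration of the "hash of the Meta-SYN packet" variant: the first 80 bits (10 bytes) of a SHA-1 digest split exactly into a 32-bit, a 16-bit, and a 32-bit value, matching the three reusable fields. The function names and the choice to hash the whole Meta-SYN are assumptions made for this sketch; nothing here was specified in the meeting.

```python
import hashlib
import struct

def magic_bits(meta_syn_bytes):
    """Derive (ack_number, urgent_pointer, ts_echo) from a Meta-SYN.

    Takes the first 10 bytes of the SHA-1 digest and unpacks them in
    network byte order as 32 + 16 + 32 = 80 bits.
    """
    digest = hashlib.sha1(meta_syn_bytes).digest()
    return struct.unpack("!IHI", digest[:10])

def matches_meta_syn(meta_syn_bytes, ack_number, urgent_pointer, ts_echo):
    """Responder-side check: does this fallback TCP SYN carry the magic
    bits derived from a Meta-SYN we have already processed?"""
    return magic_bits(meta_syn_bytes) == (ack_number, urgent_pointer, ts_echo)

meta_syn = b"example Meta-SYN payload"   # placeholder bytes, not a real format
ack, urg, tsecr = magic_bits(meta_syn)
print(matches_meta_syn(meta_syn, ack, urg, tsecr))
```

A legacy responder should simply ignore these fields, since the ACK and URG flags are not set in the SYN.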

In brainstorming about how a "Meta-SYN" transport negotiation scheme  
might be generalized, I suggested considering a dedicated "negotiation
protocol" that could efficiently negotiate multiple protocol stack  
layers at once (e.g., transport, transport-layer-security, session,  
application...), for example by representing alternative protocol  
stacks or stack fragments explicitly via a directed acyclic graph,  
which the two nodes involved in the negotiation take turns expanding  
and pruning until they arrive at a single agreed-upon stack  
configuration.
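
One possible way to picture the graph idea, heavily simplified: the initiator offers a DAG of stack alternatives, the responder prunes what it cannot support, and the initiator picks the first surviving path. The representation (a dict of successor lists in preference order, with "START" and "END" markers) and the function names are assumptions for illustration only, and the sketch collapses the back-and-forth into a single round.

```python
def prune(offer, supported):
    """Drop unsupported nodes, then drop nodes that can no longer reach END."""
    ok = set(supported) | {"START", "END"}
    kept = {n: [m for m in succ if m in ok]
            for n, succ in offer.items() if n in ok}
    changed = True
    while changed:
        changed = False
        for n in list(kept):
            live = [m for m in kept[n] if m == "END" or m in kept]
            if not live:
                del kept[n]
                changed = True
            elif live != kept[n]:
                kept[n] = live
                changed = True
    return kept

def first_path(dag, node="START"):
    """Return the most preferred complete stack as a list, or None."""
    if node == "END":
        return []
    for nxt in dag.get(node, []):
        rest = first_path(dag, nxt)
        if rest is not None:
            return ([] if nxt == "END" else [nxt]) + rest
    return None

# Initiator's offer: transport -> security -> application alternatives.
offer = {
    "START": ["sst", "sctp", "tcp"],
    "sst": ["tls"],
    "sctp": ["dtls"],
    "tcp": ["tls"],
    "dtls": ["http"],
    "tls": ["http"],
    "http": ["END"],
}

# The responder lacks SST and DTLS, so only tcp/tls/http survives.
pruned = prune(offer, supported={"sctp", "tcp", "tls", "http"})
print(first_path(pruned))
```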

On the topic of picking a few specific "alternative transports" with  
which to start testing and experimenting with such mechanisms, Matt  
suggested:
- SCTP, because it's already well-understood, implemented in popular  
kernels, and will highlight a lot of issues
- "Jumbo TCP" from RFC 1263, "TCP Extensions Considered Harmful"

As a shameless plug, I would add to this list my own experimental  
Structured Stream Transport (SST), which is designed to be  
semantically backward compatible with TCP but much more efficient in  
supporting many short-lived and/or highly concurrent streams
(http://www.bford.info/pub/net/sst-abs.html).  But SST is still very early-experimental and doesn't yet have a
kernel implementation.

We discussed a bit why we wouldn't just want to use DNS service  
records for transport negotiation, as Joe Touch suggested during my  
presentation.  Two issues we identified with this approach:
- Because of the way DNS SRV records work now, the client would have
to do several DNS lookups, one for each transport: e.g.,
_foo._tcp.bar, _foo._udp.bar, etc.
- In practice, the client will probably still want to try all of the
alternative transports if any of them run directly atop IP, since
NATs/firewalls in the network are likely to blackhole attempts to
connect via transports they don't recognize.
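
To make the first point concrete, here is a small sketch that lists the SRV owner names a client would have to query, following the RFC 2782 convention `_service._proto.name`. The service "foo", the domain "bar", and the transport list are placeholders; no actual DNS queries are issued.

```python
def srv_query_names(service, domain, transports):
    """Return one SRV owner name per candidate transport."""
    return [f"_{service}._{proto}.{domain}" for proto in transports]

# One lookup per transport the client is willing to try.
for name in srv_query_names("foo", "bar", ["tcp", "udp", "sctp", "dccp"]):
    print(name)
```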

Bob Briscoe and Matt Mathis suggested that in my proposed new layering  
scheme (separating out port numbers into an "Endpoint Layer" and  
congestion control into a "Flow Layer"), what remains of the Transport  
Layer shouldn't still be called the Transport Layer, to avoid  
confusion.  I suggested "Semantic Layer" or "Reliability Layer" as  
possible alternatives.

Other related projects or ideas that people brought up at some point  
during the discussion:
- Related to UDP encapsulation: a project called "shim4", by Robert Hancock
- Related to Flow Layer segmentation: the Internet 2 "Phoebus" project
  (http://e2epi.internet2.edu/phoebus.html)
- Related to "Meta-SYN" and flexible negotiation: Richard Hayton,
  "FlexiNet - a flexible component oriented middleware system"
  (http://www.ansa.co.uk/ANSATech/ANSAhtml/98-ansa/external/9809Conf/9809fnrp.pdf)

Sorry if I forgot anything particularly important, or if I  
misremembered or misrepresented anyone's statements!

Cheers,
Bryan
