Re: [Plus] Supporting connection tracking and basic diagnostics: a minimal PLUS

Brian Trammell <ietf@trammell.ch> Mon, 12 December 2016 09:44 UTC

From: Brian Trammell <ietf@trammell.ch>
In-Reply-To: <CAKcm_gMzw9Z=XnFy9vYfxOOG+eO6XJ2bSNeboQYjxW+MOUwa9g@mail.gmail.com>
Date: Mon, 12 Dec 2016 10:44:19 +0100
Message-Id: <494DE77E-7066-4399-B610-200EB49BAB4E@trammell.ch>
References: <83EEC537-3486-4864-ACA2-911F570D0C57@trammell.ch> <15261501-1F9C-41CA-87D0-4E8FCD862044@trammell.ch> <CAKcm_gMzw9Z=XnFy9vYfxOOG+eO6XJ2bSNeboQYjxW+MOUwa9g@mail.gmail.com>
To: Ian Swett <ianswett@google.com>
X-Mailer: Apple Mail (2.3124)
Archived-At: <https://mailarchive.ietf.org/arch/msg/plus/qH8RKzo2aMnpmS3RUZ55xmS6QH8>
Cc: plus@ietf.org
Subject: Re: [Plus] Supporting connection tracking and basic diagnostics: a minimal PLUS

hi Ian,

> On 09 Dec 2016, at 16:40, Ian Swett <ianswett@google.com> wrote:
> 
> I have been discussing the first option with people for a while, and everyone has indicated it's relatively simple to implement, potentially even in hardware, and provides quite a bit of extra useful information (RTT, approximate bandwidth) over QUIC's status quo.
> 
> I believe packet numbers being in order makes the first case easier to implement and also provides some valuable extra information (it's easy to infer upstream loss and reordering), so I think it's definitely a better approach.
> 
> Also, given an equal number of bytes spent on the echo, I think they both have about the same fuzziness properties.  For example, if 2 bytes were used in the first case, assuming this echo must be monotonically increasing, an ack or packet gap of over 65,000 packets would have to occur for the signal to be misinterpreted.  If we're getting 65,000 packet gaps, we might be outside the realm of QUIC's wire format to help you diagnose your network issues, to put it mildly.

Indeed.

However, there are two issues to consider here:

(1) ways in which the PSN/echo can fail in terms of exposing information for measurement. Agreed, 16 bits is more than enough here.

(2) ways in which the PSN/echo can fail in terms of exposing information for on-path state maintenance, and this is the fuzziness I was referring to. On further consideration, it's not clear to me this is all that important, though: if you have sufficient bits in the connection ID (and the connection ID appears on each packet), then the additional bits of resistance to off-path injection that window plausibility checks give you probably aren't worth the effort.

(You do need plausible values for PKT and ECHO on close bits, but since the close packet should be the last packet in sequence in each direction, there's no fuzziness in the window at all).
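
As a rough illustration of why 16 bits is enough for measurement, here is a minimal sketch that reconstructs a full-width packet number from a truncated 2-byte echo. It assumes, as Ian does above, that the echo only ever increases. The function name and the 16-bit width are illustrative choices, not part of any proposal.

```python
MOD = 1 << 16  # assumed 2-byte echo field


def unwrap_echo(last_full: int, observed16: int) -> int:
    """Return the smallest value >= last_full whose low 16 bits equal observed16.

    The result is correct only while the true value has advanced by fewer
    than 65536 since last_full; a larger gap is silently misread.
    """
    delta = (observed16 - last_full) % MOD
    return last_full + delta


# A wrap from 65530 to the 16-bit value 4 is read as an advance of 10.
print(unwrap_echo(65530, 4))  # prints 65540
```

A gap of 65536 or more aliases onto a smaller advance, which is the misinterpretation case mentioned above.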

Cheers,

Brian

> 
> 
> On Fri, Dec 9, 2016 at 6:55 AM, Brian Trammell <ietf@trammell.ch> wrote:
> Greetings, all,
> 
> We were sitting around a whiteboard yesterday, thinking about (1) how to implement the signals required to drive the state machine outlined in the -plus-statefulness-01 draft, and (2) how to provide on-path diagnosability as we're discussing in the thread on Juho's blog post. Indeed, we think these two use cases are by far the most important for PLUS, in that they provide equivalent-to-TCP support for basic operations and troubleshooting practices for encrypted, UDP-encapsulated transports like QUIC, and one could implement either of them in QUIC directly without much disruption.
> 
> We came up with two implementation sketches, both of which use the same header fields to drive the association and confirmation signals as well as basic measurability.
> 
> 
> The simpler one is pretty much the design Jana alluded to in his previous message. It looks kind of like TCP on the wire, and has basically the same properties. It uses three header fields: a connection ID which appears in packets of both directions and is chosen by the connection initiator; a packet serial number (PSN) whose initial value in each direction is chosen randomly by the sender (like the TCP sequence number, and as under discussion for QUIC) and is incremented by one for each packet sent regardless of content or lack thereof (like the QUIC packet number); and a maximum packet serial echo, which is the highest packet number received by the sender before a packet was sent.
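
The three header fields described above can be sketched as endpoint state, under stated assumptions: the 32-bit PSN width, the class and field names, and the way the connection ID is handed to both sides are all hypothetical, since the message does not fix any of them.

```python
import secrets
from dataclasses import dataclass
from typing import Optional

PSN_MOD = 1 << 32  # assumed field width; not specified in the message


@dataclass
class Header:
    conn_id: int
    psn: int
    echo: Optional[int]  # None until this side has received a packet


class Endpoint:
    def __init__(self, conn_id: int):
        self.conn_id = conn_id                      # chosen by the initiator
        self.next_psn = secrets.randbelow(PSN_MOD)  # random initial PSN
        self.max_received: Optional[int] = None

    def send(self) -> Header:
        # The PSN advances by one for every packet, whatever it carries.
        hdr = Header(self.conn_id, self.next_psn, self.max_received)
        self.next_psn = (self.next_psn + 1) % PSN_MOD
        return hdr

    def receive(self, hdr: Header) -> None:
        # Track the highest PSN seen (wraparound is ignored for brevity).
        if self.max_received is None or hdr.psn > self.max_received:
            self.max_received = hdr.psn
```

Because the echo is the maximum PSN received rather than a cumulative acknowledgement, a lost packet does not hold the echo back.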
> 
> This provides an association signal on the initial echo of the initiator's PSN, and a confirmation signal on the initiator's initial echo of the responder's PSN -- just like the ack numbers on the SYN/ACK and ACK legs of the TCP handshake. The connection ID here simply provides additional bits of protection against completely off-path attempts to force the state machine to tick over. Note there's no need for SYN or ACK flags -- the association and confirmation signals continually demonstrate that each side has seen packets from the other side.
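
A minimal sketch of how an on-path device might derive the two signals from PSN and echo alone, assuming the flow has already been matched on connection ID. The class name, the "initiator"/"responder" labels, and the unbounded sets are illustrative simplifications; a real device would keep only a bounded window of recent PSNs.

```python
class SignalObserver:
    """Tracks association and confirmation for a single flow."""

    def __init__(self):
        self.seen = {"initiator": set(), "responder": set()}
        self.associated = False
        self.confirmed = False

    def observe(self, sender: str, psn: int, echo) -> None:
        other = "responder" if sender == "initiator" else "initiator"
        self.seen[sender].add(psn)
        # An echo only counts if it names a PSN actually seen from the other side.
        if echo is not None and echo in self.seen[other]:
            if sender == "responder":
                self.associated = True   # responder echoed the initiator's PSN
            elif self.associated:
                self.confirmed = True    # initiator echoed the responder's PSN
```

No SYN or ACK flag appears anywhere in the sketch: every later packet carrying a valid echo repeats the same evidence.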
> 
> A stop signal is considered authentic if it has a correct connection ID and a plausible PSN. This is path-verifiable, but provides no protection against on-path or path-side injection attacks against state on middleboxes (though, note that since even unencrypted headers are authenticated, the endpoint can always detect an attempt to inject a stop). A variation of the mechanism described in statefulness-01 would use a two-way stop signal: a stop is only considered valid along the path if one endpoint sends a stop signal in reply to the other endpoint's stop signal. This would make the path-side injection much harder to perform: in order to remove state on a given middlebox (presuming said middlebox isn't stupid about accepting packets from anywhere), the attacker would need to be able to inject packets on the interfaces facing both endpoints.
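
The two-way stop can be sketched as follows, assuming the connection ID and PSN plausibility checks have already passed. The names are hypothetical, and `arrival_side` stands for the interface the packet arrived on, which models a middlebox that is not "stupid about accepting packets from anywhere".

```python
class StopTracker:
    def __init__(self):
        self.stop_seen = {"initiator": False, "responder": False}

    def observe_stop(self, claimed_sender: str, arrival_side: str) -> bool:
        """Record a stop signal; return True once state may be removed."""
        # Ignore a stop that arrives on the interface facing the wrong endpoint.
        if claimed_sender != arrival_side:
            return False
        self.stop_seen[claimed_sender] = True
        # Valid only when each endpoint has sent a stop of its own.
        return all(self.stop_seen.values())
```

An attacker on one side of the middlebox can set at most one of the two flags, so the state survives.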
> 
> One-point measurement of the PSN and echo streams gives you two-sided RTT and upstream loss and reordering. Coordinated two-point analysis gives you a lot more. As noted, though, two-point analysis is far more complex.
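
A minimal sketch of the one-point RTT measurement, assuming an observer that sees both directions of a flow and timestamps each packet. The side labels "a" and "b" and all names are illustrative. Each measured component includes the endpoint's own delay before it sends the echoing packet.

```python
class RttObserver:
    def __init__(self):
        self.first_seen = {"a": {}, "b": {}}   # per sender: PSN -> time observed
        self.component = {"a": None, "b": None}

    def observe(self, sender: str, t, psn: int, echo) -> None:
        other = "b" if sender == "a" else "a"
        self.first_seen[sender].setdefault(psn, t)
        # If this echo names a PSN seen from the other side, the elapsed time
        # is the round trip between the observer and `sender`.
        passed_at = self.first_seen[other].pop(echo, None)
        if passed_at is not None:
            self.component[sender] = t - passed_at

    def rtt(self):
        """End-to-end RTT as the sum of both components, or None if unknown."""
        if None in self.component.values():
            return None
        return self.component["a"] + self.component["b"]
```

Upstream loss and reordering fall out of the same PSN stream: a gap in the PSNs from one side indicates loss upstream of the observer, and a PSN lower than the highest already seen indicates reordering.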
> 
> This approach has the advantage of being extremely simple (it meets the "someone with wireshark could reverse-engineer this in an hour or so" requirement), and very close to what's there in QUIC right now. If implemented with a small number of possible header layouts (preferably one) the wire image could be trivially offloaded to hardware.
> 
> It has all of the same disadvantages as SEQ/ACK tracking in TCP, and the same resistance to off-path meddling that TCP sequence numbers provide, though it does give better RTT indication on lossy links (since the echo is always the max packet number, not the highest continuous ack), and two-side stop is better than RST at resisting path-side attacks. The definition of "plausible" next PSN or "plausible" echo when seen at a midpoint device is fuzzy, which could lead to difficult-to-debug problems with middleboxes that try to drop packets with implausible values (as some state-tracking TCP firewalls do now).
> 
> 
> The second one replaces the packet number and the echo with a token and a nonce. The connection initiator chooses an initial random token and a nonce. The connection responder applies a function to the token and nonce to generate its own token, and chooses a random nonce. Each side generates a new token from the token and nonce it receives each time the token it receives changes. Like the simpler implementation, this one provides continual association and confirmation signals. It also provides one-point measurement of RTT, since the token change is an RTT-clocked signal. The RTT clock would also behave in odd ways under high-reordering situations, and additional complexity (which involves remembering a few past tokens, but which we didn't work through) would be needed to fix that.
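
A rough sketch of the token and nonce exchange. The message does not say which function is applied, so truncated SHA-256 is purely an assumption here, as are the 8-byte field size, the fixed per-endpoint nonce, and every name in the code.

```python
import hashlib
import secrets

TOKEN_LEN = 8  # assumed field size


def derive_token(token: bytes, nonce: bytes) -> bytes:
    # Stand-in for the unspecified function over (token, nonce).
    return hashlib.sha256(token + nonce).digest()[:TOKEN_LEN]


class TokenEndpoint:
    def __init__(self, initiator: bool):
        self.nonce = secrets.token_bytes(TOKEN_LEN)
        # Only the initiator starts with a random token of its own.
        self.token = secrets.token_bytes(TOKEN_LEN) if initiator else None
        self.last_received = None

    def receive(self, token: bytes, nonce: bytes) -> None:
        # Derive a new token only when the received token changes.
        if token != self.last_received:
            self.last_received = token
            self.token = derive_token(token, nonce)

    def header(self):
        return self.token, self.nonce
```

Since both inputs are visible on the wire, an on-path device can recompute `derive_token` to check each change, and the interval between successive changes in one direction is roughly one RTT.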
> 
> The token and nonce could be separate from an additional connection ID, or the connection ID and the token could be the same -- though this would require much more state to be kept everywhere in order to allow the connection ID to be useful for NAT rebinding and injection defense purposes.
> 
> The main advantage over the simpler approach is that the fuzziness around plausible PSNs and echoes goes away, as does the predictability of association and confirmation values after the initial connection establishment. However, it does not provide loss measurement without additional information, and it places more state and processing requirements on endpoints.
> 
> 
> Either of these mechanisms could be used together with a path-and-endpoint verifiable, on-path and side-path attack resistant stop signal: during connection setup each endpoint generates a random value, and exposes the result of the application of a hash function to that random value as its stop signal proof. To send a stop signal, it reveals the random value as a stop signal verification (this is the essence of PR#20 on QUIC). Any endpoint or on-path device can verify that the hash of the verification is the proof. Of course, devices that don't keep the proof value (or never saw it) can't verify it. The tradeoff here is additional complexity versus additional resistance against path-side injection meddling with state on middleboxes.
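
The proof and verification pair can be sketched in a few lines. SHA-256 and the 16-byte random value are assumptions, since the message names no particular hash function or length; the function names are hypothetical too.

```python
import hashlib
import hmac
import secrets


def new_stop_secret():
    """Return (verification, proof) for one endpoint at connection setup."""
    verification = secrets.token_bytes(16)          # kept private until stop
    proof = hashlib.sha256(verification).digest()   # exposed during setup
    return verification, proof


def verify_stop(proof: bytes, verification: bytes) -> bool:
    """Check that the revealed value hashes to the proof seen at setup."""
    candidate = hashlib.sha256(verification).digest()
    return hmac.compare_digest(candidate, proof)
```

A device that stored `proof` at setup calls `verify_stop` when a stop arrives; a device that never saw the proof has nothing to compare against.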
> 
> 
> Simple packet number and echo signaling for association and confirmation signaling with two-way stop seems to us like a reasonable "minimal functionality set" at the moment.
> 
> Thoughts?
> 
> Cheers,
> 
> Brian and Mirja