Re: [Plus] Blog post on quic and tou

Brian Trammell <ietf@trammell.ch> Thu, 08 December 2016 09:36 UTC

From: Brian Trammell <ietf@trammell.ch>
In-Reply-To: <CAGD1bZaXto+fq9A806ME1bonZv0639Yu5yazyp_eeeOQemeNzw@mail.gmail.com>
Date: Thu, 08 Dec 2016 10:35:59 +0100
Message-Id: <E416974A-BF2C-4D24-B9CF-1591CAE8D6C2@trammell.ch>
References: <6850cb85-6b26-e5b8-50a9-7565c39b0c28@tik.ee.ethz.ch> <CAGD1bZaXto+fq9A806ME1bonZv0639Yu5yazyp_eeeOQemeNzw@mail.gmail.com>
To: Jana Iyengar <jri@google.com>
X-Mailer: Apple Mail (2.3124)
Archived-At: <https://mailarchive.ietf.org/arch/msg/plus/1S4d0n-M1MdPqkKYky_Bmhn-iLQ>
Cc: plus@ietf.org, Mirja Kühlewind <mirja.kuehlewind@tik.ee.ethz.ch>
Subject: Re: [Plus] Blog post on quic and tou

hi Jana,

> On 07 Dec 2016, at 22:35, Jana Iyengar <jri@google.com> wrote:
> 
> Thanks for forwarding the article. I'll offer some thoughts (and some corrections.)
> 
> There's surely a solid argument to be made about network monitoring, as this blog post makes. Operators' needs are real, and we need to ensure that they are able to reasonably do the things that they need to do. At least for QUIC, exposing a small additional bit of information addresses ~80% of the use cases mentioned in the document: a "largest acked" ack number on all packets (or at least packets that contain acks). I won't design this mechanism on this list,

(...noting that this list exists *precisely* for designing this mechanism ;) but yes, details should happen on the quic@ list...)

> but I'll note that it's a conversation that's happening in several corners in the QUIC wg. It needs to be aired and discussed, and I expect it to happen relatively soon.

For those not familiar with the details of IETF-quic (which, AFAIK from others' implementation reports, diverges somewhat from the version of QUIC deployed by Google right now, and will diverge more during the WG's work): the packet number is already exposed. Together with highest-ack (the "largest acked" number above), this allows (1) one-observation-point split-RTT measurement with an unknown responder delay term, equivalent to TCP; (2) two-observation-point approaches for loss measurement; and (3) one-observation-point approaches for loss estimation, given more information about the dynamics of the particular version of QUIC running, also similar to TCP.
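
As a rough illustration of the one-observation-point split-RTT idea, here is a minimal Python sketch. It is not taken from any QUIC draft or tool: the `Observed` record, the `largest_acked` field name, the "c2s"/"s2c" direction labels, and the use of milliseconds are all assumptions made for the example, and it ignores reordering, retransmission, and multiple flows. Each sample still contains the unknown responder delay term.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Observed:
    """One packet seen at the observation point (hypothetical record)."""
    t: float                             # observation time, milliseconds
    direction: str                       # "c2s" (client to server) or "s2c"
    pn: int                              # exposed packet number
    largest_acked: Optional[int] = None  # assumed exposed field; None if absent


def split_rtt(packets: List[Observed]) -> Dict[str, List[float]]:
    """Return RTT samples for each side of the observation point.

    "server_side": observer -> server -> observer
    "client_side": observer -> client -> observer
    Every sample includes the responder's (unknown) ack delay.
    """
    reverse = {"c2s": "s2c", "s2c": "c2s"}
    side = {"c2s": "server_side", "s2c": "client_side"}
    # Time each not-yet-acked packet number was first seen, per direction.
    pending: Dict[str, Dict[int, float]] = {"c2s": {}, "s2c": {}}
    samples: Dict[str, List[float]] = {"server_side": [], "client_side": []}

    for p in sorted(packets, key=lambda q: q.t):
        pending[p.direction].setdefault(p.pn, p.t)
        if p.largest_acked is None:
            continue
        acked_dir = reverse[p.direction]
        seen_at = pending[acked_dir].get(p.largest_acked)
        if seen_at is not None:
            samples[side[acked_dir]].append(p.t - seen_at)
        # Forget everything this ack covers, so that a repeated
        # largest_acked value does not yield a second sample.
        for pn in [n for n in pending[acked_dir] if n <= p.largest_acked]:
            del pending[acked_dir][pn]
    return samples


# A made-up trace for one flow, as seen at a single observation point.
trace = [
    Observed(0.0, "c2s", 1),
    Observed(30.0, "s2c", 1, largest_acked=1),
    Observed(40.0, "c2s", 2, largest_acked=1),
    Observed(75.0, "s2c", 2, largest_acked=2),
    Observed(80.0, "s2c", 3, largest_acked=2),  # repeated ack, no new sample
]
result = split_rtt(trace)
```

Adding a "server_side" sample to a nearby "client_side" sample approximates the end-to-end RTT plus both responder delays. The observer cannot separate those delays from path delay using these two fields alone.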

I personally think we can do a good deal better than this with epsilon more complexity and overhead, without either constraining QUIC's transport dynamics or requiring measurement devices to know about the details of those dynamics. Need to do a bit more work before I can say how small epsilon is, though.

What is missing is the more detailed information about TCP dynamics mentioned in the post. Much of it is specific to TCP (and to the CC algorithm), so it doesn't make much sense to expose the same information for QUIC, though each of the requirements implicit in the list is worth evaluating separately for its ossification/security/utility tradeoff.

One requirement from that list that seems quite useful, though I don't know how to solve it in the general case or in QUIC specifically: "determine [if] the software on the client or the server is the bottleneck". This is a very common triage task in network operations: does this problem indicate a misconfiguration of my network, or (more cynically) can I demonstrate that it's not my fault and therefore not my problem? Requiring access to one or both endpoint machines to answer that... seems like a question for a future Internet architecture research project.

> The article though looks at real needs and current tools that operators have, and over-generalizes to saying that the "entire header" should be visible. My argument remains that only what is absolutely required should be exposed, and that every bit exposed should be debated.

I would go further (again, this is the philosophical underpinning of PLUS, to the extent that it has one): the design of the header that is exposed unencrypted to the network (which constitutes its "wire image") should be treated as an entirely separate endeavour from the design of the transport protocol machinery.

> This is not a security argument, it's an ossification one. The whole point of ossification is that there are third parties that are unresponsive to changes in allegedly e2e protocols. Middleboxes are reactive. If they see traffic shifting a particular way, they'll go build something in response -- I've seen this happen several times. But, they are not proactive. This creates a serious "deployment impossibility cycle" where deploying a protocol change widely requires it to work through a huge range of middleboxes, but even high-end middleboxes will not change behavior in response until the protocol change is widely deployed.

My (possibly starry-eyed optimistic) hope is that a deliberately designed wire image will create a path of least resistance for middlebox designs to (reactively) follow. A well-designed wire image should be so obvious that the in-network reaction will be the one desired by the designers, even for middleboxes built by people who didn't read the spec.

Cheers,

Brian