Re: [quicwg/base-drafts] QUIC lacks on-path exposure of packet loss (#3189)

Brian Trammell <> Wed, 06 November 2019 20:06 UTC


(For the record, despite the length of the following comment, as the reporter of #632, let me reiterate that I accept the WG consensus on its closure. But this thread caught my attention because it is relevant to my interests.)

I'm supportive of the goal of this work - actionable measurability is a key property for any transport protocol traversing networks that might occasionally break - but it seems to me we're conflating three questions here:

- The requirement to determine and localize loss in a network through purely passive measurement (i.e., without resorting to encapsulation-based methods, for which e.g. the IOAM work in IPPM would be more appropriate; or just pinging the hell out of everything, which is certainly a common current practice if not a best one). 
- The requirement to support this loss measurement via bits in the transport protocol header. 
- The suitability of the particular mechanism defined in to address these requirements. 

To the first question, I managed to convince myself a while ago that loss at the transport layer is overrated as a metric due to the vagaries of congestion control in the pre-ECN world we inhabit: while the lost packets are the actionable-for-the-operator cause of performance issues, it's the reaction of the transport that is interesting to the endpoint users and therefore likely to annoy the customers -- or in other words, if a packet is lost in the woods and there is nobody to NACK it because it was application-irrelevant, did it make a sound? I clearly haven't convinced anyone else, though. There is a lot of common understanding of what loss means as a metric, and the actions taken in response to that loss do correlate with better user experience, so okay, let's stick with the metric we think we know.

To the second question, as a proponent of the spin bit, I'm obviously a fan of this general class of approaches, given that it was not possible to come to consensus that the IETF should define a common wire image for transports carried over UDP, which IMO would have been a preferable architecture.

To the third question, I am delighted to see that there is a proof of concept showing that this approach works. Running code is great. :) 

I would repeat @DavidSchinazi's point, though, that the privacy implications of this particular mechanism would need to be clarified. The same approach as was followed with the spin bit privacy design team -- determining the space of metrics exposed by the signal, then showing that the marginal privacy risk of exposing those metrics compared to the information available in a version of the protocol without the signal was acceptable -- would work here, but I don't see that that analysis has been done. As one of the people driving the effort on the spin bit analysis, I can say it was a little more than a PAM paper's worth of work.

I'm also not convinced it's the best use of the two bits it wants to allocate. The spin bit, you may recall, was defined in the IMC paper we published to explore its properties as a three-bit signal, and the consensus of the working group was that the two bits in the VEC (valid edge counter) could be replaced with better heuristics for bad edge rejection at the measurement analysis point (and shifting the burden to the measurement point, as removing the VEC did, is just good engineering, considering the relative numbers of endpoints and measurement points in the Internet). I don't see any analysis done of how well the proposed signal would work, say, if one dropped Q and just used L (or vice versa), or the impact of alternate semantics for L. I haven't done the work (and sadly, my cycles for doing exploratory measurement design ain't what they used to be, so I probably won't do the work), but I suspect a slight redefinition of L to bring its semantics closer to CWR in TCP+ECN would provide comparable metric fidelity to the proposal here at the cost of slightly more measurement-point complexity and one fewer header bit.
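
To make "bad edge rejection at the measurement point" concrete, here is a minimal Python sketch of a passive observer that turns spin-bit transitions into RTT samples. It is an illustration only, not the heuristic from the paper or from any deployed tool: the function name `spin_rtt_samples`, the `min_fraction` parameter, and the rule "reject an edge that would yield a sample much shorter than the last accepted one" are all assumptions chosen for the example.

```python
from typing import Iterable, List, Optional, Tuple


def spin_rtt_samples(
    observations: Iterable[Tuple[float, int]],
    min_fraction: float = 0.1,
) -> List[float]:
    """Derive RTT samples from (timestamp_seconds, spin_bit) pairs observed
    in one direction of a single flow.

    An edge is a change of the spin bit between consecutive packets; the time
    between two accepted edges is one RTT sample.

    Hypothetical heuristic: an edge whose interval since the last accepted
    edge is shorter than min_fraction * (last accepted sample) is treated as
    a reordering artifact and ignored.
    """
    samples: List[float] = []
    last_bit: Optional[int] = None
    last_edge_time: Optional[float] = None
    last_sample: Optional[float] = None

    for timestamp, bit in observations:
        if last_bit is not None and bit != last_bit:
            if last_edge_time is None:
                # First edge seen: nothing to measure against yet.
                last_edge_time = timestamp
            else:
                interval = timestamp - last_edge_time
                too_short = (
                    last_sample is not None
                    and interval < min_fraction * last_sample
                )
                if not too_short:
                    samples.append(interval)
                    last_sample = interval
                    last_edge_time = timestamp
                # A rejected edge leaves last_edge_time untouched, so the
                # next genuine edge is still measured from the last good one.
        # Always track the most recent bit value, even across rejected edges.
        last_bit = bit

    return samples
```

For example, a flow with a 100 ms RTT in which one late packet briefly flips the observed bit back and forth yields three samples of roughly 0.1 s with the heuristic enabled, and two additional spurious millisecond-scale samples with `min_fraction=0.0`. The point is only that this filtering costs the measurement point a few lines of state per flow, rather than costing every endpoint header bits.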

I'm a little skeptical that we can come to convergence on the privacy-implications question and the alternate-signal-fitness question in a timeframe that is reasonable for the WG to ship version 1 of the protocol. From my own personal experience, these two questions took about a year of calendar time to answer satisfactorily for the spin bit.
