Re: Measurement bit(s) or not
"Brian Trammell (IETF)" <ietf@trammell.ch> Mon, 12 February 2018 12:35 UTC
From: "Brian Trammell (IETF)" <ietf@trammell.ch>
Message-Id: <5E9E3102-2F45-46D5-A5C2-7D63085F88F5@trammell.ch>
Content-Type: multipart/signed; boundary="Apple-Mail=_B57F3B4C-0CD6-4D4C-A114-F3B5ED1F4925"; protocol="application/pgp-signature"; micalg="pgp-sha512"
Mime-Version: 1.0 (Mac OS X Mail 10.3 \(3273\))
Subject: Re: Measurement bit(s) or not
Date: Mon, 12 Feb 2018 13:35:02 +0100
In-Reply-To: <27669_1518436598_5A8180F6_27669_174_1_5A8180F8.6020705@orange.com>
Cc: "quic@ietf.org" <quic@ietf.org>
To: Alexandre Ferrieux <alexandre.ferrieux@orange.com>
References: <1817_1518284090_5A7F2D3A_1817_79_1_5A7F2D3E.4050806@orange.com> <aa7a56d01f0a41fe9ad0fd9e61c54c50@usma1ex-dag1mb5.msg.corp.akamai.com> <CAN1APddOWZRF6FxiEcJ4MbOpMwxqHm9=LbMB92pVkdUJNMuMyQ@mail.gmail.com> <CAN1APdcTH=oHdf=wixJZXOCCXcaYKR1ZkJQLDpndRdehuKfvBA@mail.gmail.com> <19F415EA-DC06-4FEE-8AFA-8A6EBEBB9AFA@trammell.ch> <27669_1518436598_5A8180F6_27669_174_1_5A8180F8.6020705@orange.com>
X-Mailer: Apple Mail (2.3273)
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/IPh27PwIxYxVY2DlyCg1EnZInME>
hi Alexandre,

> On 12 Feb 2018, at 12:56, alexandre.ferrieux@orange.com wrote:
>
> Hi Brian,
>
> We're clearly in violent agreement about both the potential to make intelligent
> use of this measurement nibble, and the desirability of *some* troubleshooting
> helper. Thanks for proposing to include this discussion in the spin bit table.
>
> As it turns out, we do have concrete illustrations to show there, on real
> networks and long distances. But beyond that, I'd like to "probe" this group for
> a possible veto before digging further: while I have no doubt about an eventual
> consensus among us "troubleshooters", I'm more worried by the other side of the
> board: people primarily concerned about ossification and linkability, to whom
> network debugging is at best a secondary goal.
>
> More precisely, as you accurately described, the position of the cursor of both
> tradeoffs may display a kind of threshold for acceptability by them :
>
> - endpoint vs midpoint complexity: if the tool is too easy on the midpoint,
> active Murphies will come in ; a kind of entry barrier should be set -- how high ?
>
> - fidelity: the coarser, or the more delayed, the better, since it precludes any
> real-time feedback loop like active Murphies do ; what is the minimum
> degradation that is needed ?
>

While our experience with TCP here provides ample, loud warning about the dangers of letting the middle know too much, IMO these worries are a little misplaced with QUIC. Two important things to note:

(1) A middlebox that speaks TCP can modify the header without breaking the connection, and it knows which ACK belongs with which packet. It has a wide variety of actions available to it: it can delay or drop packets; coalesce, delay, or even falsify ACKs; manipulate seq/ack and timestamps (though I don't know of any that do the latter), and so on. A middlebox that understands QUIC can only delay or drop a packet.

(2) Nobody will buy (or build and deploy) a middlebox that can't even pretend to solve a problem. If we've designed QUIC such that selective drop or delay of packets based only on path-observable latency and loss will improve performance for anyone, we have done something horribly wrong.

In general, the cost of deriving a measurement from a signal should be paid by the user of the measurement, not the endpoint generating the signal (principle 3 from Principles for Measurability in Protocol Design (https://arxiv.org/pdf/1612.02902.pdf) if you're playing along at home :). This isn't meant to make it harder to mess something up, rather to make it so easy to implement the signal at the endpoint that doing so isn't seen as a negative tradeoff.

Fidelity is a hard one. Some troubleshooting tasks turn on the pattern of losses, which means you really want to know *which* packet was lost. Again, knowing which packet was lost doesn't give a QUIC-understanding measurement box as much freedom to get things wrong as the equivalent TCP box.

Cheers,

Brian

>
>
> On 12/02/2018 10:20, Brian Trammell (IETF) wrote:
>> hi Mikkel, Igor, Alexandre, all,
>>
>> Engineering is fun, but let's step back a bit. :)
>>
>> It looks like we're exploring a space of proposals that have different
>> tradeoffs for the patterns of loss and reordering they can easily make
>> visible, tradeoffs for sender (endpoint) versus observer (midpoint)
>> complexity, and tradeoffs for fidelity versus overhead.
>>
>> In any case, it seems like it is possible to design a signal that would be a
>> vast improvement (from the measurement utility standpoint) over no signal and
>> no discernible pattern in the packet number that will fit in bits scavenged
>> from the Type field of the short header, i.e., the bandwidth overhead will be
>> *zero*, because otherwise in an encrypted-PN world we just have to grease
>> those bits anyway.
>>
>> Back to Alexandre's question:
>>
>> Do we want to do this?
>>
>> Rephrased: Is the passive measurability of loss, reordering (and, if we
>> consider the spin bit as one of the measurement bits, latency) of QUIC
>> important to us, or do we decide we can live with the negative pressure a
>> complete loss of visibility and a vast increase in diagnostic complexity
>> will place on deployment?
>>
>> Note, of course, that all the proposals we have so far represent a decrease
>> in visibility and an increase in complexity of measurement compared to
>> passive measurement of TCP. New tools will have to be developed. But the loss
>> of visibility is minimal compared to blackout, and the deployability and
>> feasibility of all of these is far, far better than an SSLKEYLOGFILE-based
>> debugging approach, especially in the interdomain case.
>>
>> I've heard at least one dismissal of this whole space as being too abstract
>> to take seriously. (I'm not concerned, but maybe I've been staring at network
>> measurement both passive and active for too long to know what's intuitive
>> anymore.) Let me then suggest a way forward:
>>
>> I've announced a table at the London hackathon for "Transport Measurability"
>> (see https://trac.ietf.org/trac/ietf/meeting/wiki/101hackathon), which we
>> intend to set up in the vicinity of QUIC. This was originally intended as the
>> "Spin Bit" table, and we (from ETH) will be there working on scalable,
>> open-source passive measurement tools both for the spin bit as well as for
>> the current TCP TSOPT and SEQ/ACK methodologies (as a basis of comparison,
>> mainly; at least in the case of the spin bit we so far believe the explicit
>> signal to have superior usability compared to current TCP measurement). I
>> suggest we expand the scope of the table to hack on various signals for loss and
>> reordering, and to compare their complexity and fidelity against the loss and
>> reordering patterns we want visibility into.
>> One output of this work could be
>> a (smaller) set of suggestion(s) for which signal(s) to add, so that those
>> who want to have concrete proposals to evaluate can do so.
>>
>> Cheers,
>>
>> Brian
>>
>> [...]
>
>>>>>
>>>>> -----Original Message-----
>>>>> From: alexandre.ferrieux@orange.com [alexandre.ferrieux@orange.com]
>>>>> Received: Saturday, 10 Feb 2018, 12:34PM
>>>>> To: quic@ietf.org [quic@ietf.org]
>>>>> Subject: Measurement bit(s) or not
>>>>>
>>>>> On 07/02/2018 14:34, Brian Trammell (IETF) wrote:
>>>>>> hi Jana,
>>>>>>
>>>>>>> 3. Some sequencing information -- a few bits of the packet number
>>>>>>> perhaps -- should be revealed (for monitoring. Number of bits
>>>>>>> TBD.)
>>>>>>
>>>>>> This is the crux of the argument. On one side we have the risk of
>>>>>> misuse and ossification (well, not ossification -- these bits are
>>>>>> *meant* for the path -- rather the risk that we'll figure out later
>>>>>> that we specified the wrong thing), on the other side we have the
>>>>>> loss of visibility into how QUIC traffic interacts with the network
>>>>>> as compared to TCP, with a side question of whether or not this
>>>>>> visibility is really the transport layer's problem despite the
>>>>>> evolution of the practice of diagnostics and troubleshooting using TCP
>>>>>> information.
>>>>>>
>>>>>> If we can come to agreement on this question, everything else falls
>>>>>> into place. I have my arguments here, but as you said, this subthread
>>>>>> is not the place for them. :)
>>>>>
>>>>> The crux indeed. So what about settling it first ?
>>>>>
>>>>> With the troubleshooting hat, I can only stress the need for
>>>>> measurement bits, for the benefit of everybody, since s**t happens,
>>>>> networks are imperfect, and nifty encapsulations-with-seqnum will
>>>>> simply not be where you need them.
>>>>>
>>>>> Now to the exact nature of these measurement bits:
>>>>>
>>>>> Thanks to the detailed exchanges on this thread, it is by now clear
>>>>> that a simple gapless counter, even nonzero-based and XORed, is not
>>>>> acceptable. The 4-bit SSN comes pretty close but is not enough when
>>>>> things go really wrong (and they will - and that's where we need the
>>>>> tool).
>>>>>
>>>>> Then Kazuho's square signal and Mikkel's Pi (or any other consensual
>>>>> self-synchronizing sequence) ramification came up. They are both
>>>>> appealing for their elegance and low complexity on QUIC endpoints.
>>>>> Beyond their quirks acknowledged here, here are a few considerations
>>>>> for troubleshooting:
>>>>>
>>>>> (1) Since reordering is less of a concern to QUIC than to TCP, it
>>>>> becomes a secondary goal. This is nice, because the square doesn't see
>>>>> it, and the self-synchronizing sequence will only tolerate a mild one,
>>>>> and never see its detail like cycle length etc.
>>>>>
>>>>> (2) There's of course a huge difference between them in complexity for
>>>>> the midpoint: square is trivial, Pi is hefty.
>>>>>
>>>>> Given these, a benevolent, troubleshooting-minded passive midpoint will
>>>>> clearly vote for the square. Now the obvious question is: is this
>>>>> acceptable, or deemed too easy for a Murphy, Inc. active middlebox to
>>>>> see upstream losses and benevolently wreak havoc by delaying packets ?
>>>>>
>
> This message and its attachments may contain confidential or privileged information that may be protected by law;
> they should not be distributed, used or copied without authorisation.
> If you have received this email in error, please notify the sender and delete this message and its attachments.
> As emails may be altered, Orange is not liable for messages that have been modified, changed or falsified.
> Thank you.
>
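
For readers following along, the point above about the cost of a measurement falling on its user rather than on the endpoint can be illustrated with the spin bit. The endpoint only sets one bit per packet, while the observer does the timing work. What follows is a minimal sketch under stated assumptions, not any tool from the hackathon table: the function name `spin_rtt_samples` and the input format (timestamp, spin value) are hypothetical, the packets are assumed to be seen in one direction of a single flow, and loss, reordering, and idle periods are ignored.

```python
from typing import Iterable, List, Optional, Tuple


def spin_rtt_samples(observations: Iterable[Tuple[float, int]]) -> List[float]:
    """Estimate RTTs from spin-bit edges seen in one direction of one flow.

    `observations` yields (timestamp, spin_bit) pairs in arrival order.
    The spin value flips roughly once per round trip, so the time between
    two consecutive flips approximates one end-to-end RTT.
    """
    samples: List[float] = []
    last_bit: Optional[int] = None    # spin value of the previous packet
    last_edge: Optional[float] = None  # timestamp of the previous flip

    for timestamp, bit in observations:
        if last_bit is not None and bit != last_bit:
            # An edge: the spin value changed since the previous packet.
            if last_edge is not None:
                samples.append(timestamp - last_edge)
            last_edge = timestamp
        last_bit = bit
    return samples


if __name__ == "__main__":
    # Hypothetical trace, timestamps in milliseconds.
    trace = [(0, 0), (10, 0), (50, 1), (60, 1), (100, 0), (110, 0), (150, 1)]
    print(spin_rtt_samples(trace))
```

The observer keeps only two values per flow, which is the kind of midpoint complexity the thread calls trivial. A real tool would also need to reject edges caused by reordering or by long application pauses.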
- Measurement bit(s) or not alexandre.ferrieux
- Re: Measurement bit(s) or not Gorry (erg)
- RE: Measurement bit(s) or not Lubashev, Igor
- RE: Measurement bit(s) or not Mikkel Fahnøe Jørgensen
- RE: Measurement bit(s) or not Mikkel Fahnøe Jørgensen
- Re: Measurement bit(s) or not Brian Trammell (IETF)
- Re: Measurement bit(s) or not Mikkel Fahnøe Jørgensen
- Re: Measurement bit(s) or not alexandre.ferrieux
- Re: Measurement bit(s) or not Brian Trammell (IETF)
- Re: Measurement bit(s) or not Gorry (erg)