[quicwg/base-drafts] QUIC lacks on-path exposure of packet loss (#3189)

Alexandre Ferrieux <notifications@github.com> Mon, 04 November 2019 18:34 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/quic-issues/H7O9P-VSQi3E6xuRR-t8RBf-JV0>

QUIC exposes the Spin Bit to the path in order to help operators debug their
networks. It allows them to locate segments with an abnormal contribution to
latency by dichotomy, since observing the Spin Bit at a given midpoint makes it
possible to compute separately the contribution of each side to the RTT.
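
As a rough illustration of that per-side split, here is a minimal sketch. It assumes an observer that sees both directions of one connection, a trace sorted by capture time, and no loss or reordering around the spin edges. The `Observation` type and the function names are hypothetical, not taken from any QUIC implementation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Observation:
    t_ms: int        # capture time at the observation point, in milliseconds
    to_server: bool  # True for client-to-server packets
    spin: int        # observed Spin Bit value (0 or 1)


def spin_edges(trace: List[Observation], to_server: bool) -> List[int]:
    """Times at which the Spin Bit changes value in one direction."""
    edges, last = [], None
    for obs in trace:
        if obs.to_server != to_server:
            continue
        if last is not None and obs.spin != last:
            edges.append(obs.t_ms)
        last = obs.spin
    return edges


def split_rtt(trace: List[Observation]) -> List[Tuple[int, int]]:
    """Return (server_side_ms, client_side_ms) for each full spin cycle seen."""
    up = spin_edges(trace, to_server=True)
    down = spin_edges(trace, to_server=False)
    samples = []
    for i, t1 in enumerate(up[:-1]):
        t3 = up[i + 1]
        # The edge reflected by the server must return between two
        # consecutive client-to-server edges.
        reflected = [t for t in down if t1 < t < t3]
        if reflected:
            t2 = reflected[0]
            samples.append((t2 - t1, t3 - t2))
    return samples


# Synthetic trace: 60 ms round trip on the server side of the observer,
# 20 ms round trip on the client side.
trace = [
    Observation(0, True, 0), Observation(60, False, 0),
    Observation(80, True, 1), Observation(140, False, 1),
    Observation(160, True, 0), Observation(220, False, 0),
    Observation(240, True, 1),
]
print(split_rtt(trace))  # [(60, 20), (60, 20)]
```

Moving the observation point along the path and repeating the measurement shows which side the excess latency sits on, which is what makes dichotomy possible for latency.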

This is good, but in many cases latency is not the crucial element; packet loss
is the real killer. The reason lies in the predictability of the fault
location:

1. The location of significant latency buildup is most often known in advance: it is the last segment of the access network (capacity bottleneck with a large per-user buffer).
2. Packet loss, by contrast, happens in all segments of the end-to-end path, with low correlation with latency (high-speed interfaces of core routers with limited RAM and large amounts of cross-traffic).

For a network operator, the key feature is thus the ability to do dichotomy on
packet loss. One might imagine that this is doable with simultaneous packet
captures on both ends of a suspected segment, but that does not help further
investigation of other segments, since one doesn't know in which direction to
proceed (upstream or downstream). Dichotomy is impossible.
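
To make the operational goal concrete, here is a minimal sketch of what dichotomy on loss would look like if every observation point could tell whether loss occurred upstream of it. That upstream signal is exactly what is missing today, so `loss_upstream_of` is a hypothetical callback, and the sketch assumes a single lossy segment on an ordered path.

```python
from typing import Callable, Tuple


def locate_lossy_segment(
    n_points: int, loss_upstream_of: Callable[[int], bool]
) -> Tuple[int, int]:
    """Bisect an ordered path of observation points 0..n_points-1.

    Assumes point 0 sees no upstream loss, the last point does, and
    exactly one segment between adjacent points is lossy.
    Returns the pair of adjacent points that bracket the lossy segment.
    """
    lo, hi = 0, n_points - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if loss_upstream_of(mid):
            hi = mid  # loss already visible here: fault is upstream of mid
        else:
            lo = mid  # no loss yet: fault is downstream of mid
    return lo, hi


# Nine observation points, loss injected between points 5 and 6.
probes = []

def probe(i: int) -> bool:
    probes.append(i)
    return i > 5

print(locate_lossy_segment(9, probe))  # (5, 6)
print(probes)                          # [4, 6, 5]
```

With such a signal the faulty segment is found in a logarithmic number of probes. Without it, a capture on one segment gives no hint about which way to move next.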

In this context, it is very frustrating for anybody in network operations to see
QUIC v1 ship with the less important of the on-path measurements, the Spin Bit,
and without an effective way to identify and locate loss. This means several
years of near blindness on network issues and degraded network QoS for QUIC.
Everybody (including content providers and customers) will directly or
indirectly suffer from this.

Early in the discussions around the Spin Bit, two "reserved" bits were
secured "for experimentation" in the first byte of short headers. These bits are
currently unused. We propose to use them for on-path exposure of packet loss in:

https://datatracker.ietf.org/doc/draft-ferrieuxhamchaoui-quic-lossbits/

Note that the efficiency of the method ("QL bits") has been verified in the
field through a full-scale experiment in 4 countries conducted by Orange and
Akamai, as described at IETF 105 in TSVWG and MAPRG:

https://datatracker.ietf.org/meeting/105/materials/slides-105-tsvwg-sessb-32-loss-signaling-for-encrypted-protocols-03

https://datatracker.ietf.org/meeting/105/materials/slides-105-maprg-packet-loss-signaling-for-encrypted-protocols-01

The purpose of this Issue is to raise awareness of the upcoming difficulty of
debugging networks under the current state of the specification, and to
request the inclusion of the QL bits in the transport draft, with the same
status as the Spin Bit: "experimental" in v1, with an anti-ossification
requirement to activate the QL bits on no more than 80% of connections. We
hereby suggest forming a design team to conduct the same analysis of privacy
and ossification impacts as was done for the Spin Bit.
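
One possible way to honour such an activation cap is an independent random draw per connection. The sketch below only illustrates the 80% figure quoted above. The function name and the per-connection granularity are assumptions, not text from the draft.

```python
import random

MAX_ENABLED_FRACTION = 0.8  # upper bound quoted in this issue


def enable_ql_bits(rng: random.Random,
                   fraction: float = MAX_ENABLED_FRACTION) -> bool:
    """Decide once, at connection setup, whether to expose the QL bits."""
    return rng.random() < fraction


rng = random.Random(0)  # seeded only to make the example repeatable
n = 10_000
enabled = sum(enable_ql_bits(rng) for _ in range(n))
print(f"QL bits enabled on {enabled / n:.1%} of {n} simulated connections")
```

Keeping a share of connections with the bits disabled means on-path devices cannot rely on the bits always carrying the signal, which is the anti-ossification intent.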


https://github.com/quicwg/base-drafts/issues/3189