Re: Preparing for discussion on what to do about the multipath extension milestone

Jana Iyengar <jri.ietf@gmail.com> Tue, 06 October 2020 02:37 UTC

From: Jana Iyengar <jri.ietf@gmail.com>
Date: Mon, 05 Oct 2020 19:36:49 -0700
Message-ID: <CACpbDcfz4R-r6=PzS8MwxrZbnMCxFs8giKHY0kFPxZ4LkgNSbA@mail.gmail.com>
Subject: Re: Preparing for discussion on what to do about the multipath extension milestone
To: Olivier Bonaventure <Olivier.Bonaventure@uclouvain.be>
Cc: Matt Joras <matt.joras@gmail.com>, Christoph Paasch <cpaasch@apple.com>, Ian Swett <ianswett=40google.com@dmarc.ietf.org>, Spencer Dawkins at IETF <spencerdawkins.ietf@gmail.com>, QUIC WG <quic@ietf.org>, Lucas Pardue <lucaspardue.24.7@gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/43gVgHCihq8iJ6fLG_fQ3XZuYgo>

After going through this thread, I'm still finding the framing of the
conversation to be: How do we get "multipath" into QUIC? Perhaps I'm not
being very charitable, but the title of this email thread isn't helping.

This framing is problematic and leads to the false scarcity of options we
see in this discussion, but I am still stuck at the premise. Let me explain.

Transport multipath is not a _problem_, it is a _class of solutions_ that
can be implemented in a transport. Multipath is simply the use of multiple
paths. The manner in which multiple paths are used depends on the problem
you are trying to solve.

For example, if you want to solve the handoff problem for reliability,
connection migration is a transport-level multipath solution for it. Note
that there are other ways to solve this problem. For example, if the
application is RESTful and simply fetching data (e.g. YouTube), then you
can implement this in the application quite easily. However, for other
applications, such as live transcription, where context is super important
and transport state is tied to application state, maintaining connection
state is necessary. We agreed that the problem was important -- I believe
that was in Melbourne -- and we then went about solving it with connection
migration with which, after many iterations, we now have a fragile peace.

So echoing what others have been saying on this thread: What is the problem
we are trying to solve? Do we agree on the importance of it? Is this
bandwidth aggregation across WiFi/Cellular (is that an important enough use
case?)? Is it latency reduction for HTTP? Is it increased reliability? In
what ways is it more than connection migration?

I don't want a laundry list of things that "multipath" can do, but an
analysis of the most important problem and why it needs to be solved. This
is where I'd like to engage, not around adoption of a solution, let alone
adoption of a draft.

I've still not heard folks rallying around one problem that we need to
solve right now, and I'm not hearing people saying that we need to do
this before we gain any experience with the existing multipath capabilities
of QUICv1. I definitely believe that we ought to wait for experience with
QUICv1.

Note that the problems MPTCP has solved for TCP can be quite different from
the problems it will need to solve with QUIC. For one, I agree with Lucas
that not talking about streams is like not talking about ordered delivery
with QUIC. Streams are QUIC's primary API model, and we need to understand
how an endpoint does streams across paths. This isn't an MPTCP problem, but
it is one we will need to solve when talking about QUIC. And I like to
remind myself that we spend time in working groups on engineering and
bit-coloring, and I suspect that there's plenty of that in just this one
single problem.

It's a fallacy to think that "this won't take too much time". While that is
never a good answer to why we should work on anything, it rings especially
hollow in the context of our experience building connection migration in
QUIC: it took us a couple of years and we kept stumbling and fixing things,
especially security vulnerabilities.

Because I like to hammer the point in: let's motivate the problem first and
agree on the shape of the problem and its importance. And let's see if we
agree to prioritize it/them over the other things that we are already
planning to work on (datagrams, version negotiation, deploying QUICv1 and
digesting experience).

And I'll note this because the point has been raised. I agree with
Christian that _if_ we were to build "general-purpose multipath", I would be
thinking of a far simpler (and potentially incremental) design -- maybe
using a single PN space...

... but trying to solve the protocol problem is putting the cart before the
horse. We are all good at protocol design, and we are all great at coloring
each bit just the right shade, even though we are terrible at agreeing on
what the right shade is. Before we dive into coloring the bits, let's
please adequately motivate spending engineering time and resources doing it.

- jana

On Mon, Oct 5, 2020 at 6:24 AM Olivier Bonaventure <
Olivier.Bonaventure@uclouvain.be> wrote:

> Matt,
>
>
> >
> >     Let me try with a simple example on a moving smartphone. The
> >     application will send small amounts of data and receive variable
> >     amounts of data (depending on the type of requests).
> >
> > I want to start by saying this is a real usecase, and a problem we see
> > very obviously through quality of experience outliers for people using
> > our products.
> >
> >
> >     We create a sending and a receiving uniflow on both the Wi-Fi and the
> >     cellular interface. The smartphone has two sending uniflows and the
> >     server as well.
> >
> >     To send a short request, the client duplicates it over its two
> >     sending uniflows since it does not know which of the two uniflows
> >     will be the best.
> >
> >     To return the response, the server could use the same scheduler if
> >     the response is short. However, if the response is long, this is not
> >     very efficient since data is sent over two paths. It could then use
> >     both paths to send the data and get the lowest delay to deliver the
> >     response. This could be modulated by policies if the user pays on a
> >     per volume basis over one path and not over the other.
> >
> > This somewhat hand-waves away the scheduling problem. Most Internet
> > traffic flows in the direction of server -> client, where typical mobile
> > clients are obviously the ones with potentially multiple interfaces to
> > the Internet. The client also has the most information about the likely
> > quality of the first radio hop into the access network (e.g. signal
> > quality). The client also _may_ know about things like data pricing.
>
> Agreed
>
> > However, the server is the one which is responsible for scheduling most
> > data across these paths. To make good, proactive (rather than reactive)
> > scheduling decisions the server needs to be fed this information as
> > input to its scheduler. This seems a difficult thing to achieve in
> > practice, and without it I wonder whether the complexity of "full"
> > multipath will be worth it until we solve the signalling problem.
>
> There are several techniques that allow the server to learn about the
> different performance of the paths with MPTCP. Those can be applied to
> MPQUIC as well.
>
> If the application uses short requests and short responses, then a
> simple heuristic for the server is to send the response on the path
> where the last packet (data or ack) has been received. The reception of
> a packet is a good indication that a link works.
>
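
The short-response heuristic described above (reply on the path where the
last packet arrived) can be sketched as follows. This is a minimal
illustration, not code from any QUIC or MPTCP stack: `PathState`,
`LastReceivedSelector`, and the path names are hypothetical, and timestamps
are assumed to come from a monotonic clock supplied by the caller.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class PathState:
    """Hypothetical per-path bookkeeping; not part of any real QUIC API."""
    name: str
    last_rx_time: Optional[float] = None  # seconds; None until a packet arrives


class LastReceivedSelector:
    """Choose the path that most recently delivered a packet."""

    def __init__(self) -> None:
        self.paths: Dict[str, PathState] = {}

    def on_packet_received(self, path: str, now: float) -> None:
        # Any packet (data or ack) counts as evidence that the path works.
        self.paths.setdefault(path, PathState(path)).last_rx_time = now

    def pick_path(self) -> Optional[str]:
        heard = [p for p in self.paths.values() if p.last_rx_time is not None]
        if not heard:
            return None  # nothing received yet, so there is no evidence to use
        return max(heard, key=lambda p: p.last_rx_time).name
```

The selector only tracks liveness evidence; it says nothing about capacity,
which is why the quoted text treats longer responses separately.
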
> If the application uses longer responses, then the congestion control
> used by the server over the two paths will enable it to easily find the
> best way to spread the load. If one path has a longer rtt or losses,
> then its congestion window will be slower than the other one and packets
> will naturally flow on the best path, but not only on this path. This
> does not require a specific scheduler and would work better than a
> strict weighted-round-robin scheduler where the client would indicate
> that it wants to receive 2/5 of the bytes over WiFi and 3/5 over cellular.
>
> > What Ian is suggesting above, I think, is essentially an Active/Passive
> > extension to the existing connection migration mechanisms we have today.
> > What are your thoughts on this as an initial direction? It would allow
> > operators a way to solve this particular problem with only mild
> > modifications to the core protocol. Said another way, I think in theory
> > an omniscient packet scheduler can make very intelligent decisions which
> > would definitely benefit application quality. In practice though I have
> > concerns about how effective these schedulers will be versus the naive
> > approach which could be thought of as "Active/Passive" or "Failover",
> > eschewing bandwidth aggregation entirely. I'd also like to echo what
> > Kazuho said, which is that "Multipath" can mean many things, and I'd
> > prefer we narrow down the problem we want to solve in the WG, which will
> > drive our design direction.
>
> When coupled with congestion control, a simple packet scheduler such as
> lowest-rtt first (if paths have the same cost) or priority (for the
> lowest cost path) works well with MPTCP. The same would apply for MPQUIC
>
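
A rough sketch of the scheduler described above, coupled with per-path
congestion control: among paths whose congestion window has room, prefer the
lowest cost, and break ties by lowest RTT. The `Path` fields, the integer
`cost`, and `pick_path` are assumptions made for illustration and are not
taken from MPTCP or any MPQUIC draft.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Path:
    name: str
    srtt_ms: float        # smoothed RTT estimate for this path
    cwnd_bytes: int       # congestion window maintained per path
    in_flight_bytes: int  # bytes sent on this path and not yet acknowledged
    cost: int = 0         # hypothetical policy input; lower is cheaper

    def has_room(self, size: int) -> bool:
        return self.in_flight_bytes + size <= self.cwnd_bytes


def pick_path(paths: List[Path], size: int) -> Optional[Path]:
    """Pick the cheapest path with window room; ties go to the lowest RTT."""
    candidates = [p for p in paths if p.has_room(size)]
    if not candidates:
        return None  # every path is congestion-limited; wait for acks
    return min(candidates, key=lambda p: (p.cost, p.srtt_ms))
```

With equal costs this behaves as lowest-RTT-first; with unequal costs it acts
as a priority scheduler. In both cases traffic spills to the other path only
when the preferred path's congestion window is full.
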
> >
> >     Another example is the hybrid access network scenario with a DSL and
> >     an LTE path. There, the objective is to send data over the LTE path
> >     only when the DSL is full.
> >
> >     In this case, the solution would differ. The client would first
> >     create sending and receiving uniflows over the DSL path. It then
> >     monitors the usage of this path. As long as the DSL is not fully used
> >     for some period of time (e.g. one or a few seconds), all data flows
> >     over the DSL path. Once the DSL path is saturated, the client creates
> >     a receiving uniflow (and possibly a sending one if the DSL upstream
> >     is saturated) over the LTE path. The second path can be used to
> >     offload traffic. In practice, the client and the server use a
> >     priority scheduler to always prefer the DSL over the LTE path, see
> >
> >     https://inl.info.ucl.ac.be/publications/increasing-broadband-reach-hybrid-access-networks.html
> >
> > For these usecases, are you imagining that they would largely be used
> > internally for Internet operators? I always struggle with the hybrid
> > access examples, as they seem to assume a lot of knowledge about the
> > underlying networks that typical endpoints can't simply intuit from the
> > ether. The linked paper seems to suggest MPTCP as a proxy solution,
>
> Yes, that's deployed with MPTCP proxies running on home routers.
>
> > which I gather is largely the usecase things like ATSSS have for MPQUIC.
> > However, as a mobile application how do I know the DSL link is "full"
> > and thus that I should create a uniflow over the LTE path? The same
> > question applies to the server. It would be awesome if clients and
> > servers had more active information about the underlying network's state
> > rather than being reactive, but beyond things like ECN this seems
> > difficult to achieve in practice. If the Hybrid Access Network usecase
> > of multipath presupposes a deployment in an operator's network then I
> > would argue it is somewhat antithetical to the goals of QUIC, which
> > deliberately puts more control at the endpoints.
>
> On endpoints, you can get the same result by having a target bandwidth
> on the Wi-Fi interface. For example, a smartphone application could
> assume that if the current Wi-Fi bandwidth is below x Mbps then it
> should enable the LTE interface to boost a long download. Similarly, a
> videostreaming application could have a target that corresponds to the
> default quality chosen by the user and enable the LTE when it cannot
> receive video at the expected quality. The same applies for a radio
> streaming application.
>
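
The endpoint-side target-bandwidth rule described above could look roughly
like this. The function name, the sample list, and the `min_samples` guard
(requiring several consecutive low measurements before enabling LTE, to avoid
flapping) are assumptions added for illustration; the quoted text only states
the "below x Mbps" condition.

```python
from typing import Sequence


def should_enable_lte(wifi_mbps_samples: Sequence[float],
                      target_mbps: float,
                      min_samples: int = 3) -> bool:
    """Return True when recent Wi-Fi throughput stays below the target."""
    recent = wifi_mbps_samples[-min_samples:]
    if len(recent) < min_samples:
        return False  # not enough measurements yet to decide
    # Every one of the most recent samples must fall short of the target.
    return all(sample < target_mbps for sample in recent)
```

For a video or radio streaming application, `target_mbps` would correspond to
the bitrate of the default quality chosen by the user.
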
>
> Olivier
>
>