Re: My BoF report: multipath

Christian Huitema <huitema@huitema.net> Fri, 23 October 2020 05:08 UTC

To: Spencer Dawkins at IETF <spencerdawkins.ietf@gmail.com>, Martin Thomson <mt@lowentropy.net>
Cc: IETF QUIC WG <quic@ietf.org>
References: <d84c82b1-fa67-4676-9ce2-d2a53d81b5f7@www.fastmail.com> <CAKKJt-d7iQG-5BsTjR+xKLNt_Ru+h_XpDuaO3JgtAx7+Z3_S0A@mail.gmail.com>
From: Christian Huitema <huitema@huitema.net>
Subject: Re: My BoF report: multipath
Message-ID: <f876d34e-0dbd-be85-62f7-7252328454a8@huitema.net>
Date: Thu, 22 Oct 2020 22:08:02 -0700
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/xjVpccBKceuIRz-SWJLWZGFYGEw>

On 10/22/2020 8:17 PM, Spencer Dawkins at IETF wrote:
> Hi, Martin,
>
> Thanks for this. I have a question and a couple of comments.
>
> On Thu, Oct 22, 2020, 19:06 Martin Thomson <mt@lowentropy.net
> <mailto:mt@lowentropy.net>> wrote:
>
>     (I put a variation of this comment in the meeting and in slack,
>     but I wanted to expand on it some.  Sorry, but this got long. 
>     Four hours is not enough sleep.)
>
>     Multipath seems pretty clearly useful for certain cases.  I think
>     that the meeting today answered at least the first two of the BoF
>     questions I posed earlier on the list.  So if we are to regard
>     this as a BoF, it met its goals (thanks chairs).  There is some
>     uncertainty about the first question about having a clear problem
>     to solve, but I am of the view that we could muddle through with
>     some combination of either ignoring our differences or working
>     around them.  The third question regarding constituency is where I
>     didn't find a satisfactory answer.  I want to be clear though,
>     this is no fault of the proponents.  At the current time, I am
>     convinced that formally starting work on multipath would be unwise.
>
>
> I've been assuming that targeting Experimental would allow people who
> care about multipath to work on it as part of the QUIC community
> without derailing needed standards-track work, rather than working in
> isolation or in a splinter group. At a minimum, the chairs could give
> priority to needed standards-track work when that's required to make
> progress.
>
> Does that make sense?

As Martin writes later in his report, anyone can use the extensibility
mechanisms of QUIC and design a multipath extension. I would expect to
see several of those. If applications really demand that, then some of
those extensions will be deployed, experience will be gained, etc. At
that point it might be time to draft a standard proposal based on
these prototype deployments.

>
>     Multipath aims to improve performance either through latency,
>     robustness, or throughput.  Application awareness and involvement
>     in scheduling seemed to be the key factor that enables finding the
>     optimal usage pattern or scheduling algorithm that allows multipath
>     to deliver on those goals.  Applications and users are in the best
>     place to balance goals against other factors like cost or whatever
>     else matters most.  (For reference, I recall the same point being
>     made by Roberto and Christian most clearly, but several others
>     made the same point.)  Christoph did a good job of showing how
>     this applies to very specific use cases, and I thought I saw that
>     in the Alibaba presentation also, but we didn't quite get enough
>     time to get the necessary detail in either presentation.  One
>     potential advantage in this regard is that QUIC implementations
>     are often closer to applications, so they might be in a good
>     position to integrate better.
>
>
> One of the things I came away with today was an appreciation for at
> least two kinds of applications - the kind that can do better if they
> handle multiple paths themselves because the details of the
> application matters so much, and the kind that don't care as much
> about - if you're using multiple paths simultaneously just to make use
> of all your bandwidth, that's a lot easier to delegate to a general
> purpose transport mechanism.

There are very few applications that "use multiple paths simultaneously
just to make use of all your bandwidth". Christoph told us, for example,
how Siri and Apple Music use very different strategies, one to minimize
latency in a low bandwidth application, the other to minimize use of the
expensive cellular link unless that's required to avoid stalls.
Multipath can easily lead to "win some, lose some" trade-offs, and in
particular trade-offs between bandwidth and latency, or bandwidth and
cost of operation.
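
To make the contrast concrete, here is a minimal sketch of two such
path-selection policies. It is illustrative only: the Path fields, the
function names, and the 2-second stall threshold are assumptions made
up for this example, not anything presented at the meeting or taken
from a real QUIC stack.

```python
from dataclasses import dataclass


@dataclass
class Path:
    name: str
    rtt_ms: float   # hypothetical smoothed round-trip time estimate
    metered: bool   # True for a costly link such as cellular


def pick_lowest_latency(paths):
    """Latency-first policy: always use the path with the smallest RTT."""
    return min(paths, key=lambda p: p.rtt_ms)


def pick_cheapest_unless_stalling(paths, buffer_s, stall_threshold_s=2.0):
    """Cost-first policy: stay on unmetered paths while the playback
    buffer is healthy; consider metered paths only when a stall is near."""
    unmetered = [p for p in paths if not p.metered]
    if unmetered and buffer_s >= stall_threshold_s:
        return min(unmetered, key=lambda p: p.rtt_ms)
    # Buffer is nearly empty (or no unmetered path exists): use any path.
    return min(paths, key=lambda p: p.rtt_ms)


# Assumed example values: Wi-Fi is free but slower, cellular is fast but metered.
paths = [Path("wifi", 80.0, False), Path("cellular", 35.0, True)]

latency_choice = pick_lowest_latency(paths)
healthy_choice = pick_cheapest_unless_stalling(paths, buffer_s=10.0)
stalling_choice = pick_cheapest_unless_stalling(paths, buffer_s=0.5)
```

With the same two paths, the two policies disagree as long as the
buffer is healthy, which is why a single generic scheduler is unlikely
to suit both kinds of application.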

>
>     However, many of the cases that were presented were exactly the
>     sorts of opaque intermediation that is almost the antithesis of
>     that ideal.  Similarly, David's assertion that multipath is
>     orthogonal to MASQUE is reliant on the assumption that application
>     involvement is not that important.  In these cases, it's not clear
>     that using multipath is strictly good. 
>
>     I should unpack that a little.  For those people who are making
>     scheduling decisions outside of the endpoint (possible examples
>     being the satellite case and the 3GPP case), it's not clear that
>     this is anything endpoints can prevent.  An endpoint probably
>     can't stop a network provider from using ECMP either.  Similarly,
>     it is not clear how an application endpoint could be aware of
>     these decisions at a level that would allow them to understand and
>     adapt to this treatment.  The result is that these cases have a
>     far more ambiguous value proposition.  Improvements come with
>     trade-offs: for instance, the application might get better
>     throughput, but it comes at a cost to latency.  So I conclude that
>     while these intermediary-based designs might provide an aggregate
>     gain, they will probably not realize the full performance gains
>     that come from end-to-end awareness and control.
>
>     For IETF insiders, see also the BANANA or LOOPS BoFs which were
>     strictly network-based analogues of these.  Many of the same
>     concerns that caused those BoFs to fail apply to those use cases.
>
>     Maybe we accept the application of the protocol to these
>     questionable ends as acceptable collateral if we are able to
>     deploy at the endpoints.  Maybe we allow intermediaries to seek
>     marginal improvements, but try to ensure that we have a clear path
>     to deploying something better in the long term.  But there is a
>     risk that deployment in the network could interact poorly with
>     more-ideal end-to-end solutions and even prevent those deployments.
>
>
> ISTM that this is an extremely helpful observation. The presentation I
> gave talked about short-term synergy (can't do multipath QUIC without
> a QUIC stack, right?), but you're raising a really important question
> about opt-in and bypass for intermediaries, that people who care about
> intermediaries should be thinking about. 
>
> At a minimum, transparent interception for QUIC intermediaries will be
> a lot harder than it was for TCP intermediaries ...

One of the priorities in QUIC development was to restore end-to-end
sanity by getting rid of middle-box interference. ATSSS appears to be
yet another design of middle boxes, so you can imagine that there is
very little enthusiasm for that scenario.

>
>     These are systems-level questions that are large in scope and
>     subtle in their effect.  I think that it will require considerable
>     energy to resolve them.  Or, as seems more likely in my
>     experience, it will take more time and effort to design a protocol
>     where there are fundamental disagreements about the nature of the
>     deployment models.
>
>     However, this isn't the only factor.  We are not deciding on the
>     merits and value of multipath in a vacuum.  It was pretty clear
>     that multipath has potential, at least in principle, or in certain
>     cases.  I'm also mostly convinced now that we could produce a
>     design.  There's some uncertainty, but it seems like we could
>     tolerate that.  QUIC definitely wasn't a sure thing when we
>     started out, I can't expect any large effort to be risk free.
>
>     So, with some uncertainty about use cases, I might still conclude
>     that we have satisfactory answers to the first two BoF questions. 
>     My concern here is about the third: constituency.
>
>     What I think is most important at this point is understanding if
>     this protocol will remain a single, coherent thing.  That we can
>     keep building on the "synergies" that Spencer referred to.  No
>     matter the technical merits of the protocol (it's great!
>     probably!) that synergy is probably the most important feature
>     that this working group has delivered with QUIC.  The details of
>     the protocol matter less than the fact that we have a group of
>     people committed to building and maintaining that protocol.  This
>     working group needs to be the venue where work happens so that
>     this community can continue to build on this success.
>
>     So for multipath, if we take it on, I'd only like to do so if I
>     was convinced that a non-trivial proportion of the active
>     deployments are committed to working on it and deploying the new
>     extension or version.  That is, that this community wants to do
>     the work.  I see no evidence of that yet, which is why I will
>     claim that this fails to satisfactorily answer that third BoF
>     question.
>
>     It is very easy for a splinter group to define a new version of
>     QUIC that does anything.  draft-deconinck or draft-huitema could
>     be the basis of that sort of effort and that could result in the
>     definition of QUIC 84 or 0x0219c81 or whatever.  Call it QUICv2 if
>     you really want.  But if that protocol is only used in certain
>     narrow contexts, then it doesn't produce any of those synergies. 
>     On the contrary, it works to undermine them, so I would prefer to
>     avoid that.
>
>     So rather than ask whether multipath is doable, I think we need to
>     instead decide what the QUIC working group - the group that built
>     the core protocol - is doing next for that core protocol and the
>     deployments that depend on it.  Personally, I don't think that
>     we're ready for another large project.  We need deployment
>     experience with the protocol.  We also need to go in and backfill
>     those pieces of QUIC we need for the next thing, like version
>     negotiation.  For me, that's more than enough.
>
We had a discussion recently on Slack, comparing the size of the early
prototype implementations, maybe 3,000 LOC, and the size of the current
implementations, often around 30,000 LOC. We discovered and solved a lot
of complex issues during the standardization process, and the increased
code size is an indicator of that additional complexity. Every
additional feature interacts with other features, potentially in
multiplicative ways. Add multipath, and expect the code size to grow
significantly yet again. Hence the call for caution.

>
>     I've now seen a lot of enthusiasm for the idea of multipath. 
>     There were some great presentations with convincing use cases. 
>     There might be too much diversity in use cases or a schism in
>     approaches, but we probably could, with sufficient energy,
>     overcome that.  However, I have to conclude that this is not a
>     good time for starting that work.
>
>     I realize that this is likely unsatisfactory to those who want
>     multipath.  I also recognize that deferring work when there is
>     such clear demand could result in that demand manifesting in a
>     bunch of non-interoperable protocols.  Those are risks that we
>     each have to assess for ourselves.
>
>     This will change over time.  I don't know how long it will take. 
>     But it's not now.
>
Yes. I don't see how we can have a generic multipath solution now, not
without having some clear consensus from application developers about
the features that they want.

-- Christian Huitema