Multi-path QUIC Extension Experiments

Alexis Norech <alexisnorech@gmail.com> Mon, 19 July 2021 19:15 UTC

From: Alexis Norech <alexisnorech@gmail.com>
Date: Mon, 19 Jul 2021 23:15:25 +0400
Message-ID: <CAFyq4ZR6VSF4v-qTMZ6t0eTMxYb=P99pBOhzp3UwQXumHdkVKg@mail.gmail.com>
Subject: Multi-path QUIC Extension Experiments
To: Roberto Peon <fenix@fb.com>, Yunfei Ma <yfmascgy@gmail.com>, Charles 'Buck' Krasic <charles.krasic@gmail.com>, Mirja Kuehlewind <mirja.kuehlewind@ericsson.com>, Christian Huitema <huitema@huitema.net>
Cc: Yunfei Ma <yunfei.ma@alibaba-inc.com>, "matt.joras" <matt.joras@gmail.com>, 李振宇 <zyli@ict.ac.cn>, Yanmei Liu <miaoji.lym@alibaba-inc.com>, "lucaspardue.24.7" <lucaspardue.24.7@gmail.com>, quic <quic@ietf.org>, Qing An <anqing.aq@alibaba-inc.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/ykwuuFzEDEo-6OzH571-N1B7aG8>

In the best case, where the data is delivered successfully with minimal
packet loss, using multipath works quite well, and adding overhead should
probably be avoided. In that case, sending copies on both paths is really
not a solution.

On the other hand, in the worst case you need to detect and handle the
packet loss, and waiting for the data on the other path will likely cause
more delay than just sending the same data over a single "good" path.

Resending the "bad path" packets on the "good path" looks like a solution,
but that assumes the traffic is stable, which may not be the case if we are
already in this "bad" situation. It would just append packets to a queue,
and could ultimately create another bottleneck if the "good path" becomes
a "bad path" itself.

The network will sometimes be unstable, so it is best not to add too much
overhead on the paths where the traffic is already unstable.
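
As a rough, back-of-the-envelope illustration of that queueing argument
(all of the rates below are invented numbers, not measurements):

# Illustrative only: if retransmissions from the lossy path are re-injected
# onto the "good" path faster than its spare capacity, the good path's
# queue grows without bound and it turns into a "bad" path too.

good_capacity_pps = 1000   # hypothetical packets/s the good path can carry
good_own_load_pps = 900    # traffic already scheduled on the good path
reinjected_pps    = 300    # retransmissions redirected from the lossy path

spare = good_capacity_pps - good_own_load_pps
backlog = 0
for second in range(1, 6):
    backlog += max(0, reinjected_pps - spare)
    print(f"after {second}s: backlog = {backlog} packets "
          f"(~{backlog / good_capacity_pps * 1000:.0f} ms of extra delay)")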


- A. Cheron

On 7/19/2021 21:30 +0400, Christian Huitema <huitema@huitema.net> wrote:

Yunfei, Yanmei, Qing and their colleagues are pointing out a very important
issue in multipath protocol design: in some cases, using multiple paths
results in worse quality than using a single path. It is rather
important to delineate exactly how that happens, and to understand why
cooperation between transport and application is required to solve the
issue.

For simplification, let's assume a basic "equal cost multipath" scenario,
in which the transport splits the traffic between two paths of equal delay
and equal capacity. When both paths are working well, data arrives twice as
fast as over a single path, and everyone is happy. But of course, things do
not keep working well all the time, and that is when we will see bad
results. For example:

* if one path experiences packet losses and the other does not, some
packets sent on the lossy path will have to be transmitted again. This
will cause delays, and may well cause "head of line blocking".

* if one path suddenly slows down, maybe because of radio issues, it
will take some time for congestion control algorithms to detect the drop
in capacity. Queues will build up during that time, which again may well
cause HoL blocking.

One extreme solution is to always send copies of the data on both paths,
and let the application use whichever copy arrives first. That works,
but at the cost of massive overhead. We don't want to incur that
overhead all the time. I like the suggestion of using application state
to drive when the transport should use this kind of redundancy, and when
it should not.
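
As a purely illustrative sketch, assuming a hypothetical per-frame deadline
exposed by the application (the names and thresholds below are invented,
not taken from any draft):

# Illustrative sketch: let application state decide when redundancy is
# worth its overhead. All names and thresholds are hypothetical.

def send_frame(frame_bytes, deadline_ms, est_one_way_ms, paths, send):
    """Duplicate on both paths only when the deadline margin is small."""
    margin = deadline_ms - est_one_way_ms
    if margin < 50:                      # e.g. imminent rebuffer risk
        for p in paths:                  # redundant copies, first one wins
            send(p, frame_bytes)
    else:                                # normal case: single, fastest path
        send(min(paths, key=lambda p: p.rtt_ms), frame_bytes)

class P:                                 # minimal stand-in for a path
    def __init__(self, name, rtt_ms): self.name, self.rtt_ms = name, rtt_ms

log = []
send_frame(b"frame", deadline_ms=40, est_one_way_ms=10,
           paths=[P("wifi", 30), P("lte", 120)],
           send=lambda p, data: log.append(p.name))
print(log)   # ['wifi', 'lte'] -- duplicated because the margin was tight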

-- Christian Huitema

On 7/18/2021 2:51 PM, Roberto Peon wrote:

It sounds like this problem is not inherent to single-connection
multi-path, but will be present in any multi-path implementation, including
multiple-tcp-connections used with application-layer muxing.
If this is correct, then it isn’t really a ‘QUIC’ problem, but rather an
implementation/scheduling/CC problem.
That isn’t saying that it isn’t interesting or important to solve, but
rather that the protocol itself need not change to solve the problem for
generic-QUIC transport.

H3, OTOH, may suffer from this at L7+ proxies without some changes to QUIC
or H3, but that is a much longer conversation that doesn’t require
multi-path to happen.

-=R

From: Yunfei Ma <yfmascgy@gmail.com>
Date: Sunday, July 18, 2021 at 1:17 AM
To: Charles 'Buck' Krasic <charles.krasic@gmail.com>, Mirja Kuehlewind <
mirja.kuehlewind@ericsson.com>, Roberto Peon <fenix@fb.com>
Cc: "matt.joras" <matt.joras@gmail.com>, 李振宇 <zyli@ict.ac.cn>, Christian
Huitema <huitema@huitema.net>, Yanmei Liu <miaoji.lym@alibaba-inc.com>,
"lucaspardue.24.7" <lucaspardue.24.7@gmail.com>, quic <quic@ietf.org>, Qing
An <anqing.aq@alibaba-inc.com>, Yunfei Ma <yunfei.ma@alibaba-inc.com>
Subject: Re: Multi-path QUIC Extension Experiments

Hi Charles, Roberto, and Mirja:

Thanks a lot for your questions. As all three of you are curious about the
definition of MP-HoL, I am putting my answer into one reply.

Short answer: MP-HoL is not caused by flow control; rather, it is related
to the nature of path heterogeneity. In other words, MP-HoL can happen even
when the flow control limit is not reached (as pointed out by Charles, you
can set a large limit on the client side).

More specifically, when you want to send packets on different paths at the
same time, a scheduler decides how to split your packets and put them on
the different paths. However, in mobile networks the paths can have very
different delays. MP-HoL blocking arises when packets sent earlier on the
slow path arrive later than packets sent later on the fast path, causing
out-of-order arrival. As a consequence, the out-of-order packets cannot yet
be delivered to the application, so the fast path has to wait.

For example, say we want to send two packets that belong to the same video
frame using a min-RTT scheduler, which is the default in MPTCP. For each
packet, the scheduler selects a path on which to transmit it. The selection
has two criteria: (1) the path's congestion window is not full, and (2) the
selected path has a smaller RTT than the other. If, at the moment of
transmission, the fast path's cwnd happens to be full (some traffic has
been sent before), the first packet is put on the slow path by the
scheduler. Later, an ACK is received and the fast path becomes available
again, so the scheduler puts the second packet on the fast path. As a
result, the packets arrive out of order.
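
To make that concrete, here is a minimal, purely illustrative Python sketch
of such a min-RTT scheduler (the path names, RTTs and cwnd values are
invented; this is not our implementation):

# A path is eligible if its congestion window has room; among eligible
# paths, the one with the smallest RTT is chosen.

class Path:
    def __init__(self, name, rtt_ms, cwnd, in_flight=0):
        self.name = name
        self.rtt_ms = rtt_ms
        self.cwnd = cwnd            # packets allowed in flight
        self.in_flight = in_flight

    def has_room(self):
        return self.in_flight < self.cwnd

def min_rtt_schedule(paths):
    eligible = [p for p in paths if p.has_room()]
    if not eligible:
        return None                 # all cwnds full: the packet must wait
    return min(eligible, key=lambda p: p.rtt_ms)

# Hypothetical numbers: the fast path's cwnd is momentarily full, so packet
# 1 of the frame goes on the slow path; by the time packet 2 is scheduled,
# an ACK has freed room on the fast path.
fast = Path("wifi", rtt_ms=30, cwnd=10, in_flight=10)
slow = Path("lte", rtt_ms=120, cwnd=10, in_flight=2)

p1 = min_rtt_schedule([fast, slow]); p1.in_flight += 1   # -> lte (slow)
fast.in_flight -= 1                                      # ACK frees the cwnd
p2 = min_rtt_schedule([fast, slow]); p2.in_flight += 1   # -> wifi (fast)

print(p1.name, p2.name)   # packet 1 on the slow path, packet 2 on the fast
                          # path: packet 2 likely arrives first (MP-HoL)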

What makes the problem even more difficult is that in mobile networks the
RTTs can change quickly, which makes accurate prediction very hard. The
worst case is when the scheduler thinks it is using the fast path but is
actually using the slow path instead. As you can see, in order to make
multi-path transport efficient, it is important to solve this problem, and
that is what we are doing in this project.

I hope I have answered your questions. If not, please let me know.

Cheers,
Yunfei



On Fri, Jul 16, 2021 at 12:51 PM Charles 'Buck' Krasic
<charles.krasic@gmail.com> wrote:

"don't overcommit" includes the common practice of setting very large
limits on the client side, where in aggregate the case of server being flow
control limited is effectively non-existent.

I am curious to hear a clarification of the precise definition of MP-HoL
blocking here. Is it not flow control, but rather path aliasing, where
distinct paths are actually sharing some physical link(s)?

On Fri, Jul 16, 2021 at 12:13 PM Roberto Peon
<fenix=40fb.com@dmarc.ietf.org> wrote:

I too am curious!
There are only two ways to handle flow control—overcommit, or don’t
overcommit.

The “don’t overcommit” choice leads to blocking, since any of that resource
allocated to one path can’t be used by the other.
The “overcommit” choice either leads to OOM or to throwing out some
successfully transmitted and received data.

Underlying this is a fun question: Which inefficiency is worse? Not using
resources that should be used (i.e. from choosing to not overcommit), or
sometimes redundantly using a resource (from choosing to overcommit)?
I’m curious too about what implementation strategies we end up adopting in
general around this and, if enough implementations choose overcommit,
whether we need some different protocol mechanisms to bound the redundancy.
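
For illustration only, one shape such a bound could take is a per-path
split of the connection-level credit with a capped overcommit factor (the
knob and the numbers below are hypothetical, not something from any draft):

# Split a connection-level flow-control credit across paths, with a bounded
# overcommit factor rather than full duplication. factor 1.0 = never
# overcommit; factor 2.0 = full duplication across two paths.

def per_path_credits(max_data, path_weights, overcommit_factor=1.25):
    """Return per-path send credits whose sum is at most
    overcommit_factor * max_data."""
    total_weight = sum(path_weights.values())
    budget = max_data * overcommit_factor
    return {name: int(budget * w / total_weight)
            for name, w in path_weights.items()}

print(per_path_credits(1_000_000, {"wifi": 3, "lte": 1}))
# {'wifi': 937500, 'lte': 312500} -- at most 25% of the receiver's credit
# can end up committed redundantly, bounding both the wasted capacity and
# the worst-case receive-buffer overcommit.
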
-=R

From: QUIC <quic-bounces@ietf.org> on behalf of Mirja Kuehlewind
<mirja.kuehlewind=40ericsson.com@dmarc.ietf.org>
Date: Friday, July 16, 2021 at 6:15 AM
To: "Ma, Yunfei" <yunfei.ma=40alibaba-inc.com@dmarc.ietf.org>, Robin MARX
<robin.marx@uhasselt.be>, Yanmei Liu <miaoji.lym@alibaba-inc.com>
Cc: "matt.joras" <matt.joras@gmail.com>, 李振宇 <zyli@ict.ac.cn>, Christian
Huitema <huitema@huitema.net>, "lucaspardue.24.7"
<lucaspardue.24.7@gmail.com>, quic <quic@ietf.org>, Qing An
<anqing.aq@alibaba-inc.com>
Subject: Re: Multi-path QUIC Extension Experiments

Hi Yunfei,

thanks as well for sharing your results! Can you explain a bit more what
you mean by MP-HoL blocking? Is this because of the flow control limits? If
so, wouldn’t it make sense to reserve a certain “space” for each path?

Mirja


From: QUIC <quic-bounces@ietf.org> on behalf of "Ma, Yunfei"
<yunfei.ma=40alibaba-inc.com@dmarc.ietf.org>
Date: Thursday, 15. July 2021 at 04:18
To: Robin MARX <robin.marx@uhasselt.be>, Yanmei Liu
<miaoji.lym@alibaba-inc.com>
Cc: "matt.joras" <matt.joras@gmail.com>, 李振宇 <zyli@ict.ac.cn>, Christian
Huitema <huitema@huitema.net>, "lucaspardue.24.7"
<lucaspardue.24.7@gmail.com>, quic <quic@ietf.org>, Qing An
<anqing.aq@alibaba-inc.com>
Subject: Re: Re: Multi-path QUIC Extension Experiments

Hi Robin,

Thanks so much for your questions!

First, the head-of-line blocking discussed here is called multi-path
head-of-line blocking, or MP-HoL blocking, and its root cause is quite
different from the stream HoL blocking usually discussed for QUICv1. MP-HoL
blocking happens when one path blocks the other path, not when one stream
blocks another stream. Please note that we do indeed use multiple streams;
for example, different video requests are carried in different QUIC
streams. QUIC’s stream multiplexing ability and its benefits still hold in
this scenario.

Second, regarding the packet scheduling mode: right now, in our Taobao A/B
test, we transmit packets on multiple paths simultaneously. However, you
can definitely use traffic switching only and choose to switch when one
path cannot meet your bandwidth requirement. Basically, if you use multiple
paths simultaneously, you get the most elasticity from a resource pooling
perspective. It really comes down to what your application needs. We will
also update the packet scheduling section soon in a newer version of the
draft, in which we plan to include more discussion of the packet scheduling
policy.
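
As a purely illustrative sketch of the "traffic switching only" policy
mentioned above (the estimator and thresholds below are invented):

# Use the second path only while the primary path cannot meet the
# application's required bitrate. All values are hypothetical.

def active_paths(required_kbps, primary_est_kbps, secondary_available,
                 headroom=1.2):
    """Return which paths to schedule on for the next interval."""
    if primary_est_kbps >= required_kbps * headroom:
        return ["primary"]                    # a single path is enough
    if secondary_available:
        return ["primary", "secondary"]       # pool capacity from both
    return ["primary"]

print(active_paths(required_kbps=2500, primary_est_kbps=4000,
                   secondary_available=True))  # ['primary']
print(active_paths(required_kbps=2500, primary_est_kbps=1800,
                   secondary_available=True))  # ['primary', 'secondary']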

Third, regarding the benefits of more bandwidth versus the "downsides":
whether you want more bandwidth depends on your application. For videos,
yes, more bandwidth is extremely helpful in improving long-tail QoE, which
is an important target for Taobao. We find multi-path QUIC helps us improve
two important metrics, rebuffer rate and video start-up delay. In the past,
with multi-path scheduling that does not collaborate closely enough with
the application, such as MPTCP, MP-HoL blocking becomes the downside that
cripples the performance. However, the user-space nature of QUIC gives us
the opportunity to solve this problem, so our conclusion now is that you
can enjoy the benefits of more bandwidth and more reliable connectivity
from multi-path without much of the “downsides”.

I hope my answer is helpful, but feel free to let me know if you have any
additional comments.

Cheers,
Yunfei

from Alimail macOS <https://mail.alibaba-inc.com/>
------------------Original Mail ------------------
Sender: Robin MARX <robin.marx@uhasselt.be>
Send Date: Wed Jul 14 07:39:37 2021
Recipients: Yanmei Liu <miaoji.lym@alibaba-inc.com>
CC: quic <quic@ietf.org>, Ma, Yunfei <yunfei.ma@alibaba-inc.com>, Christian
Huitema <huitema@huitema.net>, Qing An <anqing.aq@alibaba-inc.com>, 李振宇
<zyli@ict.ac.cn>, matt.joras <matt.joras@gmail.com>, lucaspardue.24.7
<lucaspardue.24.7@gmail.com>
Subject: Re: Multi-path QUIC Extension Experiments

Hello Yanmei,

Thanks for the additional results on an interesting topic. I'm looking
forward to reading the SIGCOMM paper.

I was a bit surprised to (apparently) see HOL blocking mentioned as a major
issue, as that's one of the things QUIC aims to be better at than TCP.
It's a bit difficult to understand from the slides, but it seems like
you're sending packets for a single stream (Stream ID 1 in the diagrams) on
both the slow and fast path, which would indeed induce HOL blocking.
Consequently, I was wondering what the practical reasons are for you to
multiplex packets for a single stream over multiple paths, as opposed to
for example attaching a single stream to a single path (say: high priority
streams use the fast path for all their packets).

I see this mentioned a bit in the draft under "packet scheduling", where it
talks about switching paths once the cwnd is full for one. That indeed
leads to the behaviour seen in the slides, but that's my question: why
would you take those approaches then?
Are there so many cases where the additional "bandwidth" from using
multiple paths' cwnds for a single stream outweighs the downsides of HOL
blocking? Relatedly: what are the packet loss rates you've observed on real
networks?
Have you experimented with, e.g., tying streams to paths more closely? Does
that work better or worse? Why?

I'm mainly wondering how these tradeoffs evolve depending on the type of
paths available and if it's possible to make a model to drive this logic.
I assume there is much existing work on this for MPTCP, but I also assume
some of that changes due to QUIC's independent streams / stream
prioritization flexibility.

Thank you in advance and with best regards,
Robin


On Sun, 11 Jul 2021 at 20:48, Yanmei Liu
<miaoji.lym=40alibaba-inc.com@dmarc.ietf.org> wrote:

Hi everyone,

We have finished some experiments on deploying the multi-path QUIC
extension (https://datatracker.ietf.org/doc/draft-liu-multipath-quic/) in
Alibaba Taobao short-form video streaming, and the experiment results are
summarized in the attached slides.
If anyone is interested in the experimental details of multi-path QUIC,
please let us know.
All feedback and suggestions are appreciated!

Best regards,
Yanmei


--

dr. Robin Marx
Postdoc researcher - Web protocols
Expertise centre for Digital Media

Cellphone +32(0)497 72 86 94

www.uhasselt.be
Universiteit Hasselt - Campus Diepenbeek
Agoralaan Gebouw D - B-3590 Diepenbeek
Kantoor EDM-2.05
