RE: Packet number spaces in multipath (was Re: What to do about multipath in QUIC)

Mikkel Fahnøe Jørgensen <mikkelfj@gmail.com> Fri, 27 November 2020 08:59 UTC

From: Mikkel Fahnøe Jørgensen <mikkelfj@gmail.com>
In-Reply-To: <CAN1APddO1SpBn_NOwXGyxNCoengcVio77McWLJLtaceG9n18Cw@mail.gmail.com>
References: <538215d1-3b9e-4784-920d-03be4c3a503a.miaoji.lym@alibaba-inc.com> <CAHgerOGGyAkE=TbCSuTO=T6HK9EM_+m+ASwPRm=o33HBrx7p3Q@mail.gmail.com> <CANatvzz_KSBws_upnx00P7JK=MbgyDRrR5n2VJcr1_=y=P6dfQ@mail.gmail.com> <062fe812-8afb-d946-8336-1f4dc5ebeaaf@uclouvain.be> <7540ef46-9948-c76c-3617-5755be3cdf37@huitema.net> <CANatvzymE+XRXUMBH2quGi=VEUNXDR_Eoer+o6p9+nkD-KFisQ@mail.gmail.com> <3bb7f359-ebe5-7a54-0224-bb1f5f1754af@huitema.net> <CANatvzxyj3nXP+GrnMkexWV-VN7Og4EGXysq1o0W2e2JGWzDrw@mail.gmail.com> <651e0ae1-0a5e-89e9-55c0-c33439599da6@huitema.net> <CANatvzw4Yg9aX2qyaGfc9sS=oEFOHxp-ZLSLF0EYNa8t6uN-iA@mail.gmail.com> <4b96dbb8-e72c-7f99-0bb3-9ee27b7bda78@huitema.net> <CANatvzz_H205MPP67Vnuqp0mwhM0TUbHvA5CfVGeoivCLcUdgw@mail.gmail.com> <850c5bdd-948e-269a-1488-77a77843d5e6@huitema.net> <CACpbDccY3f2wMd5vFzK=NC=Me=EhgmFWMDS7TTBZFtG2bm=JSg@mail.gmail.com> <LEJPR01MB0635984DC5E548E2D7859A4EFAF90@LEJPR01MB0635.DEUPRD01.PROD.OUTLOOK.DE> <2c6150d9-968c-8c8b-af45-505e9529c910@huitema.net> <LEJPR01MB063529B45C6A1CFFCC8B2BB3FAF80@LEJPR01MB0635.DEUPRD01.PROD.OUTLOOK.DE> <CAN1APddO1SpBn_NOwXGyxNCoengcVio77McWLJLtaceG9n18Cw@mail.gmail.com>
MIME-Version: 1.0
Date: Fri, 27 Nov 2020 00:58:59 -0800
Message-ID: <CAN1APdcoGoGweq4rPaYvGuWLXziVgphVsmj36uNZY71ZPfRB-A@mail.gmail.com>
Subject: RE: Packet number spaces in multipath (was Re: What to do about multipath in QUIC)
To: jri.ietf@gmail.com, huitema@huitema.net, markus.amend@telekom.de
Cc: quic@ietf.org, dirk.von-hugo@telekom.de, kazuhooku@gmail.com
Content-Type: multipart/alternative; boundary="00000000000014db6405b512df58"
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/lhfpyvjW6YZhw6rS8S7JxjpK9OE>

I should perhaps clarify that QUIC does have packet numbers which may or
may not be path specific in a multipath scenario. However, QUIC does not
associate semantics with the packet numbering scheme with the exception
that unacked sufficiently old packet numbers can be considered lost because
it is not possible to track history indefinitely, and also because gaps can
be introduced to detect false acknowledgments (optimistic ack attack).
Specifically, a multi-core QUIC engine and an endpoint with multiple
hardware offloads can send packets out of sequence as an optimisation, and
later the network can of course do the same.
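
As an illustration of the gap technique mentioned above, here is a minimal Python sketch of a sender that occasionally skips a packet number and flags any acknowledgment that covers a skipped number. The class name, the `skip_probability` parameter, and the method names are assumptions made for this example; they do not come from any QUIC implementation or draft.

```python
import random


class PacketNumberAllocator:
    """Hands out increasing packet numbers and occasionally skips one.

    Hypothetical helper for illustration only.
    """

    def __init__(self, skip_probability=0.01, rng=None):
        self.next_pn = 0
        self.skipped = set()  # numbers that were deliberately never sent
        self.skip_probability = skip_probability
        self.rng = rng or random.Random()

    def allocate(self):
        # Occasionally burn a number so that no honest peer can ever ack it.
        if self.rng.random() < self.skip_probability:
            self.skipped.add(self.next_pn)
            self.next_pn += 1
        pn = self.next_pn
        self.next_pn += 1
        return pn

    def ack_is_suspicious(self, acked_pns):
        # An ack covering a number that was never sent suggests an
        # optimistic ack: the peer is acknowledging without receiving.
        return any(pn in self.skipped for pn in acked_pns)
```

A receiver that only acknowledges what it actually received can never trip this check, which is why the skipped numbers carry no other meaning.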

Mikkel

On 27 November 2020 at 09.39.46, Mikkel Fahnøe Jørgensen (mikkelfj@gmail.com)
wrote:

Hi Markus,

QUIC does not have connection sequencing, unlike TCP; that is one of the
key points of QUIC. Only streams carry sequencing.

QUIC Datagram frames do not carry sequencing or even any kind of flow
association. That was discussed, but it was found that the application
could do that at least as well - with the subtle exception of multiple
applications cooperating on the same QUIC connection.

Unreliable streams are something in between where there is stream
association and sequence information, but no delivery guarantees, and
probably no ordering either but the API could choose to offer that at the
end point. Unreliable streams are so far only a concept that has not been
fully explored but are expected to be important for video streaming and
similar use cases.

So you are right that datagram frames can be harder to reason about on
multiple paths, especially if an application makes assumptions based on the
received order. For the QUIC engine, the problem of loss detection is the
same for all QUIC packets and the problem of ordering within a stream gets
more costly due to more reordering, but otherwise remains the same.
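
To make the point about ordering within a stream concrete, here is a minimal Python sketch of offset-based stream reassembly. It assumes segments arrive as (offset, data) pairs, as STREAM frames do, and all names are hypothetical. The buffer of pending segments is the part that grows when multipath introduces more reordering.

```python
class StreamReassembler:
    """Buffers out-of-order stream segments and releases bytes in order.

    Illustrative sketch only; a real engine would also enforce flow
    control limits and use a more efficient interval structure.
    """

    def __init__(self):
        self.next_offset = 0  # first byte not yet delivered to the application
        self.pending = {}     # offset -> bytes waiting for an earlier gap to fill

    def on_segment(self, offset, data):
        """Accept one segment and return the bytes that became deliverable."""
        # Keep the longer segment if two start at the same offset.
        if len(data) > len(self.pending.get(offset, b"")):
            self.pending[offset] = data
        out = bytearray()
        # Sorting on every call is simple but costs more as reordering grows.
        for off in sorted(self.pending):
            if off > self.next_offset:
                break  # still a gap in front of this segment
            seg = self.pending.pop(off)
            end = off + len(seg)
            if end > self.next_offset:
                out += seg[self.next_offset - off:]
                self.next_offset = end
        return bytes(out)
```

With a single path the `pending` map is usually empty or tiny; with two paths of different delay it holds everything sent on the fast path while the slow path catches up.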



Kind Regards,
Mikkel Fahnøe Jørgensen


On 27 November 2020 at 09.20.08, markus.amend@telekom.de (
markus.amend@telekom.de) wrote:

Hi Christian,



OK, good hint. From my understanding that will not change the general issue
of out-of-order reception though.

For clarification, does this mean that a QUIC connection with DATAGRAM
frames will not carry any sequence space or only one, the connection
sequencing? And that using “unreliable” stream frame will provide two
sequence spaces, a connection and a stream space?



I have not found anything about “unreliable” streams in the transport draft.
Is this exactly the same as with DATAGRAM, no HoL at all?



*From:* Christian Huitema <huitema@huitema.net>
*Sent:* Thursday, 26 November 2020 18:54
*To:* Amend, Markus <Markus.Amend@telekom.de>; jri.ietf@gmail.com
*Cc:* quic@ietf.org; von Hugo, Dirk <Dirk.von-Hugo@telekom.de>;
kazuhooku@gmail.com
*Subject:* Re: Packet number spaces in multipath (was Re: What to do about
multipath in QUIC)



If you want to send unreliable data with sequencing information, it might
be simpler to use STREAM frame in "unreliable" mode than to use Datagram
frames.

On 11/26/2020 3:34 AM, Markus.Amend@telekom.de wrote:

Dear all,



Sorry for hijacking this conversation. I’m not very familiar with the
different multipath designs for QUIC, however I want to draw attention to
multipath re-ordering which probably becomes important when multipath is
combined with DATAGRAM.



As long as multipath QUIC is operated with strict reliability (similar to
TCP), re-ordering on receiver side is a simple process known from MPTCP.
Introducing unreliable DATAGRAM transmission makes it more challenging on
receiver side to maintain the packet order, because it is not easy to
differentiate between delayed and lost packets. To avoid HoL, a multipath
re-ordering process may benefit from having connection and path sequencing.
In section 5.6 of
https://tools.ietf.org/html/draft-amend-iccrg-multipath-reordering-01 we
intend to describe how fast packet loss detection can be applied using
these different packet sequence spaces. The description there is not yet
meaningful and will be updated by the next IETF; however, we
have successfully implemented this approach in a MP-DCCP prototype, which
faces similar challenges in terms of re-ordering. That means, fast packet
loss detection is very beneficial for the receiver re-ordering process to
not lose time until an outstanding packet is assumed lost.
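
One way to read the idea of combining connection and path sequencing is sketched below in Python. This is not the algorithm from the draft or the MP-DCCP prototype, whose details are not given here. It assumes each packet carries a connection-wide sequence number assigned in send order plus a per-path sequence number, and that each individual path delivers in order. Under those assumptions a hole in a path's own sequence points to a loss, while a hole that appears only in the connection sequence points to a packet still in flight on a slower path. The function name and tuple layout are hypothetical.

```python
def classify_missing(missing_conn_seq, received):
    """Guess whether a missing connection sequence number is lost or delayed.

    received: iterable of (path_id, path_seq, conn_seq) for packets seen so far.
    Assumes per-path in-order delivery and path sequences starting at 0.
    """
    by_path = {}
    for path_id, path_seq, conn_seq in received:
        by_path.setdefault(path_id, {})[path_seq] = conn_seq

    for seqs in by_path.values():
        prev_conn = -1  # connection sequence of the packet before a hole
        expected = 0    # next path sequence we expect on this path
        for path_seq in sorted(seqs):
            conn = seqs[path_seq]
            # A hole in this path's sequence that brackets the missing
            # connection sequence means the packet was probably dropped here.
            if path_seq != expected and prev_conn < missing_conn_seq < conn:
                return "lost"
            prev_conn = conn
            expected = path_seq + 1
    # No path shows a matching hole, so the packet is probably still in flight.
    return "delayed"
```

The sketch cannot tell which of several missing connection sequence numbers fell into a given path hole, so a real reordering process would need additional bookkeeping before skipping ahead.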





Br



Markus





*From:* QUIC <quic-bounces@ietf.org> *On Behalf Of* Jana Iyengar
*Sent:* Wednesday, 25 November 2020 04:35
*To:* Christian Huitema <huitema@huitema.net>
*Cc:* IETF QUIC WG <quic@ietf.org>; Kazuho Oku <kazuhooku@gmail.com>
*Subject:* Packet number spaces in multipath (was Re: What to do about
multipath in QUIC)



(I'm taking Spencer's suggestion to spin off a new thread.)



Christian, Kazuho,



Slowly catching up on this, and apologies if I'm missing anything that was
previously discussed in the centi-thread earlier.



If I understand the design correctly, it makes sense to me, and is very
close to what we had implemented in Chromium a while ago.



Having thought about this problem several times in the past, I'd like to
share a few points that come to mind.



First though, a point on terminology: the receiver maintains a separate
"ReceivedPackets" for each CID, probably for each CID sequence number
(CSN). Let's please not call this a SACK Dashboard, to avoid confusion.



On the question of sending more than 2^32 packets, I think that resetting
the packet number (PN) is ok on new CIDs. I don't see why a sender would
need to maintain continuity across multiple paths anyways, since the CC and
loss recovery contexts are going to be different across paths. A sender
_could_ still maintain these packets in the same "SentPackets" structure if
it wants to; it would need an internal representation of CSN+PN to key off.
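
A minimal Python sketch of a single "SentPackets" structure keyed by CSN+PN, with the packet number restarting on each new CID, might look like this. The field and method names are assumptions for illustration and are not taken from Chromium or any other implementation.

```python
from dataclasses import dataclass


@dataclass
class SentPacket:
    time_sent: float
    size: int


class SentPackets:
    """One structure for all paths, keyed by (CSN, PN). Hypothetical sketch."""

    def __init__(self):
        self.packets = {}  # (csn, pn) -> SentPacket
        self.next_pn = {}  # csn -> next packet number to use on that CID

    def on_send(self, csn, now, size):
        # The packet number restarts at 0 the first time a CSN is used.
        pn = self.next_pn.get(csn, 0)
        self.next_pn[csn] = pn + 1
        self.packets[(csn, pn)] = SentPacket(time_sent=now, size=size)
        return pn

    def on_ack(self, csn, pn):
        # Returns the record for a newly acked packet, or None if the
        # packet is unknown or was already acknowledged.
        return self.packets.pop((csn, pn), None)
```

Because the key includes the CSN, the same PN can be in flight on two CIDs without ambiguity, which is what makes resetting the PN safe.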



ACK Frames:

------------------

Kazuho pointed out that when acknowledging, the ACK frame format should
include CSN. I agree. I would argue for a design where a receiver uses an
ACK frame per CSN (and encodes the CSN explicitly). There are multiple
benefits to doing this, the primary being that you benefit from compression
when PNs are contiguous within a CSN.
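
The compression argument can be sketched as follows in Python. The functions collapse received packet numbers into ranges and build one acknowledgment per CSN. The tuple representation is an assumption for illustration and is not the wire format of any ACK frame.

```python
def ack_ranges(received_pns):
    """Collapse packet numbers into (smallest, largest) ranges, largest first."""
    ranges = []
    for pn in sorted(set(received_pns), reverse=True):
        if ranges and ranges[-1][0] == pn + 1:
            # pn extends the current range downwards.
            ranges[-1] = (pn, ranges[-1][1])
        else:
            ranges.append((pn, pn))
    return ranges


def build_acks(received):
    """received: dict mapping CSN -> iterable of PNs seen on that CID.

    Returns one (csn, ranges) pair per CSN, standing in for one ACK
    frame per CSN with the CSN encoded explicitly.
    """
    return [(csn, ack_ranges(pns)) for csn, pns in sorted(received.items())]
```

When the PNs within each CSN are contiguous, every per-CSN acknowledgment collapses to a single range, however the packets were interleaved across paths.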



Return Path:

-----------------

There are other ways to decide which return path to send an ACK on,
but I would propose that a receiver respond on the most recently active
forward path. That is, the path on which a packet was most recently
received. This has the natural effect that a sender that wants to
distribute traffic in a particular way also causes ACKs to be distributed
similarly across the corresponding reverse paths.
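
The proposed rule is small enough to sketch directly in Python. The class and the helper below are hypothetical, and the helper assumes one ACK per received packet purely to make the resulting distribution easy to see.

```python
from collections import Counter


class AckPathSelector:
    """Sends ACKs on the path where a packet was most recently received."""

    def __init__(self):
        self.last_rx_path = None

    def on_packet_received(self, path_id):
        self.last_rx_path = path_id

    def path_for_ack(self):
        # None until at least one packet has been received.
        return self.last_rx_path


def ack_distribution(arrivals):
    """Count ACKs per return path, assuming one ACK per received packet."""
    selector = AckPathSelector()
    counts = Counter()
    for path_id in arrivals:
        selector.on_packet_received(path_id)
        counts[selector.path_for_ack()] += 1
    return counts
```

A sender that puts three quarters of its packets on one path will see roughly three quarters of the ACKs come back on the corresponding reverse path, without any explicit signalling.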



RTT measurements:

---------------------------

The return path for ACK frames will impact RTT measurements. That is fine.
It is more important that information reach the sender as soon as possible
than that it should not affect RTT measurements; we can fix the sender
to measure and compensate as necessary. The estimated RTT statistics
reflect the distribution of samples, and if both paths are being used, then
the SmoothedRTT will reflect the expected value based on the traffic
distribution across paths.
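
To see how a single estimator behaves when samples from two paths are mixed, here is a Python sketch of the usual exponentially weighted update with gains of 1/8 for the smoothed RTT and 1/4 for the variation, as in the QUIC recovery draft. The ack-delay adjustment and minimum-RTT tracking are left out to keep the sketch short, and the class name is an assumption.

```python
class RttEstimator:
    """Smoothed RTT and RTT variation from a stream of samples (simplified)."""

    def __init__(self):
        self.smoothed_rtt = None
        self.rttvar = None

    def on_sample(self, rtt):
        if self.smoothed_rtt is None:
            # The first sample initialises both values.
            self.smoothed_rtt = rtt
            self.rttvar = rtt / 2
        else:
            # The variation is updated first, using the old smoothed value.
            self.rttvar = 0.75 * self.rttvar + 0.25 * abs(self.smoothed_rtt - rtt)
            self.smoothed_rtt = 0.875 * self.smoothed_rtt + 0.125 * rtt
        return self.smoothed_rtt
```

Feeding it a repeating pattern of three 20 ms samples and one 100 ms sample makes the smoothed value oscillate around the 40 ms mean of the mix rather than settle on either path's RTT, and the variation term stays large to reflect the spread.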



That said, it might be useful to track some new stats, especially about how
much later a "late ack" -- an ACK frame that contains no useful
information -- is received. I'd have to think a bit more about this, but I
think we can devise a stat here. This gives us useful information on the
longest return path, which we can then explicitly use as part of the PTO
computations, to compensate for the fact that the RTT is based on the
shortest return path. (I would _not_ use this stat in the time-based loss
detection timer,  but PTO ought to be fine.)
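
The stat is left open above, so the following Python sketch is only one possible shape for it and rests on several assumptions. It treats an ACK frame that acknowledges nothing new as late, measures its lateness from the moment the newest packet it covers was first acknowledged, keeps the maximum, and adds that as an extra term to the PTO formula from the recovery draft. The tracker, the `late_ack_allowance` parameter, and the choice of maximum over an average are all hypothetical.

```python
class LateAckTracker:
    """Tracks how much later redundant ACK frames arrive (hypothetical stat)."""

    def __init__(self):
        self.first_acked_at = {}  # (csn, pn) -> time the first ack arrived
        self.max_lateness = 0.0

    def on_ack_frame(self, now, acked):
        """acked: list of (csn, pn). Returns True if the frame had new information."""
        new = [key for key in acked if key not in self.first_acked_at]
        if acked and not new:
            # Late ack: everything it carries was already known.
            learned_at = max(self.first_acked_at[key] for key in acked)
            self.max_lateness = max(self.max_lateness, now - learned_at)
        for key in new:
            self.first_acked_at[key] = now
        return bool(new)


def pto(smoothed_rtt, rttvar, max_ack_delay, late_ack_allowance=0.0,
        granularity=0.001):
    # Base formula from the recovery draft, plus the assumed extra term
    # that compensates for the longest return path.
    return (smoothed_rtt + max(4 * rttvar, granularity) + max_ack_delay
            + late_ack_allowance)
```

The time-based loss detection timer is deliberately left untouched here, in line with the caveat above.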



- jana



On Tue, Nov 17, 2020 at 9:42 AM Christian Huitema <huitema@huitema.net>
wrote:

I have been thinking about variations of that. I think we are making
progress here.

If we follow your design, we get two constraints:

1) That the receiver maintains an acknowledgement list based on the CID
through which the packets are received.

2) That the senders guarantee that the same sequence number will not be
used more than once with a specific CID.

The main implementation cost is for receivers. They have to allocate and
maintain a "SACK Dashboard" in the context of each CID that they issue.

Senders have lots of control. For example, the "only once" condition is
also met if a simple sender uses a single number space, as long as it does
not send more than 2^32 packets. That makes the design reasonably easy to
implement for constrained implementations.
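
The two constraints and the simple-sender case can be sketched together in Python. The receiver keeps one set of received packet numbers per CID, and the simple sender draws every packet number from one shared counter, which trivially guarantees that no number is reused on any CID. Class names are assumptions, and the 2^32 limit is taken from the text above.

```python
class Receiver:
    """Keeps a separate acknowledgement list for each CID it issued."""

    def __init__(self):
        self.received = {}  # cid -> set of packet numbers seen on that CID

    def on_packet(self, cid, pn):
        seen = self.received.setdefault(cid, set())
        if pn in seen:
            return False  # duplicate within this CID's number space
        seen.add(pn)
        return True


class SimpleSender:
    """Uses a single packet number space across all CIDs."""

    LIMIT = 2 ** 32

    def __init__(self):
        self.next_pn = 0

    def next_packet_number(self):
        if self.next_pn >= self.LIMIT:
            raise RuntimeError("packet number space exhausted")
        pn = self.next_pn
        self.next_pn += 1
        return pn
```

From the receiver's point of view each CID then sees a sparse but strictly non-repeating sequence, so the per-CID lists work unchanged; the cost is that the gaps defeat range compression in the acknowledgements.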

Zero-length CIDs are still possible, but that means the receiver supports
only one PN space per sender. Multipath is not impossible, but you end up
managing a single RTT and a single recovery structure. Not very good, but
similar to what happens if multipath is implemented at the IP level.

There is still an issue for coordinating the take down of a path. Suppose
that a client was using both Wi-Fi and LTE, and moves out of Wi-Fi range.
The server will find out eventually that the packets sent to the Wi-Fi path
are never acknowledged, but that may take some time. It would be better if
the client could send a message saying something like "Abandon this path".
That's not the same semantic as "retire this CID". We need a new frame for
that.
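
No such frame is defined in this thread, so the following Python sketch only illustrates what an endpoint might do on receiving a hypothetical "abandon this path" signal: stop scheduling on that path and hand back whatever was in flight there for immediate resending elsewhere, instead of waiting for loss detection to time out. All names are invented for the example.

```python
class PathManager:
    """Tracks in-flight data per path and reacts to a path being abandoned."""

    def __init__(self, path_ids):
        self.in_flight = {path_id: [] for path_id in path_ids}
        self.abandoned = set()

    def usable_paths(self):
        return [p for p in self.in_flight if p not in self.abandoned]

    def on_send(self, path_id, frame):
        if path_id in self.abandoned:
            raise ValueError("path has been abandoned")
        self.in_flight[path_id].append(frame)

    def on_abandon_path(self, path_id):
        """Handle the hypothetical abandon signal from the peer.

        Returns the frames that were in flight on the path so the caller
        can resend them on another path right away.
        """
        self.abandoned.add(path_id)
        stranded = self.in_flight.get(path_id, [])
        self.in_flight[path_id] = []
        return stranded
```

This differs from retiring a CID, which says nothing about whether the underlying path still works.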

"Abandon this path" is an extreme case. There are half-way steps, like
managing the relative priority of a path.