Re: [tsvwg] signaling packet importance [was Re: New Version Notification for draft-herbert-fast]

Tom Herbert <tom@herbertland.com> Fri, 11 August 2023 15:08 UTC

From: Tom Herbert <tom@herbertland.com>
Date: Fri, 11 Aug 2023 08:07:57 -0700
Message-ID: <CALx6S36VBN6BBVxEcsFrQ4CO4dnp4Y1MjwUTHZyFhrVm18cuVg@mail.gmail.com>
To: Sebastian Moeller <moeller0@gmx.de>
Cc: Dan Wing <danwing@gmail.com>, tsvwg <tsvwg@ietf.org>, Sri Gundavelli <sgundave@cisco.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/_jrS0ghP9ZTVWZpv4pisTv2aE9I>

Hi Sebastian,

Thanks for the comments! Please see replies inline below.


On Fri, Aug 11, 2023 at 5:09 AM Sebastian Moeller <moeller0@gmx.de> wrote:
>
> Hi Tom,
>
> more below as [SM]
>
> > On Aug 10, 2023, at 01:36, Tom Herbert <tom=40herbertland.com@dmarc.ietf.org> wrote:
> >
> >
> >
> > On Wed, Aug 9, 2023, 2:22 PM Dan Wing <danwing@gmail.com> wrote:
> > Yes, there is a lot of interest in marking 'important' packets.  Thanks for your comments on the list and at the meeting and thanks for updating FAST.  Thanks to Mike for being a nudnik and poking all the co-authors earlier this week, too.
> >
> > Hi Dan,
> >
> > Thanks for the great discussion! Comments in line.
> >
> >
> >
> >
> > Regarding inapplicability of UDP trailers to TCP, TCP has head-of-line blocking so marking one packet of a TCP flow differently from other packets in that same flow provides no benefit to the flow.
> >
> > Not necessarily. If a host sees a connection is having problems, like seeing retransmission or high variance in RTT, it's their prerogative to try to affect routing to fix the problem. For instance, this is implemented in Linux as a feature where the IPv6 flow label can be changed mid-flow to see if there's a better route. We can do the same thing for signaling in HBH to affect QoS. While we don't want to be sending every packet differently, the ability to change things occasionally for a TCP connection could be very beneficial.
>
>         [SM] This assumes/tests that the network path evaluates the flow label for making routing decisions? Do you have any (even anecdotal) numbers how often changing the flow label of an established connection actually helps getting another path and if there are side-effects like getting the connection terminated?

I believe this is used by some hyperscalers to good effect.
Originally, the feature was enabled by default, but this was breaking
some stateful firewalls that have to see all packets of the flow so
it's now opt-in (of course, these same stateful firewalls would break
on any routing change :-) ).

>
>
> >
> > A *separate* TCP flow could get marked differently (e.g., a file transfer TCP flow versus an interactive audio/video TCP flow), but that differential marking is only possible with multiple TCP connections -- causing its drawbacks of NAPT port consumption, connection set-up time, and per-connection congestion control.  Those drawbacks were some of the justification for creating QUIC, and resolved by QUIC.
> >
> > Right, connection setup/teardown is expensive which is why there's incentive to try to save failing connections as I described above.
> >
> >
> > The problem that user "A" will mark all their packets as important, starving user "B" (the "everyone is an ambulance" problem), requires some sort of permission (or quota or "admission") system -- like FAST.  But that is a problem no matter whether the signal is an IPv6 HbH option, UDP option, DSCP, RSVP, etc.
> >
> > Yep, that's a fundamental problem in host network signaling! This is also where the ticket analogy is helpful: if concert tickets didn't have features to prevent forgery then everyone would be sitting in the front row for the Taylor Swift concert :-)
>
>         [SM] This is why the network by default should only look at information that is likely essential for flows to achieve their goal. So using a 2-, 3-, or 5-tuple for "flow" identification seems safer than including the flow label for IPv6... with the obvious solution mentioned already that there needs to be a trust domain for non-essential information if it is supposed to affect network path behavior.

The problem is that identifying flows in the network by 5-tuple isn't
reliable or robust. To do this, network devices need to be able to parse
the transport layer to get the port numbers, which isn't always possible.
Some packets don't have port numbers or hide them (e.g. IPsec), some
packets disassociate port numbers from the actual transport flow (e.g.
QUIC), and some packets have headers between the IP header and transport
header which make it harder and sometimes infeasible for devices to
parse the transport layer (e.g. extension headers).
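
To make the parsing problem concrete, here is a minimal Python sketch of
what a device has to do to find the ports in an IPv6 packet. It is an
illustration only, not taken from any real device: the function name
five_tuple is hypothetical, and the sketch walks only the Hop-by-Hop,
Routing, and Destination Options headers. Anything else in front of the
transport header (ESP, a Fragment header, an unknown protocol) makes it
give up, which is exactly the failure mode described above.

```python
import struct

# IPv6 Next Header values (IANA protocol numbers)
HOP_BY_HOP, TCP, UDP, ROUTING, DEST_OPTS = 0, 6, 17, 43, 60


def five_tuple(packet: bytes):
    """Return (src, dst, proto, sport, dport), or None if ports can't be found."""
    if len(packet) < 40 or packet[0] >> 4 != 6:
        return None  # not a complete IPv6 fixed header
    next_hdr = packet[6]
    src, dst = packet[8:24], packet[24:40]
    offset = 40
    # Walk the extension headers that share the (Next Header, Hdr Ext Len) layout.
    while next_hdr in (HOP_BY_HOP, ROUTING, DEST_OPTS):
        if offset + 2 > len(packet):
            return None  # truncated extension header
        # Hdr Ext Len counts 8-octet units, not including the first 8 octets.
        next_hdr, ext_len = packet[offset], packet[offset + 1]
        offset += (ext_len + 1) * 8
    if next_hdr not in (TCP, UDP):
        return None  # e.g. ESP (50) or Fragment (44): ports hidden or possibly absent
    if offset + 4 > len(packet):
        return None  # transport header truncated
    sport, dport = struct.unpack("!HH", packet[offset:offset + 4])
    return (src, dst, next_hdr, sport, dport)
```

Note that every extension header costs the parser another bounded read
and another branch, so a device with a fixed parsing budget may stop
before it ever reaches the ports.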

>
> >
> >
> >
> > Tom Herbert wrote:
> >
> >> Can you comment on the requirements and pragmatics of network devices
> >> to process UDP Options. As I mentioned, high performance network
> >> devices process protocol _headers_, it's going to be difficult to
> >> teach them to process trailers efficiently.
> >
> > The IP header length and UDP header length are examined to determine if there is a UDP trailer.  Examining the trailer can be helped by encoding "backwards" to ease parsing as was done by SRTP EKT, https://datatracker.ietf.org/doc/html/rfc8870#section-4.1.
> >
> > But I believe that's an E2E protocol. Do you have any examples of protocols with trailers that are explicitly intended to be processed by intermediate nodes?
> >
> > Middle boxes, especially hardware implementation, aren't designed to handle trailers.
>
>         [SM] Not 100% sure whether it counts, but ATM/AAL5 used a trailer, albeit a trailer at a predictable position (last 8 octets of the last ATM cell of a "packet"). I guess the issue is less "trailer" per se and more "badly predictable" position?
>
>
>
> > They may have to defer processing of UDP Options to software, which, as we've seen time and time again, dooms a data path protocol to the bit bucket! Hosts probably can process trailers, but it's still not nearly as efficient as just processing headers; and host devices with a programmable datapath, like P4-programmable SmartNICs, probably won't be able to process trailers without some major hardware changes. UDP Options might be functional on the Internet, but it's yet to be proven they can be implemented to be sufficiently performant.
> >
> >
> > I agree UDP trailers are not headers, and I agree network devices are quite used to dealing with headers.
> >
> > The differentiated packet handling would occur near the edge of the network close to the subscriber.  In the case of Wi-Fi it occurs on the subscriber's premises; in the case of 5G it is the radio access network.  Those are where queues build up (due to congestion caused by other users) and where radio interference occurs.  Those devices can improve the user experience by dropping less-important packets in favor of more-important packets.  Today, those devices have little-to-no idea which packets are important.  They know TCP packets will eventually get re-transmitted, and they can make guesses that UDP/443 is carrying QUIC and -- today -- has TCP-like behavior and will also re-transmit packets. But as unreliable QUIC [RFC9221] becomes A Thing, the existing treatment of QUIC by those devices will cause harm.  I believe use cases for unreliable QUIC are what is causing the influx of new IETF proposals.  Of course, (S)RTP has long wanted this sort of behavior, too, long prior to unreliable QUIC -- it's just been less interesting as the likes of Zoom, Cisco WebEx, Teams, and Skype have tried to quietly just get their job done.  But they, too, could benefit from UDP trailers.
> >
> > Reliable protocols need QoS just like any other protocols. Just relying on TCP or QUIC retransmissions would set an awfully low bar for QoS! Customers want latency/throughput/jitter guarantees. And if someone is seeing a lot of retransmissions, that's more likely to be a bad thing than a good thing and some action is needed. In E2E protocols, only end hosts have the state to know what's happening to connections, but it's really the network where the problems are and where things can be fixed (such as trying to influence routing as described above). That's exactly why we need host network signaling: the host has the E2E state and understanding of application requirements; the network has the mechanisms and services to satisfy those requirements.
>
>         [SM] There might also be some desire for information transfer from network to the end-nodes, e.g. Arslan, Serhat, and Nick McKeown. ‘Switches Know the Exact Amount of Congestion’. In Proceedings of the 2019 Workshop on Buffer Sizing, 1–6, 2019. makes the point that networks should convey the buffer occupancy in a ~4 bit number per packet so end-points can base their decisions on better insight what is happening in the network.

IOAM provides network-to-host signaling for that purpose (RFC9197). It
also uses Hop-by-Hop options (draft-ietf-ippm-ioam-ipv6-options) so
the arguments about survivability will inevitably be raised.
Conceivably, network options in the UDP surplus area could contain
IOAM options, but that would require that intermediate nodes can
modify bytes in the UDP surplus area which would necessitate major
changes to the UDP Options draft (for instance, OCS handling would
need to be modified).
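
As a rough illustration of why in-flight modification is awkward, here
is a Python sketch that locates the surplus area by comparing the UDP
Length field against the IP payload length, and computes a standard
Internet checksum (RFC 1071) over it. The function names are
hypothetical, and treating OCS as a plain checksum over the surplus
bytes is a simplifying assumption; the UDP Options draft is the
authority on what OCS actually covers.

```python
import struct


def split_udp_surplus(ip_payload: bytes):
    """Split an IP payload into (udp_datagram, surplus_area)."""
    if len(ip_payload) < 8:
        raise ValueError("too short for a UDP header")
    # UDP header: source port, destination port, length, checksum (2 octets each)
    _sport, _dport, udp_len, _csum = struct.unpack("!HHHH", ip_payload[:8])
    if udp_len < 8 or udp_len > len(ip_payload):
        raise ValueError("invalid UDP Length")
    # Whatever the IP layer carries beyond UDP Length is the surplus area.
    return ip_payload[:udp_len], ip_payload[udp_len:]


def internet_checksum(data: bytes) -> int:
    """Ones'-complement sum of 16-bit words, complemented (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero octet
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF
```

Under that assumption, an intermediate node that rewrites even one
octet of the surplus area changes the checksum, so it would have to
recompute or incrementally update it for every packet it touches.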

>
> >
> >
> >> If we use fragment options
> >> to force the UDP Options into headers then the problem becomes that we
> >> have to change all end hosts to support that. There's also potential
> >> problems mixing transport and network options, and in particular this
> >> could affect security and DoS considerations.
> >
> > Changing OS stacks is already necessary for UDP trailers:  the user- and kernel-mode interfaces to UDP don't provide the flexibility to add a UDP trailer (on Windows); instead, raw sockets are necessary.  Once the entire packet is built by hand with raw sockets, fragment options become viable.  That said, I agree we don't want to make changes to host stacks -- and that is one thing I don't like about UDP trailers!
> >
> > Yes, frankly that has long been a concern of mine wrt UDP Options. We haven't seen a lot of running code or deployment. I believe there is a FreeBSD patch, but I haven't seen patches sent to netdev for Linux. Neither do I know of any active deployment of UDP Options. And this concern is only about the current use case of transport options, using UDP Options for network consumption hasn't even been brought up until recently. This situation can be contrasted to a protocol like Segment Routing that referenced at least ten implementations, had Linux patches accepted upstream, and I believe there was significant deployment when SR was published as an RFC -- see https://datatracker.ietf.org/doc/html/draft-matsushima-spring-srv6-deployment-status-15 for a great example of demonstrating the "running code" requirement of IETF;
>
>         [SM] This, while desirable, seems not to be a requirement in tsvwg.

IETF is "Rough consensus and running code". A good example of meeting
the "running code" requirement in tsvwg is when the TCP initial
congestion window was increased to 10 segments (RFC6928). Before the RFC
was even published the change was already supported in multiple OSes and
it was deployed across all of Google's internal and Internet-facing
servers. When we presented the data, one of the comments from the AD
was that they had never seen a protocol change in IETF supported by so
much data! While UDP Options are opt-in, like segment routing in that
regard, they still represent a major change to a core Internet
protocol. And as someone said on the list early on, the bar is high
for such a change.

>
>
>
> > it is quite thorough in describing the implementation, deployment, and interoperability status of segment routing (I really wish there was an equivalent for UDP Options especially if their scope is being expanded to be processed by network devices).
> >
> >
> >
> > On survivability of UDP trailers:  the sender can probe to determine if UDP trailers survive a path, and avoid sending them if the UDP trailer causes failure to receive the packet.  This works because the UDP trailer is an optimization to attempt to help the network deliver improved service.  We do similar probing for IPv6/IPv4 (Happy Eyeballs), TCP/QUIC, and a slew of other protocols as they are pushed out.  Such probing works fine for UDP trailers because they are unlikely to be dropped on the Internet (that is, on transit networks) because security devices and UDP checksum validation are not present on such transit networks.
> > Rather, networks close to the subscriber are where security devices and checksum validation might occur and those networks close to the subscriber are the very same networks where packet priority is more interesting because it's the same place queues build up and packet importance would influence dropping less-important packets.  This creates alignment of the network operator wanting the benefit of UDP trailer (to improve user experience) who can also influence their security vendor and their UDP checksum validation device to permit UDP trailers.
> >
> > That same logic can be applied to HBH Options. The ability of operators to offer differentiated services and monetize them should be sufficient incentive (alignment) to fix their bugs and make host to network signaling work, whether via HBH Options, UDP Options, etc.
> >
> >
> > The "don't care about UDP trailers" behavior of transit networks contrasts with IPv6 HbH options where the transit networks are, today, interfering with IPv6 HbH options, and do not have an aligned incentive to change their IPv6 HbH operations.
> >
> > Probing can be done for HBH Options as well (see draft). There are some differences. UDP Options probing is for establishing feasibility across the path as well as the end host, HBH options probing and Happy Eyeballs are really only probing the path. One likely difference in practice is that UDP Options probing probably needs to be done on a 4-tuple basis because the end host might have an Anycast address that is behind a load balancer (might be a minor point, but will require testing and consideration to implement a robust probing algorithm).
> >
> > Without any deployment of UDP Options and not a lot of implementations, I'm quite skeptical of any statement that UDP Options are any more survivable than HBH Options. We don't really know. And, even if it were true today, there's a very real possibility the providers will block them if they sense the slightest security issue, especially if they can't parse these because they are trailers (i.e. trailers might be perceived to be a covert channel to firewalls that can't process trailers).
>
>         [SM] If I understand correctly, intermediate nodes are permitted to delete Hop-by-hop options? If so that would be one point in favor of UDP options (for the discussed use-cases I think both end-points should know about what signal was sent to the network, if only to help in debugging).

No, intermediate nodes are not allowed to delete Hop-by-Hop options.
They can only modify option data, and only when the change ("chg") bit
in the option type permits it (RFC8200). There was a proposal in 6man to
allow intermediate nodes to delete the Hop-by-Hop Options header, but
that got a lot of pushback.
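
For reference, the per-option behavior is encoded in the Option Type
octet itself (RFC8200, section 4.2): the two highest-order bits say what
a node does if it doesn't recognize the option, and the third-highest
bit says whether the option data may change en route. A small Python
sketch of that decoding follows; the function names are hypothetical.

```python
# Meaning of the two highest-order bits of an IPv6 Option Type (RFC 8200, 4.2)
ACTIONS = {
    0: "skip over this option and continue processing",
    1: "discard the packet",
    2: "discard the packet and send ICMP Parameter Problem",
    3: "discard the packet and send ICMP Parameter Problem "
       "only if the destination is not multicast",
}


def decode_option_type(option_type: int):
    """Return (action, mutable) for an 8-bit IPv6 Option Type."""
    action = (option_type >> 6) & 0x3    # highest-order two bits
    mutable = bool(option_type & 0x20)   # third-highest-order bit
    return action, mutable


def may_modify(option_type: int) -> bool:
    """True if an intermediate node may change this option's data en route."""
    return decode_option_type(option_type)[1]


if __name__ == "__main__":
    # 0x05 is Router Alert, 0xC2 is Jumbo Payload, 0x3E is an experimental type.
    for opt in (0x05, 0xC2, 0x3E):
        action, mutable = decode_option_type(opt)
        print(f"0x{opt:02X}: {ACTIONS[action]}; data may change: {mutable}")
```

There is no bit that permits removal: the only latitude an intermediate
node gets is changing the data of an option that has the third bit set.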

>
> Regards
>         Sebastian
>
>
> >
> > Tom
> >
> >
> >
> >
> > -d
> >
>