Re: [tsvwg] Challenges for host to network signaling via UDP Options

"C. M. Heard" <heard@pobox.com> Sat, 29 July 2023 21:12 UTC

References: <CALx6S35mpz12dUtWSPLiK4pvi4S+qacjhiL=Wa6=QiZkVjj-fw@mail.gmail.com>
In-Reply-To: <CALx6S35mpz12dUtWSPLiK4pvi4S+qacjhiL=Wa6=QiZkVjj-fw@mail.gmail.com>
From: "C. M. Heard" <heard@pobox.com>
Date: Sat, 29 Jul 2023 14:12:36 -0700
X-Gmail-Original-Message-ID: <CACL_3VF4MDcR+4nQssWe6LmKZdaxVATyf6efFcqzjq5QKMR4Cw@mail.gmail.com>
Message-ID: <CACL_3VF4MDcR+4nQssWe6LmKZdaxVATyf6efFcqzjq5QKMR4Cw@mail.gmail.com>
To: Tom Herbert <tom@herbertland.com>
Cc: tsvwg <tsvwg@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/SpcVd6EB1Zi6FUhhyn2-o6nxuq4>
Subject: Re: [tsvwg] Challenges for host to network signaling via UDP Options

On Fri, Jul 28, 2023 at 10:49 AM Tom Herbert wrote:
> In the tsvwg meeting there were two proposals for using UDP Options to
> allow hosts to signal the network
> (draft-kaippallimalil-tsvwg-media-hdr-wireless and
> draft-reddy-tsvwg-explcit-signal). While this is an interesting and
> creative approach, as I mentioned at the mic, I fear this path will be
> fraught with challenges. I'd like to elaborate on those challenges.

Thank you for starting this conversation. I had intended to do
something along the same lines because of my deep ambivalence
regarding the proposals to use "UDP as the new IP," as someone
else said at the mic.

> Some of the fundamental challenges I see are:
>
> 1) UDP Options may not be deployable on the Internet (or any more
> deployable than HBH Options)
>
> The ostensible advantage of using UDP Options, or more generally UDP
> surplus space, compared to Hop-by-Hop Options, is that they are
> less likely to be dropped by intermediate nodes. I say this is
> ostensible since we really don't yet have any deployment data on UDP
> Options that confirms the assertion, and I believe that even if they
> pass through the network today, there is a real risk that network
> providers will block them tomorrow if they sense they are a potential
> problem or risk-- it is an explicit policy of some providers to only
> allow a minimum set of protocols that they believe are
> useful and acceptable.

We do not have any deployments of UDP Options, but we DO have some
evidence, from November 2018, that packets with a non-empty UDP
surplus space have a reasonably good (but by no means perfect) chance
of passing through middleboxes in the general Internet, under certain
circumstances. See:

https://tma.ifip.org/2020/wp-content/uploads/sites/9/2020/06/tma2020-camera-paper70.pdf

for the results of the 2018 study by Zullo, Jones, and Fairhurst.

The principal finding was that the most common reason for dropping
packets with a non-empty surplus area is that middleboxes often
attempt to validate the UDP checksum but frequently do so
incorrectly: they use the IP Payload Length instead of the UDP
Length both for the length value in the pseudo-header and for the
number of bytes included in the checksum. Using an OCS (Option
Checksum) that is specifically designed to compensate for this
anomaly improves traversal from unacceptably bad to at least fair
(over 70% at worst to better than 90% at best).
This can be seen in the attached excerpt from Figure 4 of that paper.
[image: image.png]
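
To illustrate the checksum anomaly described above, here is a rough
Python sketch (not taken from the paper or from the UDP Options draft).
It assumes IPv4, even-length user data and options, and a simplified
two-byte compensation field at the start of the surplus area; the real
OCS has its own placement and alignment rules. The function names,
addresses, and ports are made up for the example. The idea is that if
the surplus area, together with its own length, sums to zero in one's
complement arithmetic, then a middlebox that wrongly uses the IP
Payload Length still arrives at a valid UDP checksum.

```python
import struct


def ones_complement_sum(data: bytes) -> int:
    """Fold 16-bit big-endian words into a 16-bit one's complement sum."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input, as the Internet checksum does
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src: bytes, dst: bytes, length: int) -> bytes:
    # IPv4 pseudo-header: source, destination, zero, protocol 17 (UDP), length
    return src + dst + struct.pack("!BBH", 0, 17, length)


def build_datagram(src: bytes, dst: bytes, sport: int, dport: int,
                   payload: bytes) -> bytes:
    """Build a UDP header plus payload with a correctly computed checksum."""
    udp_len = 8 + len(payload)
    header = struct.pack("!HHHH", sport, dport, udp_len, 0)
    covered = pseudo_header(src, dst, udp_len) + header + payload
    csum = ~ones_complement_sum(covered) & 0xFFFF
    csum = csum or 0xFFFF  # a transmitted 0 would mean "no checksum"
    return struct.pack("!HHHH", sport, dport, udp_len, csum) + payload


def compensating_surplus(options: bytes) -> bytes:
    """Prefix even-length options with a field that makes the surplus area
    plus its own length sum to zero (simplified stand-in for the OCS)."""
    surplus_len = 2 + len(options)
    partial = ones_complement_sum(options + struct.pack("!H", surplus_len))
    return struct.pack("!H", ~partial & 0xFFFF) + options


def naive_middlebox_accepts(src: bytes, dst: bytes, ip_payload: bytes) -> bool:
    """Incorrect validation: uses the IP payload length both in the
    pseudo-header and as the number of bytes covered."""
    covered = pseudo_header(src, dst, len(ip_payload)) + ip_payload
    return ones_complement_sum(covered) == 0xFFFF


SRC, DST = bytes([192, 0, 2, 1]), bytes([192, 0, 2, 2])
datagram = build_datagram(SRC, DST, 40000, 50000, b"hello!")
options = b"\x01\x01\x01\x01"  # placeholder option bytes

no_surplus = datagram
uncompensated = datagram + b"\x00\x00" + options
compensated = datagram + compensating_surplus(options)

for label, packet in [("no surplus", no_surplus),
                      ("uncompensated", uncompensated),
                      ("compensated", compensated)]:
    print(label, naive_middlebox_accepts(SRC, DST, packet))
```

With these inputs the naive check accepts the datagram that has no
surplus area, rejects the one with an uncompensated surplus area, and
accepts the compensated one.
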
The study found other reasons why packets with a surplus area were dropped. For
example: "We contacted the manufacturer of one device that blocked
UDP-O. They confirmed that their middleboxes performed a consistency
check between the IP and UDP length, along with other integrity checks
on datagrams and discarded them in the case of a length mismatch."
That behaviour totally blocks UDP Options, and it's not unreasonable
that it would still be found in certain places. On the other hand,
the traversal rates found in this study are dramatically better than
those found for IPv6 Hop-by-Hop options in other studies. For example,

https://www.potaroo.net/presentations/2022-03-01-ipv6-frag.pdf

reports a 99.5% HbH drop rate in February 2022.

So, while UDP Options may end up being undeployable owing to factors
that either were not considered in the 2018 study or that have changed
since then or will change in the future, the best available evidence
does suggest to me that for the near future at least they have a better
chance of working in the open Internet than IPv6 Hop-by-Hop options.

For that reason, I have some sympathy for the "UDP as the new IP" point
of view, despite my general antipathy for putting things into transport
protocols that are designed for middleboxes to exploit. I'll have more
to say on this in my reply to #4 below.

> 2) UDP Options are not amenable to efficient implementation,
> especially in hardware
[...]
> A common technique used in router hardware implementation is "parsing
> buffers". [...]  The protocol processing logic reads data in the
> parsing buffer to make routing and filtering decisions (it's like a
> data cache for the headers).
[...]
> UDP Options are expressly protocol trailers which means they will
> almost never fit into parsing buffers which only contain headers. To
> process them in hardware, without resorting to the dreaded software
> slow path, will require some drastic hardware changes to many hardware
> implementations. It's going to be difficult for vendors to justify the
> cost to provide that support.

That, precisely, was why I made the following comment on
draft-kaippallimalil-tsvwg-media-hdr-wireless:

   There is a way around this, which is to send the option as a
   per-UDP-fragment option, but that renders the packet incapable of
   being processed by an endpoint that does not understand UDP options.
   On the other hand, if this works for your use case, and nothing else
   does, it would provide a raison d'être for a vexing feature in the
   UDP options specification whose utility some of us have questioned.

I should point out that -- to the best of my knowledge -- the 2018
study cited above did not test the case UDP Length == 8, where no
conventional UDP data is present. It's possible that this case would
be treated differently than UDP Length > 8. Perhaps one of the authors
could comment on that.
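
For readers less familiar with the surplus area, the following Python
sketch shows how it falls out of the two length fields, including the
UDP Length == 8 case. It is only an illustration under simple
assumptions: the function name split_udp is hypothetical, the input is
the complete IP payload, and the IPv6 jumbogram case (UDP Length == 0)
is ignored.

```python
import struct


def split_udp(ip_payload: bytes):
    """Return (user_data, surplus) for a UDP packet carried in ip_payload."""
    if len(ip_payload) < 8:
        raise ValueError("truncated UDP header")
    _sport, _dport, udp_len, _csum = struct.unpack("!HHHH", ip_payload[:8])
    if udp_len < 8 or udp_len > len(ip_payload):
        raise ValueError("invalid UDP Length")
    # User data ends where UDP Length says it does; anything the IP layer
    # carried beyond that point is the surplus area.
    return ip_payload[8:udp_len], ip_payload[udp_len:]


# Conventional case: 4 bytes of user data followed by 4 bytes of surplus.
with_data = struct.pack("!HHHH", 40000, 50000, 12, 0) + b"data" + b"\x01\x01\x01\x01"

# UDP Length == 8: header only, so everything after it is surplus.
header_only = struct.pack("!HHHH", 40000, 50000, 8, 0) + b"\x01\x01\x01\x01"

print(split_udp(with_data))
print(split_udp(header_only))
```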

Again, I'll have more to say on this in my reply to #4 below.

> 3) Robustly identifying packets that contain UDP Options as opposed to
> something else in the surplus area is going to be difficult in
> intermediate nodes.
>
> There was never a prohibition against using the UDP surplus
> area, and there is no codepoint in the UDP header that indicates the
> protocol of the surplus area. This makes it difficult to elicit a
> robust protocol that uses UDP Options.

Given that there are no known uses of the surplus area, this point
does not greatly concern me, at least for options that are intended
to be used only by the end hosts. Granted, saying that the surplus
area has no known uses does not mean that it is known to be unused.

> The problem of identifying UDP Options is significantly harder at
> network nodes than end hosts:
>
> First, I believe device vendors will push back if they are required to
> calculate checksums over the surplus area; this is especially true if
> there is no limit to the size of the surplus area (for example, we
> might convince them that doing a checksum over max 60 bytes of IPv4
> header is okay, but checksumming 1000 bytes of surplus area may be
> a hard sell).
>
> Trying to discriminate protocols based on port numbers, or some sort
> of connection tracking with UDP, is not robust and really doesn't work
> at intermediate nodes-- from RFC7605:  "It is important to recognize
> that any interpretation of port numbers -- except at the endpoints --
> may be incorrect". So if UDP Options are processed by intermediate
> nodes, we really need UDP Options to somehow be self-identifying.
>
> An alternative to the checksum that I have proposed is to embed a
> magic number that identifies the surplus area as being UDP Options.
> The use of a magic number was proposed in SPUD to allow intermediate
> nodes to identify SPUD in the UDP payload, the same technique could be
> used in the surplus area.

A magic number could be used, but it would have to be IN ADDITION TO
the OCS, not instead of it, owing to known middlebox traversal issues.

> 4) UDP Options are defined to be transport options, where network
> signaling is more aptly described as network options. Mixing them
> together creates problems.

YES!!! Therein lies a possible solution to a long-festering problem.

Recall that in a comment above I described per-UDP-fragment
options as "a vexing feature." The reason that I call them "vexing"
is that the spec leaves it to one's imagination how much control
the application or upper layer protocol is supposed to have in
specifying them and, more to the point, how their presence and
values are supposed to be communicated to the application or
upper layer protocol. Given that the interface to the application
or upper layer protocol is in principle a per-datagram and not a
per-fragment interface, what to do is not always obvious. To me,
this makes a clean "bump in the stack" design for UDP fragmentation
and reassembly hard to achieve.

The same quandary does not occur for per-datagram options. For
those, the UDP Options logic is expected to reassemble all
fragments into a complete datagram, which then undergoes further
processing to extract the user data and process any options in the
surplus area. The latter, except for unrecognized UNSAFE options,
are passed to the application or upper layer protocol, which takes
such action on them as it chooses; unrecognized UNSAFE options cause
the reassembled datagram to be discarded without further action. This
works the same as for per-datagram options in an unfragmented datagram.
In fairness, it is only as simplified in

https://mailarchive.ietf.org/arch/msg/tsvwg/0RptobbqczCVoOy-B8L_CJaj25w/

that the current fragmentation and reassembly scheme achieves
precisely this; but the proposed simplification is consistent with
the -18 draft.
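
The per-datagram handling described above can be sketched roughly as
follows in Python. This is an illustration only, not text from the
draft: the (kind, value) tuple representation, the function name
deliver_options, and the use of 192 as the lowest UNSAFE kind are
assumptions made for the example.

```python
from typing import List, Optional, Sequence, Set, Tuple

Option = Tuple[int, bytes]  # (kind, value); a simplified representation

UNSAFE_MIN_KIND = 192  # assumption: kinds at or above this value are UNSAFE


def deliver_options(options: Sequence[Option],
                    recognized: Set[int]) -> Optional[List[Option]]:
    """Return the options to pass up with a reassembled datagram, or None
    if the datagram must be discarded."""
    for kind, _value in options:
        if kind >= UNSAFE_MIN_KIND and kind not in recognized:
            return None  # unrecognized UNSAFE option: drop the whole datagram
    # Everything else is handed to the application or upper layer protocol,
    # which decides what to do with it.
    return list(options)


recognized_kinds = {1, 200}
print(deliver_options([(1, b""), (50, b"\x00")], recognized_kinds))
print(deliver_options([(1, b""), (200, b"\x00")], recognized_kinds))
print(deliver_options([(1, b""), (201, b"\x00")], recognized_kinds))
```

The same function applies whether the datagram arrived unfragmented or
was reassembled first, which is the point of the paragraph above.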

> When an intermediate node inspects UDP Options it is searching for
> network options. If a UDP Options list contains transport options the
> intermediate node has no interest in them and it should ignore them--
> but ignoring options isn't free!

And the converse problem afflicts an end node that implements UDP
fragmentation and reassembly as a "bump in the stack" sublayer.

The idea that comes to mind, then, is this:

- network layer options are always transmitted as per-fragment
  options, where they are in a header that is relatively easy
  for intermediate systems to parse

- in analogy to IPv4 or IPv6, make a convention that only the
  per-fragment options on the terminal (or initial) fragment
  are passed to the user (if they are passed up at all; that
  may depend on the specific option)

- transport layer options are always transmitted in the surplus
  area of the reassembled datagram

The two types of options would not be mixed. That would require
designating each option as one or the other, which does not fit
well with the concept of UDP Options as a framework. But that
idea has already run into problems with AUTH and UENC.
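
To make the "designate each option as one or the other" idea a bit more
concrete, here is a small Python sketch. Everything in it is
hypothetical: the option names, the registry, and the function
place_options are invented for illustration and do not correspond to
any option defined in the draft.

```python
from enum import Enum
from typing import Dict, List, Tuple


class Scope(Enum):
    NETWORK = "network"      # meant for intermediate systems
    TRANSPORT = "transport"  # meant for the receiving endpoint only


# Hypothetical registry: names and assignments are invented for illustration.
OPTION_SCOPE: Dict[str, Scope] = {
    "EXAMPLE_NET_SIGNAL": Scope.NETWORK,
    "EXAMPLE_E2E_TIMESTAMP": Scope.TRANSPORT,
}


def place_options(names: List[str]) -> Tuple[List[str], List[str]]:
    """Split options into (per_fragment, per_datagram) by designated scope.

    Network options go in the per-fragment header; transport options go in
    the surplus area of the reassembled datagram. The two are never mixed.
    """
    per_fragment: List[str] = []
    per_datagram: List[str] = []
    for name in names:
        scope = OPTION_SCOPE.get(name)
        if scope is None:
            raise KeyError(f"option {name!r} has no designated scope")
        if scope is Scope.NETWORK:
            per_fragment.append(name)
        else:
            per_datagram.append(name)
    return per_fragment, per_datagram


print(place_options(["EXAMPLE_E2E_TIMESTAMP", "EXAMPLE_NET_SIGNAL"]))
```

An option with no designation is rejected outright here, which is one
way to see why this scheme sits awkwardly with the framework concept.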

> This problem could be solved if the UDP surplus was allowed to carry
> other protocols than UDP Options and allowed to carry multiple such
> protocols in a chain. This could be accomplished with a type field in
> a fixed header in the surplus area, and using extension headers as the
> valid types. Network options could be contained in their own extension
> header and so intermediate nodes only need to consider that and can
> ignore everything else in the surplus space (to simplify this, the
> network options extension header would be required to be the first
> header in the surplus area). The basic extension header types defined
> in RFC8200 could even be used in this manner with little change: HBH
> Options to carry network options, Frag header would work for
> fragmentation, Destination Options could be used to carry other UDP
> Options, etc.

I'm not so sure that I would want to go so far as replicating the
IPv6 extension header "salami slices" -- that, after all, has not
proven to be very successful in the real world. But changes to the
design of the FRAG option to make this idea work more smoothly may
well be worth considering. For example, we could change "SHOULD come
as early as possible in the UDP options list" to a MUST. Other, more
substantive changes come to mind, but I won't dwell on them here. What
I'd rather do is ask the WG to consider whether UDP options should, or
should not, cater to the proposals that seek to use "UDP as the new IP"
because IPv6 Hop-by-Hop options don't work in the real world.

Thanks,

Mike Heard