Re: [tsvwg] I-D Action: draft-ietf-tsvwg-udp-options-13.txt

Joseph Touch <touch@strayalpha.com> Thu, 08 July 2021 05:25 UTC

From: Joseph Touch <touch@strayalpha.com>
In-Reply-To: <CACL_3VHC55cdu96=5OuNKmaaXrvDY5wkYid9a+j6=VtQrvJhZg@mail.gmail.com>
Date: Wed, 07 Jul 2021 22:25:30 -0700
Cc: TSVWG <tsvwg@ietf.org>
Message-Id: <5086F1C2-55C9-4BD4-BB80-9C247E379204@strayalpha.com>
References: <162408795080.21706.5548660195641640175@ietfa.amsl.com> <C2C396E7-B728-496E-841B-D9F64004D3E3@strayalpha.com> <CACL_3VHC55cdu96=5OuNKmaaXrvDY5wkYid9a+j6=VtQrvJhZg@mail.gmail.com>
To: "C. M. Heard" <heard@pobox.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/ST4ww9qr1VbVMqFrnOaNvm2zXP0>
Subject: Re: [tsvwg] I-D Action: draft-ietf-tsvwg-udp-options-13.txt
Precedence: list

Hi, Mike,

> On Jun 26, 2021, at 10:13 PM, C. M. Heard <heard@pobox.com> wrote:
> 
> On Sat, Jun 19, 2021 at 12:39 AM Joseph Touch wrote:
> Here’s a summary - I tried to catch both everything Mike provided feedback on as well as what I have proposed as a way forward.
> 
> This is much appreciated, thank you. My item-by-item responses follow.
> 
> AFAICT, there are also two things not yet addressed, in addition to ongoing debate on the changes below:
> How many frags MUST be supported? 2? 4?
> My position is still that the best way forward for general adoption of the spec is not to require anything beyond the ability to accept a datagram consisting of a single UDP fragment (one that is both initial and terminal). I provide a brief rationale in https://mailarchive.ietf.org/arch/msg/tsvwg/dUOTDlF4x6sVijtNGhhO5JJx7RA/.

I provided a rationale in my reply to your other email just now; basically, IPv6 already requires reassembly of 1280B fragments to at least 1500B, so we can too. I don’t think it’s too much to add that capability to UDP, given it’s a requirement of another stateless protocol that even IoT devices need to support (to use IPv6).

> Should we change UNSAFE from a single code point to a range (but not a flag), and if so, what range of codepoints is sufficient (32? 16?)?
>  
> I can live with the current approach, but I would much prefer a range of 32 codepoints. If the current approach is retained, I would like to suggest making the description of ENCR a separate subsection so that it appears in the table of contents (it would also be good to get it in the master list of options in Section 5).

I’ll wait to see how the WG feels (anyone else?). At a minimum, yes - ENCR can be in a separate subsection (I’ll add that to my list of pending updates).

> Frag
> Drops integrated checksum
> Includes a start pointer so all per-frag options can come before the frag data
> Terminal includes an endpointer so options can come after
> This new format means that per-frag options always come before all FRAG data, which enables zero-copy
>  
> The specification as it stands in -13 will work, but I am not convinced that it is the best solution we could have.
> 
> 1. One of the stated motivations is to support zero-copy reassembly, by allowing a NIC to DMA the fragment data to host memory. This implies that the NIC in question would need to be able to do header-data split. However, in -13 the combination of OCS + FRAG that precedes the fragment data is of variable length, being different for terminal and non-terminal fragments. Not all NICs implement header-data split for variable-length headers (see, e.g., pp. 8-10 of https://netdevconf.info/0x14/pub/slides/62/Implementing%20TCP%20RX%20zero%20copy.pdf). Moreover, I am not convinced that this is an important use case; the conventional format, not using fragments at all, serves the data streaming applications that Joe has in mind just as well.

I appreciate that zero-copy systems may need to adapt. However, if they can’t handle variable length headers, they can’t do zero-copy in most cases either (TCP already has this ‘feature’).

> 2. I question the need to allow for every option to be specified as being allowed both per-fragment and per-datagram; most apply to one or the other (https://mailarchive.ietf.org/arch/msg/tsvwg/LF9rUkkqkz9gNqya36SW39XgliE/), and thus I question the need to have options both before and after the payload data in a terminal fragment.

I disagree; I don’t see a need to indicate how options are used, given they are user (application) controlled anyway. The situation is distinct from IP, in which the headers are copied. In UDP, the use of options per-fragment is not “copied” from the original; they are added on a per-fragment basis.

Further, we already have options in both places. For FRAGs, options are before the fragment - that’s the nature of how the user data fragment is hidden so it is not visible to legacy receivers. However, once (safely) reassembled, they can - and IMO should - appear exactly as they would if fragmentation had not occurred. In that case, UDP options would *follow* the datagram.

IMO, that’s why the pre-fragment and post-datagram format is not only appropriate, but also the best way to match how the system behaves in the absence of fragments.
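
As a rough illustration of that parallelism, here is a minimal Python sketch. It is not the draft's wire format: the Fragment class, its field names, and the reassemble() helper are hypothetical, and real reassembly would also need identification matching, timeouts, and size limits. It only shows per-fragment options staying with each fragment while the options carried after the terminal fragment's data end up following the reassembled datagram.

```python
from dataclasses import dataclass, field

@dataclass
class Fragment:
    # Hypothetical in-memory view of one received FRAG packet (not the wire format).
    offset: int                      # byte offset of this fragment's data in the datagram
    data: bytes                      # fragment payload
    terminal: bool = False           # True for the last fragment
    per_frag_options: list = field(default_factory=list)  # options placed before the fragment data
    post_options: list = field(default_factory=list)      # options after the data (terminal only)

def reassemble(fragments):
    """Return (user_data, per_datagram_options), or None if incomplete."""
    frags = sorted(fragments, key=lambda f: f.offset)
    if not frags or not frags[-1].terminal:
        return None
    data = bytearray()
    for f in frags:
        if f.offset != len(data):    # gap or overlap: treat as incomplete
            return None
        data += f.data
    # The result looks like an unfragmented datagram: user data first,
    # then the options that followed the terminal fragment's data.
    return bytes(data), list(frags[-1].post_options)
```

Per-fragment options (e.g., a per-fragment OCS) are consumed while each fragment is processed and never reach the reassembled view.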

> This is actually a relatively new feature, having been introduced (implicitly) in -09 without discussion on the list;

Actually, it goes back further than that; we only described it explicitly since -09. It has been an artifact of option processing ever since the introduction of the combined FRAG/LITE approach (which has been just FRAG since -09).

> in previous versions (-08 and earlier), options other than OCS were ignored except in the terminal fragment and were applied to the reassembled packet. At a minimum, this deserves a thorough discussion by the WG.

In previous versions, it was only a SHOULD that only OCS could precede FRAG/LITE.

> 3. By sandwiching the payload data in between other options, the FRAG format in -13 complicates generic checksum offload for encapsulated protocols. Granted that this is relevant only for the case of a self-contained fragment that is both initial and terminal fragment; but that case is important for enabling unsafe options.

I don’t see how this affects checksum offload, even for a self-contained fragment. OCS in each frag always goes to the end of the packet. In all FRAGs, UDP checksum stops at the UDP header. 
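
A small sketch of the coverage described above, assuming byte offsets are measured from the start of the UDP header and that the surplus area is everything between the UDP Length and the end of the IP payload. The function name is hypothetical, and the UDP pseudo-header (which the UDP checksum also covers) is left out for brevity.

```python
UDP_HEADER_LEN = 8

def checksum_coverage(ip_payload_len, udp_length):
    """Return ((udp_start, udp_end), (ocs_start, ocs_end)).

    Offsets are relative to the start of the UDP header; end offsets are exclusive.
    """
    if not UDP_HEADER_LEN <= udp_length <= ip_payload_len:
        raise ValueError("inconsistent lengths")
    udp_range = (0, udp_length)               # UDP header plus any user data
    ocs_range = (udp_length, ip_payload_len)  # the entire surplus area
    return udp_range, ocs_range

# A FRAG packet carries no user data, so UDP Length is 8 and the two ranges never interleave.
frag_udp, frag_ocs = checksum_coverage(ip_payload_len=1240, udp_length=8)
```

In this picture the fragment data sits inside the OCS range only, so the UDP checksum region is the same fixed 8 bytes for every fragment.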

> Given the above, I continue to advocate for the format proposed in https://mailarchive.ietf.org/arch/msg/tsvwg/Wv--BLVMPAX6g5umok9BQsXAEGg/, with the change of including a fixed length of 8 and using two codepoints to distinguish terminal and non-terminal fragments.

The only way both options could be 8 is to remove the possibility of post-frag options, or to somehow indicate the delineation between pre-frag and post-frag options some other way. I see no good reason to create a solution to this problem, vs. using the parallelism that a reassembled UDP datagram with options appears *exactly* as it would without reassembly, i.e., with UDP options after the user data.

> Finally, there is a lingering editorial matter: the following paragraph crept in during the change from -08 to -09, and it should be removed because (a) the first sentence is wrong and (b) the rest is redundant with the 2nd paragraph of the section.
> 
>    The Fragmentation option (FRAG) combines properties of IP
>    fragmentation and the UDP Lite transport protocol [RFC3828]. FRAG
>    provides transport-layer fragmentation and reassembly in which each
>    fragment includes a copy of the same UDP transport ports, enabling
>    the fragments to traverse Network Address (and port) Translation
>    (NAT) devices, in contrast to the behavior of IP fragments. FRAG
>    also allows the UDP checksum to cover only a prefix of the UDP data
>    payload, to avoid repeated checksums of data prior to reassembly.

That paragraph should have been removed; I missed it in this pass. I’ll add it to the pending updates.

> 
> OCS
> Uses the standard 2-byte prefix (all but NOP and EOL do)
> Added discussion of RFC 6935 regarding exception to requiring the UDP checksum and thus OCS
> Allow OCS’s checksum to be precomputed, but still check in the order options occur
> Occurs over the entire surplus area (doesn’t stop at EOL)
> OK on most points, but I strongly object to the following when OCS is mandatory (as is the case when UDP CS <> 0):
> 
>    Note that a receiver can compute the OCS checksum before processing
>    any UDP options, but that computation would assume OCS is used and
>    would not be verified until the OCS option is interpreted.
> 
> If I perform the computation before processing any options, and OCS is mandatory, it would be foolish of me to waste any more effort seeking the OCS option in the TLV chain. I already know that the option area is invalid,

How exactly do you know this? If OCS is mandatory, you need to find it (seek it in the TLV chain) in order to know whether the value you compute differs from the value in the option.

Performing the computation early just allows offloading if desired; the value computed still needs to be checked.

> and I should not try to parse potentially invalid TLVs. It is both unsafe and a waste of cycles. 

You can’t know when OCS is wrong until you find it in the TLV.
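
A minimal sketch of that ordering, under stated assumptions: the Kind values (EOL=0, NOP=1, OCS=2) and the 4-byte OCS length are assumed here, the sum covers only the surplus bytes (no length pseudo-header), the OCS is assumed to start at an even offset, and the extended length format is ignored. The function names are hypothetical. The sum can be computed up front (or offloaded), but its result is only applied once the TLV walk shows whether an OCS is actually present.

```python
KIND_EOL, KIND_NOP, KIND_OCS = 0, 1, 2   # assumed Kind values, for illustration only

def ones_complement_sum(data):
    """16-bit ones-complement sum, as used by the Internet checksum."""
    data = bytes(data)
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total

def find_ocs(surplus):
    """Walk the TLV chain; return True if a 4-byte OCS option is found."""
    i = 0
    while i < len(surplus):
        kind = surplus[i]
        if kind == KIND_EOL:
            return False
        if kind == KIND_NOP:
            i += 1
            continue
        if i + 1 >= len(surplus):
            return False
        length = surplus[i + 1]
        if length < 2 or i + length > len(surplus):
            return False
        if kind == KIND_OCS:
            return length == 4
        i += length
    return False

def options_acceptable(surplus, udp_checksum_zero):
    # Step 1: the sum may be computed early; a valid area with OCS sums to 0xFFFF.
    precomputed_ok = ones_complement_sum(surplus) == 0xFFFF
    # Step 2: the result only means something once OCS is located.
    if find_ocs(surplus):
        return precomputed_ok
    # OCS absent: acceptable only where OCS is optional (UDP checksum of zero).
    return udp_checksum_zero
```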

> Additionally, the following paragraph, while an improvement over the -12 version, is still not good enough when OCS is optional:
> 
>    >> When present, the OCS SHOULD occur as early as possible, preceded
>    by only NOP options for alignment.
> 
> When UDP CS == 0, either that SHOULD needs to be changed to a MUST, or the receiver needs to be excused from bothering to check OCS. I must once again question the wisdom of ever parsing a list of TLVs that are not covered by a checksum (and I think IPv6 made quite a big mistake to do so with its options).

I don’t see why SHOULD isn’t sufficient. First, it’s obvious that there could be utility in NOPs preceding OCS. Second, we don’t know what other options a user or implementer might want to occur first. You have to parse the TLV one way or the other to find OCS - even if it’s first. If it’s not, you still either find it or you don’t. Yes, earlier is better - but not MUST better. I see nothing that breaks if OCS is not first, so there is no rationale for forcing this.

> I'd like to bring up another point that came to my attention during an off-list discussion with Tom Herbert, namely that optional checksums save no work for an implementation that has generic checksum offload; in fact, they create more work by creating more special cases. He makes the point -- convincingly, as far as I am concerned -- that optional checksums are largely a relic of the past. My preference would be to see support for UDP CS == 0 be an option that consenting endpoints may use but that are not required to be implemented.

I don’t understand what that would mean, but this argument appears to ignore the most important current use case - tunnels. Tunnel systems do not necessarily rely on offloading hardware because the packet processing occurs after the packet has already passed the interface, in many cases.

> Finally, the next-to-last paragraph still has a typo:
> 
>    As a reminder, use of the UDP checksum is optional when the UDP
>    checksum is zero. When not used, the OCS is assumed to be "correct"
>    for the purpose of accepting UDP packets at a receiver (see Section
>    7).
> 
> It should say:
> 
>    As a reminder, use of the OCS is optional when the UDP
>    checksum is zero. When not used, the OCS is assumed to be "correct"
>    for the purpose of accepting UDP packets at a receiver (see Section
>    7).

Yes, will fix.

> NOP
> Increased max in a row from three to seven
> 
> OK
> 
> AE
> Split into AUTH (safe) and ENCR (UNSAFE)
> AUTH can depend on option data
> ENCR can depend on but not modify option data
> Only one is ever used at a time, though (that’s one reason it was presented as AE before)
> 
> OK modulo editorial comment above about ENCR
> 
> EOL
> MUST set as zero on transmit, MAY check on receipt, but MUST ignore otherwise.
> post-EOL always were covered by OCS and still are
> 
> OK
> 
> UDP Length vs extended length
> Always use the smallest format, as Mike suggested
> 
> OK with one question. The spec now says:
> 
>    >> Options using the extended option format MUST indicate extended
>    lengths of 255 or higher; smaller extended length values MUST be
>    treated as an error.
> 
> What do I do if someone sends me an option that has the extended format but with a length between 4 and 254? Do I skip over just the offending option, or do I drop the entire option area?

It might be easier to say SHOULD use the smallest format (there is clearly no choice but to use the larger format for 255 and larger), but there’s no good reason now to prohibit these values. I think the only error is if the extended length is below 4, which is an invalid length. Any time an invalid length is seen, option processing should cease, as per the quoted text below (from Sec 4).

> Option length errors
> Corrected to fail only on nonsensical values, otherwise skip as unknown
> 
> Section 4 still says:
> 
>    >> Option Lengths (or Extended Lengths, where applicable) smaller
>    than the minimum for the corresponding Kind and default format MUST
>    be treated as an error. Such errors call into question the remainder
>    of the option area and thus MUST result in all UDP options being
>    silently discarded.
> 
> In order to be consistent, the words "Kind and default" should be removed from the first sentence.

Yes for default; I think Kind is still useful to retain. I.e., if you know the Kind and the length given is too small, then stop.

> Also: would it be appropriate to make an exception for FRAG? Normally the whole option area is discarded if a FRAG option is seen when there is conventional UDP data; does the rule for wrong length override that? Note that FRAG is an unsafe option; if it weren't in the base specification, we would insist that it be dropped if unrecognized.

I don’t see an exception for FRAG. It would mean the entire option area would be discarded. Given FRAG only appears (now) when there is no user data, that means the packet is basically discarded.

> Also for OCS; the assumption that it is based on the Internet checksum is pretty much baked in; if I see something with a different length, I don't know how to validate the option checksum, and so I would be inclined to drop the options.

Yes. Same issue, same result AFAICT.
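
Putting the length rules from this part of the thread together, a hedged sketch of an option walk might look like the following. Assumptions: a length byte of 255 signals the extended format with a 2-byte extended length that counts the 4-byte header; extended lengths of 4 through 254 are tolerated rather than rejected; and the MIN_LEN table (OCS=4, ACS=6, keyed by assumed Kind values 2 and 3) is illustrative only. walk_options() is a hypothetical name. Any nonsensical length makes the function return None, meaning all options are silently discarded.

```python
EXTENDED = 255               # length byte value that signals the extended format
KIND_EOL, KIND_NOP = 0, 1
MIN_LEN = {2: 4, 3: 6}       # assumed per-Kind minimums (OCS, ACS); illustrative only

def walk_options(area):
    """Return a list of (kind, value_bytes), or None if the whole option area is discarded."""
    out, i = [], 0
    while i < len(area):
        kind = area[i]
        if kind == KIND_EOL:
            break
        if kind == KIND_NOP:
            i += 1
            continue
        if i + 2 > len(area):
            return None
        length, hdr = area[i + 1], 2
        if length == EXTENDED:
            if i + 4 > len(area):
                return None
            length, hdr = int.from_bytes(area[i + 2:i + 4], "big"), 4
        # Too short for its own header, too short for a known Kind, or past the end:
        # the rest of the area cannot be trusted, so stop and discard everything.
        if length < hdr or length < MIN_LEN.get(kind, 0) or i + length > len(area):
            return None
        out.append((kind, bytes(area[i + hdr:i + length])))   # unknown Kinds are simply carried along
        i += length
    return out
```

Unknown Kinds with sensible lengths are skipped over rather than treated as errors, which matches the "fail only on nonsensical values" intent.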

> ACS
> Silently ignored if failed except if configured otherwise
> Unknown lengths treated same as bad checksum
> 
> OK
>  
> EXP
> Showed extended length format
> 
> OK
>  
> UNSAFE
> Noted extended length format also applies
> Cannot modify option data
> 
> OK modulo editorial comment about ENCR above, but I would still prefer to have a range of values.
>  
> UDP-lite
> Removed implied equivalence to FRAG, but retained remainder as useful context
> 
> OK
> 
> Thanks and regards,
> 
> Mike Heard
