Re: [IPv6] [EXTERNAL] Re: FW: I-D Action: draft-templin-6man-ipid-ext-00.txt

Tom Herbert <tom@herbertland.com> Tue, 12 December 2023 19:38 UTC

From: Tom Herbert <tom@herbertland.com>
Date: Tue, 12 Dec 2023 11:38:14 -0800
Message-ID: <CALx6S34W8=b_fCdn_-h+U9JBnJngVkzO0H+GZNuH9yssfkiKPA@mail.gmail.com>
To: "Templin (US), Fred L" <Fred.L.Templin@boeing.com>
Cc: Christian Huitema <huitema@huitema.net>, IPv6 List <ipv6@ietf.org>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Archived-At: <https://mailarchive.ietf.org/arch/msg/ipv6/rop2c9Qj6VrTvQ0fZUzkeMkDxIo>
Subject: Re: [IPv6] [EXTERNAL] Re: FW: I-D Action: draft-templin-6man-ipid-ext-00.txt

On Tue, Dec 12, 2023 at 11:01 AM Templin (US), Fred L
<Fred.L.Templin@boeing.com> wrote:
>
> Tom, I am basing my statements off of the following text in RFC8200:
>
>       "The Per-Fragment headers must consist of the IPv6 header plus any
>       extension headers that must be processed by nodes en route to the
>       destination, that is, all headers up to and including the Routing
>       header if present, else the Hop-by-Hop Options header if present,
>       else no extension headers.
>
>       The Extension headers are all other extension headers that are not
>       included in the Per-Fragment headers part of the packet.  For this
>       purpose, the Encapsulating Security Payload (ESP) is not
>       considered an extension header.  The Upper-Layer header is the
>       first upper-layer header that is not an IPv6 extension header.
>       Examples of upper-layer headers include TCP, UDP, IPv4, IPv6,
>       ICMPv6, and as noted ESP.
>
>       The Fragmentable Part consists of the rest of the packet after the
>       upper-layer header or after any header (i.e., initial IPv6 header
>       or extension header) that contains a Next Header value of No Next
>       Header."
>
> This teaches us that the Per-Fragment headers include "all headers up to and
> including the Routing Header" - it does not permit us to include a Destination
> Options header following the Routing Header as a Per-Fragment header. Or,

Fred,

That omits the fact that the Fragment Header itself is a Per-Fragment
Header also. If fragmentation is defined in DestOpt then a DestOpt
header with one option is effectively being substituted for the
Fragment Header-- so the spirit of the law is maintained.
Alternatively, a new extension header could be explicitly defined for
Extended Fragment Header, but I suspect the bar for defining a new
extension header is going to be higher than the bar for a new
Destination Option.
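
As a rough illustration of the RFC 8200 rule quoted above, here is a minimal sketch that splits an extension-header chain into the Per-Fragment headers and everything else. The function name and the list-of-protocol-numbers representation are assumptions made for this example; only the header numbers themselves are the assigned values.

```python
# Assigned protocol numbers for the IPv6 extension headers used here.
HOP_BY_HOP = 0
ROUTING = 43
FRAGMENT = 44
DEST_OPTS = 60


def split_per_fragment(chain):
    """Split a list of extension-header numbers into
    (per_fragment, remaining) following the quoted RFC 8200 text:
    all headers up to and including the Routing header if present,
    else the Hop-by-Hop Options header if present, else none."""
    if ROUTING in chain:
        cut = chain.index(ROUTING) + 1
    elif HOP_BY_HOP in chain:
        cut = chain.index(HOP_BY_HOP) + 1
    else:
        cut = 0
    return chain[:cut], chain[cut:]


# A DestOpts header before the Routing header is Per-Fragment;
# a DestOpts header after it is not, under the literal RFC text.
per_fragment, remaining = split_per_fragment(
    [HOP_BY_HOP, DEST_OPTS, ROUTING, DEST_OPTS]
)
```

Under this reading the Fragment header (or a DestOpts header standing in for it, as suggested above) is what the source places immediately after `per_fragment` in every fragment.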

> if we wanted to do that, we would have to update the above text of RFC8200;
> is that what you are suggesting? Even if we did the update, though, middleboxes
> that inspect per-fragment headers would need to learn that a Destination
> Options header could appear after the Routing Header, and it seems like
> that would be more prone to middlebox filtering than the method of
> including the Destination Option before the Routing Header ?
>
> OR, maybe what you are saying is that, since the Fragment Header will
> not be present, middleboxes will see each fragment as though it were
> a whole packet (with the Destination Options header that has the
> Extended Fragment Header appearing where the Fragment Header
> would normally appear but with Next Header set to 59). Is that what
> you are saying? If so, I think we can simply say that the Destination
> Options header is inserted in the same location that the Fragment
> Header is normally inserted and just leave it at that - what do you think?
>

This is defining an end-to-end protocol, so middlebox considerations
aren't particularly relevant IMO. Setting Next Header to 59 ensures
they won't misinterpret a fragment, and as long as the protocol spec is
clear they should be able to parse the Fragment Header option if they
want.
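
To make the Next Header 59 point concrete, here is a minimal sketch of how a device might walk the extension-header chain of such a fragment. The option type and the 12-octet option body are placeholders invented for this example (the draft's actual option format is not reproduced here); the header numbers and the Hdr Ext Len encoding follow RFC 8200.

```python
HOP_BY_HOP, ROUTING, FRAGMENT, NO_NEXT_HEADER, DEST_OPTS = 0, 43, 44, 59, 60
VARIABLE_LENGTH = {HOP_BY_HOP, ROUTING, DEST_OPTS}


def walk_chain(first_next_header, data):
    """Follow Next Header values through `data` (the bytes after the fixed
    IPv6 header). Returns (types_seen, offset_after_last_extension_header)."""
    seen, nh, off = [], first_next_header, 0
    while nh in VARIABLE_LENGTH or nh == FRAGMENT:
        if off + 8 > len(data):
            raise ValueError("truncated extension header")
        # The Fragment header is a fixed 8 octets; the others carry
        # Hdr Ext Len in 8-octet units, not counting the first 8 octets.
        size = 8 if nh == FRAGMENT else (data[off + 1] + 1) * 8
        if off + size > len(data):
            raise ValueError("truncated extension header")
        seen.append(nh)
        nh, off = data[off], off + size
    seen.append(nh)  # first value that is not a walkable extension header
    return seen, off


# Hypothetical fragment: a Routing header, then a DestOpts header whose
# single option stands in for the Extended Fragment Header option.
PLACEHOLDER_OPT_TYPE = 0x1E  # placeholder only, not a value assigned to this option
routing = bytes([DEST_OPTS, 0, 0, 0]) + bytes(4)        # 8 octets, contents irrelevant here
option = bytes([PLACEHOLDER_OPT_TYPE, 12]) + bytes(12)  # type, length, opaque body
dest_opts = bytes([NO_NEXT_HEADER, 1]) + option         # 16 octets, Next Header = 59
types, offset = walk_chain(ROUTING, routing + dest_opts + b"fragment data")
```

The walk ends at 59 (No Next Header), so a parser that does not understand the option has nothing after the DestOpts header to misread as an upper-layer header; one that does understand it can still read the option body.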

Tom


> Fred
>
> > -----Original Message-----
> > From: Tom Herbert <tom@herbertland.com>
> > Sent: Tuesday, December 12, 2023 10:36 AM
> > To: Templin (US), Fred L <Fred.L.Templin@boeing.com>
> > Cc: Christian Huitema <huitema@huitema.net>; IPv6 List <ipv6@ietf.org>
> > Subject: [EXTERNAL] Re: [IPv6] FW: I-D Action: draft-templin-6man-ipid-ext-00.txt
> >
> > On Tue, Dec 12, 2023 at 10:21 AM Templin (US), Fred L
> > <Fred.L.Templin@boeing.com> wrote:
> > >
> > > Tom, I want the  Extended Fragment Header to appear in the Per-Fragment headers in case
> > > a network intermediate system (e.g., a router, a bridge, a packet filter, etc.) needs to inspect
> > > the Identification value.
> >
> > Fred,
> >
> > That's not a protocol requirement for correct operation. Besides those
> > devices can do DPI to get that information like they already do to get
> > transport port numbers or the fragment identification out of the Frag
> > Header.
> >
> > > It also needs to be in each fragment to support the reassembly
> > > process at the destination.
> > >
> > > So, I want this option to be part of the Per-Fragment headers. This allows the Extension
> > > Headers and Upper Layer header to appear only in the first fragment, with the Fragmentable
> > > Part of the packet appearing after that. But, the Extended Fragment Header needs to appear
> > > in each fragment - not just the first fragment.
> >
> > Placing the Extended Fragment Header in DestOpts after the Routing
> > Header would mean that it appears in each fragment. This might
> > motivate an additional DestOpts that sits at the Frag Header position in
> > the recommended EH order. So we might have DestOpts before RH,
> > DestOpts after RH with the fragment option, and DestOpts after RH and
> > after the DestOpts with the fragment option.
> >
> > Tom
> >
> > >
> > > Fred
> > >
> > > > -----Original Message-----
> > > > From: Tom Herbert <tom@herbertland.com>
> > > > Sent: Tuesday, December 12, 2023 10:12 AM
> > > > To: Templin (US), Fred L <Fred.L.Templin@boeing.com>
> > > > Cc: Christian Huitema <huitema@huitema.net>; IPv6 List <ipv6@ietf.org>
> > > > Subject: Re: [IPv6] FW: I-D Action: draft-templin-6man-ipid-ext-00.txt
> > > >
> > > > On Tue, Dec 12, 2023 at 9:51 AM Templin (US), Fred L
> > > > <Fred.L.Templin@boeing.com> wrote:
> > > > >
> > > > > Tom, thank you for these. I am currently pretty deep into a draft revision that addresses Christian's
> > > > > points and should also address several of your points. One question though:
> > > > >
> > > > > > I don't see how this can work if a Routing Header is present. If the
> > > > > > fragment option is in DestOpts before the Routing Header that implies
> > > > > > that each segment would need to do reassembly since only the
> > > > > > reassembled packet contains the Routing Header. It's actually worse
> > > > > > than that, because the first hop would reassemble the packet and then
> > > > > > try to forward the packet, but the reassembled packet is likely to
> > > > > > exceed the MTU so the packet can't be forwarded and the node can't
> > > > > > fragment the packet again because it's not the source of the packet.
> > > > > > Fragmentation really needs to be done after the Routing Header which
> > > > > > is the recommended ordering in RFC8200 for Frag header.
> > > > >
> > > > > No, the Routing Header would still be part of the Per-Fragment headers even though
> > > > > it appears sequentially after the Extended Fragment Header. The Fragmentable Part
> > > > > of the packet still begins after the Routing Header; not before. Perhaps some words
> > > > > along these lines in the draft would help clarify?
> > > >
> > > > Fred,
> > > >
> > > > That would mean that a host would need to process Destination Options
> > > > and see the fragment option, remember it, and continue processing the
> > > > extension headers. If there is a routing header and Segments Left
> > > > equals zero then reassembly can happen. AFAIK, this split processing
> > > > approach is inconsistent with all other EH processing.
> > > >
> > > > Before adding words, can you explain why the fragment option needs to
> > > > be before the routing header? In IPv6 only the source host can
> > > > fragment, and in both IPv4 or IPv6 only the final destination can
> > > > reassemble the packet. Fragmentation is an end to end function,
> > > > neither routers nor intermediate destinations in the router list of
> > > > Routing Header have any protocol requirements to access the fragment.
> > > > I don't see the reason to deviate from the ordering of Frag header
> > > > recommended in RFC8200.
> > > >
> > > > Tom
> > > >
> > > > >
> > > > > Fred
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Tom Herbert <tom@herbertland.com>
> > > > > > Sent: Tuesday, December 12, 2023 9:31 AM
> > > > > > To: Templin (US), Fred L <Fred.L.Templin@boeing.com>
> > > > > > Cc: Christian Huitema <huitema@huitema.net>; IPv6 List <ipv6@ietf.org>
> > > > > > Subject: Re: [IPv6] FW: I-D Action: draft-templin-6man-ipid-ext-00.txt
> > > > > >
> > > > > > Hi Fred,
> > > > > >
> > > > > > Here are some comments on the latest draft.
> > > > > >
> > > > > > Both the abstract and introduction lead by discussing IPv4
> > > > > > fragmentation. Should only be talking about IPv6 here.
> > > > > >
> > > > > > The definitions of "source" and "destination" are confusing and not
> > > > > > typical. Typically, a source host is the source of an IP packet and
> > > > > > identified by the source address, a destination host is the
> > > > > > destination of an IP packet and is addressed by the destination
> > > > > > address. In the presence of a routing header the "final destination"
> > > > > > is the last address in the route list.
> > > > > >
> > > > > > "router" is the common term for "intermediate systems". I suggest just
> > > > > > using "router" instead.
> > > > > >
> > > > > > "Upper layer protocols often achieve greater performance by
> > > > > > configuring segment sizes that exceed the path Maximum Transmission
> > > > > > Unit (MTU)." I am not at all convinced that this is true, especially
> > > > > > for TCP which has been optimized both in the protocol and
> > > > > > implementation for segment size to equal Path MTU. In any case, I
> > > > > > think this is unnecessary discussion in the draft-- the motivation for
> > > > > > increasing the size of the fragment identifier is that the identifier
> > > > > > is too small for high speed networks. Quantifying "too small" would be
> > > > > > good here: 16 bits in IPv4 is obviously a problem, but what are the
> > > > > > conditions for which 32 bits IPv6 identification is too small?
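
One way to quantify "too small" is to estimate how long it takes the Identification space to wrap and compare that with the 60-second reassembly timeout in RFC 8200. This is a rough sketch under simplifying assumptions that are not in the thread: fixed-size packets, one Identification value consumed per packet, and a single source sending at line rate to a single destination.

```python
def id_wrap_seconds(id_bits, packet_bytes, link_bps):
    """Seconds until an Identification field of `id_bits` bits wraps,
    assuming one value is used per packet of `packet_bytes` bytes
    sent back to back at `link_bps` bits per second."""
    return (2 ** id_bits) * packet_bytes * 8 / link_bps


REASSEMBLY_TIMEOUT_S = 60  # RFC 8200 reassembly timer

for bits, rate in ((16, 1e9), (32, 1e11), (32, 1e12)):
    wrap = id_wrap_seconds(bits, 1500, rate)
    risky = wrap < REASSEMBLY_TIMEOUT_S
    print(f"{bits}-bit ID, 1500 B packets, {rate:.0e} b/s: "
          f"wraps in {wrap:.3f} s, reuse within timeout: {risky}")
```

With these assumptions a 16-bit Identification wraps in under a second at 1 Gb/s, while a 32-bit one stays above the timeout at 100 Gb/s and only falls below it around 1 Tb/s.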
> > > > > >
> > > > > > "Index/P/S             a control octet that identifies the components
> > > > > > of an IP Parcel [I-D.templin-intarea-parcels]"
> > > > > >
> > > > > > This creates a dependency on a much larger draft. I suggest just
> > > > > > reserve these bits and define them in IP parcels as an update to this
> > > > > > draft.
> > > > > >
> > > > > > "The Extended Fragment Header is included in a Per-Fragment
> > > > > > Destination Options Header following the Hop-by-Hop Options (if
> > > > > > present) but before the Routing Header (if present)"
> > > > > >
> > > > > > I don't see how this can work if a Routing Header is present. If the
> > > > > > fragment option is in DestOpts before the Routing Header that implies
> > > > > > that each segment would need to do reassembly since only the
> > > > > > reassembled packet contains the Routing Header. It's actually worse
> > > > > > than that, because the first hop would reassemble the packet and then
> > > > > > try to forward the packet, but the reassembled packet is likely to
> > > > > > exceed the MTU so the packet can't be forwarded and the node can't
> > > > > > fragment the packet again because it's not the source of the packet.
> > > > > > Fragmentation really needs to be done after the Routing Header which
> > > > > > is the recommended ordering in RFC8200 for Frag header.
> > > > > >
> > > > > > Congestion and packet loss management, fragment retransmission,
> > > > > > capabilities negotiation suggested by probing, and fragment
> > > > > > acknowledgments all fall under the auspices of the transport layer. If
> > > > > > we're introducing these in the network layer then I think there needs
> > > > > > to be more depth in the description and consideration of transport
> > > > > > layer requirements.
> > > > > >
> > > > > > As an example, consider the interaction with TCP slow start. When a
> > > > > > host starts sending to a destination is it allowed to immediately send
> > > > > > packets composed of 64 fragments? If it does that, the sender is
> > > > > > basically bypassing the Slow Start and isn't being very TCP friendly.
> > > > > > Even if fragmentation provides some performance benefit to the source
> > > > > > host in this case, it may be getting that benefit at the expense of
> > > > > > others. When we look at the performance of a protocol we really need
> > > > > > to consider the effects on the network as a whole, not just at the
> > > > > > endpoints of communication.
> > > > > >
> > > > > > Also, it's not clear to me what the application is for these transport
> > > > > > layer aspects. For instance, we know running two independent
> > > > > > congestion control loops for the same packet wreaks havoc on the upper
> > > > > > protocol which in this case is TCP (high variances, unnecessary
> > > > > > retransmission, etc.). I don't believe transport layer aspects of
> > > > > > fragmentation are useful with TCP or QUIC, do you have a use case in
> > > > > > mind for these?
> > > > > >
> > > > > > Tom
> > > > > >
> > > > > > On Mon, Dec 11, 2023 at 1:13 PM Templin (US), Fred L
> > > > > > <Fred.L.Templin@boeing.com> wrote:
> > > > > > >
> > > > > > > Tom et al, there have been some significant changes to the draft that bring it more
> > > > > > > in line with both the comments on the list and some of my other writings. I think it
> > > > > > > may be worth another look now if you have time and energy.
> > > > > > >
> > > > > > > Thanks - Fred
> > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > From: ipv6 <ipv6-bounces@ietf.org> On Behalf Of Templin (US), Fred L
> > > > > > > > Sent: Friday, December 08, 2023 11:01 AM
> > > > > > > > To: Tom Herbert <tom@herbertland.com>
> > > > > > > > Cc: Christian Huitema <huitema@huitema.net>; IPv6 List <ipv6@ietf.org>
> > > > > > > > Subject: Re: [IPv6] FW: I-D Action: draft-templin-6man-ipid-ext-00.txt
> > > > > > > >
> > > > > > > > Tom, the service backs off during periods of congestive loss and can resume a more
> > > > > > > > aggressive profile when congestion subsides - the service is therefore adaptive. And,
> > > > > > > > the service is verified to improve performance for TCP and generic UDP as shown in
> > > > > > > > the iperf3 graphs in my Intarea charts. In fact, TCP does best of all.
> > > > > > > >
> > > > > > > > Thank you - Fred
> > > > > > > >
> > > > > > > > > -----Original Message-----
> > > > > > > > > From: Tom Herbert <tom@herbertland.com>
> > > > > > > > > Sent: Friday, December 08, 2023 9:12 AM
> > > > > > > > > To: Templin (US), Fred L <Fred.L.Templin@boeing.com>
> > > > > > > > > Cc: Christian Huitema <huitema@huitema.net>; IPv6 List <ipv6@ietf.org>
> > > > > > > > > Subject: Re: [IPv6] FW: I-D Action: draft-templin-6man-ipid-ext-00.txt
> > > > > > > > >
> > > > > > > > > On Fri, Dec 8, 2023 at 7:37 AM Templin (US), Fred L
> > > > > > > > > <Fred.L.Templin@boeing.com> wrote:
> > > > > > > > > >
> > > > > > > > > > Christian, I am working with the DTN LTP over UDP transport, and what I have found is
> > > > > > > > > > the performance is increased only by increasing the segment size even if that size exceeds
> > > > > > > > > > the path MTU. I have shown performance increases with segment sizes all the way up to
> > > > > > > > > > 64KB even over 1500B path MTUs, and I believe that still larger segment sizes (over paths
> > > > > > > > > > with sufficient MTUs) would do even better. This was also a well-known characteristic of
> > > > > > > > > > NFS over UDP back in the early days, and I believe we will find other transports today that
> > > > > > > > > > would benefit from larger packets.
> > > > > > > > > >
> > > > > > > > > > I have tried many ways to apply the "conventional wisdom" you have expressed to LTP/UDP
> > > > > > > > > > but have seen no appreciable performance increases using those methods. I tried using
> > > > > > > > > > sendmmsg()/recvmmsg() and they did nothing to improve performance. I then implemented
> > > > > > > > > > GSO/GRO and again the performance increase if any was minimal. I even implemented a
> > > > > > > > > > first pass at IP parcels and sent 64KB parcels with ~1500B segments over an OMNI interface
> > > > > > > > > > and that did give some minor performance increase due to the reduction in header
> > > > > > > > > > overhead but nothing within the realm of simply sending larger packets where the
> > > > > > > > > > performance increases were multiplicative.
> > > > > > > > > >
> > > > > > > > > > I object to categorizing this as a transport issue - this is an Internetworking issue where
> > > > > > > > > > large packet sizes currently are not well supported especially when they exceed the path
> > > > > > > > > > MTU. I believe many transports will benefit from using larger packets, and that a robust
> > > > > > > > > > fragmentation and reassembly service is essential for performance maximization in the
> > > > > > > > > > Internet, and my drafts clearly explain why that is so.
> > > > > > > > >
> > > > > > > > > Fred,
> > > > > > > > >
> > > > > > > > > For transport protocols dealing with segments the interaction with
> > > > > > > > > fragmentation can't be ignored. Consider if there is a 1% packet loss
> > > > > > > > > in a path for a flow. If one segment equals one path MTU (no
> > > > > > > > > > fragmentation), then 1% of segments are dropped; if one segment equals
> > > > > > > > > > two MTUs with fragmentation then 2% of the segments are dropped; if
> > > > > > > > > > one segment equals four MTUs then 4% are dropped; if one segment
> > > > > > > > > equals 32 MTUs then 32% of segments are dropped. Dropped segments need
> > > > > > > > > to be retransmitted and those retransmitted segments are subject to
> > > > > > > > > packet loss also so the goodput for the connection can quickly drop
> > > > > > > > > off a cliff when using fragmentation. As I mentioned this is
> > > > > > > > > exacerbated by the fact that the fragments themselves can be the
> > > > > > > > > source of congestion causing packet loss in the network.
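
The percentages above follow the small-loss approximation n * p. A minimal sketch of the exact figure, assuming each fragment is lost independently with probability p (an assumption of this example, since real losses are often correlated):

```python
def segment_loss(p, n_fragments):
    """Probability that a segment is lost when it is carried in
    `n_fragments` fragments and losing any one fragment loses the
    whole segment. Assumes independent per-packet loss probability p."""
    return 1.0 - (1.0 - p) ** n_fragments


def expected_transmissions(p, n_fragments):
    """Mean number of times a segment is sent before it arrives intact,
    assuming retransmissions see the same independent loss rate."""
    return 1.0 / (1.0 - segment_loss(p, n_fragments))


for n in (1, 2, 4, 32):
    print(f"{n:2d} fragments: segment loss {segment_loss(0.01, n):.2%}, "
          f"mean transmissions {expected_transmissions(0.01, n):.3f}")
```

At 1% packet loss the exact value for 32 fragments is about 27.5% rather than 32%, which does not change the conclusion: segment loss grows roughly linearly with the number of fragments per segment until it saturates.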
> > > > > > > > >
> > > > > > > > > I think your argument that fragmentation is essential to the Internet
> > > > > > > > > would be stronger if you can show why packet loss isn't a big problem
> > > > > > > > > for transport protocols that use segments as the unit of congestion
> > > > > > > > > control and retransmission. Also, your focus for analysis seems to be
> > > > > > > > > on LTP, but if you want to make a general argument that fragmentation
> > > > > > > > > is essential for the whole Internet I suggest showing how TCP and QUIC
> > > > > > > > > behave when their segments are fragmented with varying amounts of
> > > > > > > > > packet loss in the path.
> > > > > > > > >
> > > > > > > > > Tom
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Fred
> > > > > > > > > >
> > > > > > > > > > > -----Original Message-----
> > > > > > > > > > > From: Christian Huitema <huitema@huitema.net>
> > > > > > > > > > > Sent: Thursday, December 07, 2023 3:59 PM
> > > > > > > > > > > To: Tom Herbert <tom@herbertland.com>; Templin (US), Fred L <Fred.L.Templin@boeing.com>
> > > > > > > > > > > Cc: IPv6 List <ipv6@ietf.org>
> > > > > > > > > > > Subject: Re: [IPv6] FW: I-D Action: draft-templin-6man-ipid-ext-00.txt
> > > > > > > > > > >
> > > > > > > > > > > On 12/7/2023 11:51 AM, Tom Herbert wrote:
> > > > > > > > > > > > On Thu, Dec 7, 2023 at 7:58 AM Templin (US), Fred L
> > > > > > > > > > > > <Fred.L.Templin=40boeing.com@dmarc.ietf.org>  wrote:
> > > > > > > > > > > >> Tom, to the point on performance:
> > > > > > > > > > > >>
> > > > > > > > > > > >>> Please provide references to these studies. Also, note IP
> > > > > > > > > > > >>> fragmentation is only one possibility, PMTUD and transport layer
> > > > > > > > > > > >>> segmentation is another and that latter seems more prevalent.
> > > > > > > > > > > >> If by transport layer segmentation you mean GSO/GRO, it is not the same thing
> > > > > > > > > > > >> as IP fragmentation at all. GSO/GRO provide a means for the application of the
> > > > > > > > > > > >> source to transfer a block of data containing multiple MTU- or smaller-sized
> > > > > > > > > > > >> segments to the kernel in a single system call, then the kernel breaks the
> > > > > > > > > > > >> segments out into individual packets that are all no larger than the path MTU
> > > > > > > > > > > >> and sends them to the destination. The destination kernel then gathers them
> > > > > > > > > > > >> up and posts them to the local application in a reassembled buffer possibly
> > > > > > > > > > > >> as large as that used by the original source. But, if some packets are lost,
> > > > > > > > > > > >> the destination kernel instead sends up what it has gathered so far which
> > > > > > > > > > > >> may be less than the block used by the original source.
> > > > > > > > > > > >>
> > > > > > > > > > > >> IP fragmentation is very different and operates on a single large transport
> > > > > > > > > > > >> layer segment instead of multiple smaller ones. And, the studies I am referring
> > > > > > > > > > > >> to show that performance was most positively affected by increasing the
> > > > > > > > > > > >> segment size even to larger than the path MTU. I implemented GSO/GRO
> > > > > > > > > > > >> in the ion-dtn LTP/UDP implementation and noted that the performance
> > > > > > > > > > > >> increase I saw was very minor and related to more efficient packaging
> > > > > > > > > > > >> and not a system call bottleneck. Conversely, when I increased the segment
> > > > > > > > > > > >> sizes to larger than the path MTU and intentionally invoked IP fragmentation
> > > > > > > > > > > >> the performance increase was dramatic. You can see this in the charts I
> > > > > > > > > > > >> showed at IETF118 intarea here:
> > > > > > > > > > > >>
> > > > > > > > > > > > >> https://datatracker.ietf.org/meeting/118/materials/slides-118-intarea-identification-extension-for-the-internet-protocol-00
> > > > > > > > > > >
> > > > > > > > > > > I don't doubt your experience, but this is not what we saw with QUIC. In
> > > > > > > > > > > > the early stages of QUIC development, performance was gated by the
> > > > > > > > > > > cost of the UDP socket API. I have benchmarks showing that sendmsg was
> > > > > > > > > > > accounting for 70 to 80% of CPU on sender side. Using GSO was key to
> > > > > > > > > > > lowering that, with one single call to sendmsg for 64K worth of data.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > >> Again, GSO/GRO address performance limitations of the application/kernel
> > > > > > > > > > > >> system call interface which seems to have a positive performance effect for
> > > > > > > > > > > >> some applications. But, IP fragmentation addresses a performance limitation
> > > > > > > > > > > >> of transport layer protocols in allowing the transport protocol to use larger
> > > > > > > > > > > >> segment sizes and therefore have fewer segments to deal with.
> > > > > > > > > > >
> > > > > > > > > > > At the cost of very inefficient error correction, repeating 64K bytes if
> > > > > > > > > > > 1500 bytes are lost. The processing cost of retransmissions with
> > > > > > > > > > > selective acknowledgement is not large, it hardly shows in the flame
> > > > > > > > > > > graphs. Also, the next more important cost after sendmsg/recvmsg is the
> > > > > > > > > > > cost of encryption. If the application had to resend 64KB, it also has
> > > > > > > > > > > to encrypt 64KB again, and that costs more than re-encrypting 1500B.
> > > > > > > > > > > Given that, I am not sure that for QUIC we would see a lower CPU by
> > > > > > > > > > > delegating fragmentation to the IP stack.
> > > > > > > > > > >
> > > > > > > > > > > That does not mean that larger packets would not result in lower CPU
> > > > > > > > > > > load. It would, but only if the larger packet size did not involve
> > > > > > > > > > > fragmentation, reassembly, and the overhead caused by the occasional
> > > > > > > > > > > loss of a fragment.
> > > > > > > > > > >
> > > > > > > > > > > > Hi Fred,
> > > > > > > > > > > >
> > > > > > > > > > > > Fewer segments, but NOT fewer packets. The net amount of work in the
> > > > > > > > > > > > system is unchanged when sending larger segments instead of smaller so
> > > > > > > > > > > > there won't be any material performance differences other than maybe
> > > > > > > > > > > > implementation effects at the host and no effect at routers. Segments
> > > > > > > > > > > > are the unit of congestion management and retransmission in a
> > > > > > > > > > > > transport protocol, but fragments are transparent to the transport
> > > > > > > > > > > > protocol-- this distinction can cause material issues in performance.
> > > > > > > > > > > >
> > > > > > > > > > > > It's pretty easy to see why this is. Consider that the minimum number
> > > > > > > > > > > > of segments for a connection would be to use 64K segments and fragment
> > > > > > > > > > > > them. For a 1500 MTU one segment then would be sent in 43 fragments.
> > > > > > > > > > > > The problem is that if just one fragment is dropped in a segment then
> > > > > > > > > > > > the whole segment is retransmitted. Furthermore, the fragments
> > > > > > > > > > > > themselves are likely to be the cause of the congestion at routers. So
> > > > > > > > > > > > there is a high likelihood of creating congestion in the network and
> > > > > > > > > > > > needing a lot of retransmissions. Even if CWND goes to one, each
> > > > > > > > > > > > connection can still send 43 packets and SACKs don't help because
> > > > > > > > > > > > there's no granularity at 64K segments so congestion control really
> > > > > > > > > > > > wouldn't be effective. The net effect is likely to be very poor TCP
> > > > > > > > > > > > performance.
> > > > > > > > > > >
> > > > > > > > > > > Yes. That's actually a known issue with GSO, and why GSO is typically
> > > > > > > > > > > limited to no more than 64K. If the sender does not implement some form
> > > > > > > > > > > of pacing, the segments will be sent back to back, causing short peaks
> > > > > > > > > > > of traffic that can cause queues to fill up and overflow. But it is
> > > > > > > > > > > difficult to delegate this pacing to the kernel, because the API only
> > > > > > > > > > > expresses the pacing in "milliseconds between packets". Segmentation in
> > > > > > > > > > > the kernel or the drivers would have the same issues.
> > > > > > > > > > >
> > > > > > > > > > > > While I think there might be some incidental positive performance
> > > > > > > > > > > > effects in host implementation by using fragmentation, I really don't
> > > > > > > > > > > > see how it addresses any fundamental performance limitation in a
> > > > > > > > > > > > transport layer protocol like TCP. In fact, I don't see how IP
> > > > > > > > > > > > fragmentation could possibly be better than doing PMTUD with SACKs
> > > > > > > > > > > > especially on the Internet.
> > > > > > > > > > >
> > > > > > > > > > > Yet another issue is that Fred is not the only one with that particular
> > > > > > > > > > > bad idea. The UDP options defined in TSVWG include a
> > > > > > > > > > > > segmentation/fragmentation option that looks very similar. The two bad
> > > > > > > > > > > ideas would probably have to be reconciled in a single bad idea.
> > > > > > > > > > >
> > > > > > > > > > > In any case, Fred is making arguments related to transport, which means
> > > > > > > > > > > this draft ought to be discussed in TSVWG.
> > > > > > > > > > >
> > > > > > > > > > > -- Christian Huitema
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > >
> > > > > > > > --------------------------------------------------------------------
> > > > > > > > IETF IPv6 working group mailing list
> > > > > > > > ipv6@ietf.org
> > > > > > > > Administrative Requests: https://www.ietf.org/mailman/listinfo/ipv6
> > > > > > > > --------------------------------------------------------------------
> > > > >
> > >
>