Re: [Int-area] IP Parcels improves performance for end systems
Tom Herbert <tom@herbertland.com> Thu, 24 March 2022 19:51 UTC
References: <90a1ce8325a448ab81f63c844f98d6a6@boeing.com> <bd1e4a5a-2d09-3875-2135-6f6b6743a9cf@joelhalpern.com>
In-Reply-To: <bd1e4a5a-2d09-3875-2135-6f6b6743a9cf@joelhalpern.com>
From: Tom Herbert <tom@herbertland.com>
Date: Thu, 24 Mar 2022 12:51:27 -0700
Message-ID: <CALx6S37yoA-Bmz0QPZX2_07SeLgZQAnbMUeDbkNNKXCvANeibw@mail.gmail.com>
To: "Joel M. Halpern" <jmh@joelhalpern.com>
Cc: "Templin (US), Fred L" <Fred.L.Templin@boeing.com>, int-area <int-area@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/int-area/_XWFbez6c5wcmj91lOgT5widKSM>
Subject: Re: [Int-area] IP Parcels improves performance for end systems
On Thu, Mar 24, 2022, 3:11 PM Joel M. Halpern <jmh@joelhalpern.com> wrote:

> I do remember token ring. (I was working from 1983 for folks who
> delivered 50 megabits starting in 1976, and built some of the best FDDI
> around at the time.)
>
> I am not claiming that increasing the MTU from 1500 to 9K did nothing.
> I am claiming that diminishing returns has distinctly set in.
> If the Data Center folks (who tend these days to have the highest
> demand) really want a 64K link, they would have one.

Joel,

Indeed. Google, at least, is looking into it, insofar as getting bigger
packets for GRO/GSO. See https://netdevconf.info/0x15/session.html?BIG-TCP

Tom

> They don't. They prefer to use Ethernet.
> The improvement via increasing the MTU further runs into many obstacles,
> including such issues as error detection code coverage, application
> desired communication size, retransmission costs, and on and on.
> Yes, they can all be overcome. But the returns get smaller and smaller.
>
> So absent real evidence that there is a problem needing the network
> stack and protocol to change, I just don't see this (IP Parcels) as
> providing enough benefit to justify the work.
>
> Yours,
> Joel
>
> On 3/24/2022 3:05 PM, Templin (US), Fred L wrote:
> > Hi Joel,
> >
> >> -----Original Message-----
> >> From: Joel M. Halpern [mailto:jmh@joelhalpern.com]
> >> Sent: Thursday, March 24, 2022 11:41 AM
> >> To: Templin (US), Fred L <Fred.L.Templin@boeing.com>
> >> Cc: int-area <int-area@ietf.org>
> >> Subject: Re: [Int-area] IP Parcels improves performance for end systems
> >>
> >> This exchange seems to assume facts not in evidence.
> >
> > It is a fact that back in the 1980's the architects took simple token ring,
> > changed the over-the-wire coding to 4B/5B, replaced the copper with
> > fiber and then boosted the MTU by a factor of 3 and called it FDDI.
> > They were able to claim what at the time was an astounding 100Mbps (i.e., in
> > comparison to the 10Mbps Ethernet of the day), but the performance
> > gain was largely due to the increase in the MTU. They told me: "Fred,
> > go figure out the path MTU problem", and they said: "go talk to Jeff
> > Mogul out in Palo Alto who knows something about it". But, then, the
> > Path MTU discovery group took a left turn at Albuquerque and left the
> > Internet as a tiny MTU wasteland. We have the opportunity to fix all
> > of that now - so, let's get it right for once.
> >
> > Fred
> >
> >>
> >> And the whole premise is spending resources in other parts of the
> >> network for a marginal diminishing return in the hosts.
> >>
> >> It simply does not add up.
> >>
> >> Yours,
> >> Joel
> >>
> >> On 3/24/2022 2:19 PM, Templin (US), Fred L wrote:
> >>>> The category 1) links are not yet in existence, but once parcels start to
> >>>> enter the mainstream innovation will drive the creation of new kinds of
> >>>> data links (1TB Ethernet?) that will be rolled out as new hardware.
> >>>
> >>> I want to put a gold star next to the above. AFAICT, pushing the MTU and
> >>> implementing IP parcels can get us to 1TB Ethernet practically overnight.
> >>> Back in the 1980's, FDDI proved that pushing to larger MTUs could boost
> >>> throughput without changing the speed of light, so why wouldn't the same
> >>> concept work for Ethernet in the modern era?
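
The "diminishing returns" point about MTU size can be put into rough numbers. The sketch below is only an illustration under stated assumptions: the per-frame Ethernet overhead (38 bytes) and the IPv4+TCP header size (40 bytes, no options) are assumed values that do not come from the thread, and the function names are hypothetical. It models wire overhead only; it says nothing about per-packet CPU cost in the end systems, which is the other half of the argument.

```python
# Assumed per-frame costs (illustrative, not taken from the thread):
ETH_OVERHEAD = 38    # preamble+SFD 8, MAC header 14, FCS 4, inter-frame gap 12
IP_TCP_HEADERS = 40  # IPv4 header 20 + TCP header 20, no options


def payload_efficiency(mtu: int) -> float:
    """Fraction of bytes on the wire that carry TCP payload in a full-size frame."""
    return (mtu - IP_TCP_HEADERS) / (mtu + ETH_OVERHEAD)


def packets_needed(total_bytes: int, mtu: int) -> int:
    """Number of full-size packets needed to carry total_bytes of payload."""
    per_packet = mtu - IP_TCP_HEADERS
    return -(-total_bytes // per_packet)  # ceiling division


if __name__ == "__main__":
    one_gib = 1024 ** 3
    for mtu in (1500, 9000, 65535):
        print(f"MTU {mtu:>5}: efficiency {payload_efficiency(mtu):.4f}, "
              f"packets per GiB {packets_needed(one_gib, mtu)}")
```

Under these assumptions the wire efficiency gained by going from 1500 to 9000 is several times larger than the gain from 9000 to 64K, while the packet count keeps falling roughly in proportion to the MTU.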
> >>>
> >>> Fred
> >>>
> >>>> -----Original Message-----
> >>>> From: Int-area [mailto:int-area-bounces@ietf.org] On Behalf Of Templin (US), Fred L
> >>>> Sent: Thursday, March 24, 2022 9:45 AM
> >>>> To: Tom Herbert <tom@herbertland.com>
> >>>> Cc: int-area <int-area@ietf.org>; Eggert, Lars <lars@netapp.com>; lars@eggert.org
> >>>> Subject: Re: [Int-area] IP Parcels improves performance for end systems
> >>>>
> >>>> Hi Tom - responses below:
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: Tom Herbert [mailto:tom@herbertland.com]
> >>>>> Sent: Thursday, March 24, 2022 9:09 AM
> >>>>> To: Templin (US), Fred L <Fred.L.Templin@boeing.com>
> >>>>> Cc: Eggert, Lars <lars@netapp.com>; int-area <int-area@ietf.org>; lars@eggert.org
> >>>>> Subject: Re: [Int-area] IP Parcels improves performance for end systems
> >>>>>
> >>>>> On Thu, Mar 24, 2022 at 7:27 AM Templin (US), Fred L
> >>>>> <Fred.L.Templin@boeing.com> wrote:
> >>>>>>
> >>>>>> Tom - see below:
> >>>>>>
> >>>>>>> -----Original Message-----
> >>>>>>> From: Tom Herbert [mailto:tom@herbertland.com]
> >>>>>>> Sent: Thursday, March 24, 2022 6:22 AM
> >>>>>>> To: Templin (US), Fred L <Fred.L.Templin@boeing.com>
> >>>>>>> Cc: Eggert, Lars <lars@netapp.com>; int-area <int-area@ietf.org>; lars@eggert.org
> >>>>>>> Subject: Re: [Int-area] IP Parcels improves performance for end systems
> >>>>>>>
> >>>>>>> On Wed, Mar 23, 2022 at 10:47 AM Templin (US), Fred L
> >>>>>>> <Fred.L.Templin@boeing.com> wrote:
> >>>>>>>>
> >>>>>>>> Tom, looks like you have switched over to HTML which can be a real conversation-killer.
> >>>>>>>>
> >>>>>>>> But, to some points you raised that require a response:
> >>>>>>>>
> >>>>>>>>> You can't turn off UDP checksums for IPv6 (except for the narrow case of encapsulation).
> >>>>>>>>
> >>>>>>>> That sounds like a good reason to continue to use IPv4 – at least as far as end system
> >>>>>>>> addressing is concerned – right?
> >>>>>>>
> >>>>>>> Not at all. All NICs today provide checksum offload and so it's
> >>>>>>> basically zero cost to perform the UDP checksum. The fact that we
> >>>>>>> don't have to do extra checks on the UDPv6 checksum field to see if
> >>>>>>> it's zero actually is a performance improvement over UDPv4. (btw, I
> >>>>>>> will present implementation of the Internet checksum at TSVGWG Friday,
> >>>>>>> this will include discussion of checksum offloads).
> >>>>>>
> >>>>>> Actually, my assertion wasn't good to begin with because for IPv6 even if UDP
> >>>>>> checksums are turned off the OMNI encapsulation layer includes a checksum
> >>>>>> that ensures the integrity of the IPv6 header. UDP checksums off for IPv6 when
> >>>>>> OMNI encapsulation is used is perfectly fine.
> >>>>>>
> >>>>> I assume you are referring to RFC6935 and RFC6936 that allow the UDPv6
> >>>>> checksum to be zero for tunneling with a very constrained set of conditions.
> >>>>>
> >>>>>>>>> If it's a standard per packet Internet checksum then a lot of HW could do it. If it's something like CRC32 then probably not.
> >>>>>>>>
> >>>>>>>> The integrity check is covered in RFC5327, and I honestly haven’t had a chance to
> >>>>>>>> look at that myself yet.
> >>>>>>>>
> >>>>>>>>> LTP is a nice experiment, but I'm more interested as to the interaction between IP parcels and TCP or QUIC.
> >>>>>>>>
> >>>>>>>> Please be aware that while LTP may seem obscure at the moment that may be changing now
> >>>>>>>> that the core DTN standards have been published. As DTN use becomes more widespread I
> >>>>>>>> think we can see LTP also come into wider adoption.
> >>>>>>>
> >>>>>>> My assumption is that IP parcels is intended to be a general solution
> >>>>>>> for all protocols.
> >>>>>>> Maybe in the next draft you could discuss the
> >>>>>>> details of TCP in IP parcels including how to offload the TCP
> >>>>>>> checksum.
> >>>>>>
> >>>>>> I could certainly add that. For TCP, each of the concatenated segments would
> >>>>>> include its own TCP header with checksum field included. Any hardware that
> >>>>>> knows the structure of an IP Parcel can then simply do the TCP checksum
> >>>>>> offload function for each segment.
> >>>>>
> >>>>> To be honest, the odds of ever getting support in NIC hardware for IP
> >>>>> parcels are extremely slim. Hardware vendors are driven by economics,
> >>>>> so the only way they would do that would be to demonstrate widespread
> >>>>> deployment of the protocol. But even then, with all the legacy
> >>>>> hardware in deployment it will take many years before there's any
> >>>>> appreciable traction. IMO, the better approach is to figure out how to
> >>>>> leverage the existing hardware features for use with IP parcels.
> >>>>
> >>>> There will be two kinds of links that will need to be "Parcel-capable":
> >>>> 1) Edge network (physical) links that natively forward large parcels, and
> >>>> 2) OMNI (virtual) links that forward parcels using encapsulation and
> >>>> fragmentation.
> >>>>
> >>>> The category 1) links are not yet in existence, but once parcels start to
> >>>> enter the mainstream innovation will drive the creation of new kinds of
> >>>> data links (1TB Ethernet?) that will be rolled out as new hardware. And
> >>>> that new hardware can be made to understand the structure of parcels
> >>>> from the beginning. The category 2) links might take a large parcel from
> >>>> the upper layers on the local node (or one that has been forwarded by
> >>>> a parcel-capable link) and break it down into smaller sub-parcels then
> >>>> apply IP fragmentation to each sub-parcel and send the fragments to an
> >>>> OMNI link egress node.
> >>>> You know better than me how checksum offload
> >>>> could be applied in an environment like that.
> >>>>
> >>>>>>>>> There was quite a bit of work and discussion on this in Linux. I believe the deviation from the standard was motivated by some
> >>>>>>>>> deployed devices requiring that the IPID be set on receive, and setting IPID with DF equal to 1 is thought to be innocuous. You may
> >>>>>>>>> want to look at Alex Duyck's papers on UDP GSO, he wrote a lot of code in this area.
> >>>>>>>>
> >>>>>>>> RFC6864 has quite a bit to say about coding IP ID with DF=1 – mostly in the negative.
> >>>>>>>> But, what I have seen in the Linux code seems to indicate that there is not even any
> >>>>>>>> coordination between the GSO source and the GRO destination – instead, GRO simply
> >>>>>>>> starts gluing together packets that appear to have consecutive IP IDs without ever first
> >>>>>>>> checking that they were sent by a peer that was earnestly doing GSO. These aspects
> >>>>>>>> would make it very difficult to work GSO/GRO into an IETF standard, plus it doesn’t
> >>>>>>>> work for IPv6 at all where there is no IP ID included by default. IP Parcels addresses
> >>>>>>>> all of these points, and can be made into a standard.
> >>>>>>>
> >>>>>>> Huh? GRO/GSO works perfectly fine with IPv6.
> >>>>>>
> >>>>>> Where is the spec for that? My understanding is that GSO/GRO leverages the
> >>>>>> IP ID for IPv4. But, for IPv6, there is no IP ID unless you include a Fragment Header.
> >>>>>> Does IPv6 somehow do GSO/GRO differently?
> >>>>>>
> >>>>>
> >>>>> GRO and GSO don't use the IPID to match a flow. The primary match is
> >>>>> the TCP 4-tuple.
> >>>>
> >>>> Correct, the 5-tuple (src-ip, src-port, dst-ip, dst-port, proto) is what is used
> >>>> to match the flow.
> >>>> But, you need more than that in order to correctly paste
> >>>> back together with GRO the segments of an original ULP buffer that was
> >>>> broken down by GSO - you need Identifications and/or other markings in
> >>>> the IP headers to give a reassembly context. Otherwise, GRO might end
> >>>> up gluing together old and new pieces of ULP data and/or impart a lot of
> >>>> reordering. IP Parcels have well behaved Identifications and Parcel IDs so
> >>>> that the original ULP buffer context is honored during reassembly.
> >>>>
> >>>>> There's also another possibility with IPv6-- use jumbograms. For
> >>>>> instance, instead of GRO reassembling segments up to a 64K packet, it
> >>>>> could be modified to reassemble up to a 4G packet using IPv6
> >>>>> jumbograms where one really big packet is given to the stack.
> >>>>>
> >>>>> But we probably don't even need jumbograms for that. In Linux, GRO
> >>>>> might be taught to reassemble up to a 4G super packet and set a flag bit
> >>>>> in the skbuf to ignore the IP payload field and get the length from
> >>>>> the skbuf len field (as though a jumbogram was received). This trick
> >>>>> would work for IPv4 and IPv6 and GSO as well. It should also work with TSO
> >>>>> if the device takes the IP payload length to be that for each segment.
> >>>>
> >>>> Yes, I was planning to give that a try to see what kind of performance
> >>>> can be gotten with GSO/GRO when you exceed 64KB. But, my concern
> >>>> with GSO/GRO is that the reassembly is (relatively) unguided and
> >>>> haphazard and can result in mis-ordered concatenations. And, there is
> >>>> no protocol by which the GRO receiver can infer that the things it is
> >>>> gluing together actually originated from a sender that is earnestly doing
> >>>> GSO. So, I do not see how GSO/GRO as I see it in the implementation
> >>>> could be made into a standard, whereas there is a clear path for
> >>>> standardizing IP parcels.
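
The disagreement above is about what context a receiver needs before it glues segments back together. As a minimal sketch, and explicitly not the Linux GRO implementation, the code below assumes a receiver that matches on the 5-tuple and then merges a segment only when its TCP sequence number continues exactly where the previous one ended. The `Segment` type, the `coalesce` function and the 65535-byte cap are hypothetical names and values chosen for illustration; sequence-number wraparound is ignored.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# (src-ip, src-port, dst-ip, dst-port, proto), as listed in the thread
FlowKey = Tuple[str, int, str, int, str]


@dataclass
class Segment:
    flow: FlowKey
    seq: int         # TCP sequence number of the first payload byte
    payload: bytes


def coalesce(segments: List[Segment], max_len: int = 65535) -> List[Segment]:
    """Merge segments that share a flow and are contiguous in sequence space."""
    merged: List[Segment] = []
    open_by_flow: Dict[FlowKey, Segment] = {}
    for seg in segments:
        cur = open_by_flow.get(seg.flow)
        contiguous = cur is not None and cur.seq + len(cur.payload) == seg.seq
        if contiguous and len(cur.payload) + len(seg.payload) <= max_len:
            cur.payload += seg.payload   # extend the open super-segment
        else:
            # Gap, reordering, new flow, or size cap: start a fresh one.
            fresh = Segment(seg.flow, seg.seq, seg.payload)
            merged.append(fresh)
            open_by_flow[seg.flow] = fresh
    return merged
```

In this sketch a gap or an out-of-order arrival simply ends the current merge, so old and new data are never concatenated. That check relies on the transport carrying a sequence number; a transport without one would need some other marking, which is the concern raised in the message above.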
> >>>>
> >>>> Another thing I forgot to mention is that in my experiments with GSO/GRO
> >>>> I found that it won't let me set a GSO segment size that would cause the
> >>>> resulting IP packets to exceed the path MTU (i.e., it won't allow fragmentation).
> >>>> I fixed that by configuring IPv4-in-IPv6 encapsulation per RFC2473 and then
> >>>> allowed the IPv6 layer to apply fragmentation to the encapsulated packet.
> >>>> That way, I can use IPv4 GSO segment sizes up to ~64KB.
> >>>>
> >>>> Fred
> >>>>
> >>>>>
> >>>>> Tom
> >>>>>
> >>>>>> Thanks - Fred
> >>>>>>
> >>>>>>> Tom
> >>>>>>>
> >>>>>>>>
> >>>>>>>> Fred
> >>>>>>>>
> >>>>>>>> From: Tom Herbert [mailto:tom@herbertland.com]
> >>>>>>>> Sent: Wednesday, March 23, 2022 9:37 AM
> >>>>>>>> To: Templin (US), Fred L <Fred.L.Templin@boeing.com>
> >>>>>>>> Cc: Eggert, Lars <lars@netapp.com>; int-area <int-area@ietf.org>; lars@eggert.org
> >>>>>>>> Subject: Re: [EXTERNAL] Re: [Int-area] IP Parcels improves performance for end systems
> >>>>>>>>
> >>>>>>>> EXT email: be mindful of links/attachments.
> >>>>>>>>
> >>>>>>>> On Wed, Mar 23, 2022, 9:54 AM Templin (US), Fred L <Fred.L.Templin@boeing.com> wrote:
> >>>>>>>>
> >>>>>>>> Hi Tom,
> >>>>>>>>
> >>>>>>>>> -----Original Message-----
> >>>>>>>>> From: Tom Herbert [mailto:tom@herbertland.com]
> >>>>>>>>> Sent: Wednesday, March 23, 2022 6:19 AM
> >>>>>>>>> To: Templin (US), Fred L <Fred.L.Templin@boeing.com>
> >>>>>>>>> Cc: Eggert, Lars <lars@netapp.com>; int-area@ietf.org; lars@eggert.org
> >>>>>>>>> Subject: Re: [Int-area] IP Parcels improves performance for end systems
> >>>>>>>>>
> >>>>>>>>> On Tue, Mar 22, 2022 at 10:38 AM Templin (US), Fred L
> >>>>>>>>> <Fred.L.Templin@boeing.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Tom, see below:
> >>>>>>>>>>
> >>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>> From: Tom Herbert [mailto:tom@herbertland.com]
> >>>>>>>>>>> Sent: Tuesday, March 22, 2022 10:00 AM
> >>>>>>>>>>> To: Templin (US), Fred L <Fred.L.Templin@boeing.com>
> >>>>>>>>>>> Cc: Eggert, Lars <lars@netapp.com>; int-area@ietf.org
> >>>>>>>>>>> Subject: Re: [Int-area] IP Parcels improves performance for end systems
> >>>>>>>>>>>
> >>>>>>>>>>> On Tue, Mar 22, 2022 at 7:42 AM Templin (US), Fred L
> >>>>>>>>>>> <Fred.L.Templin@boeing.com> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Lars, I did a poor job of answering your question. One of the most important aspects of
> >>>>>>>>>>>> IP Parcels in relation to TSO and GSO/GRO is that transports get to use a full 4MB buffer
> >>>>>>>>>>>> instead of the 64KB limit in current practices. This is possible due to the IP Parcel jumbo
> >>>>>>>>>>>> payload option encapsulation which provides a 32-bit length field instead of just a 16-bit.
> >>>>>>>>>>>> By allowing the transport to present the IP layer with a buffer of up to 4MB, it reduces
> >>>>>>>>>>>> the overhead, minimizes system calls and interrupts, etc.
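
The arithmetic behind the 16-bit versus 32-bit length field can be written out as a short sketch. The 64KB and 4MB buffer sizes come from the message above; the 1 GiB transfer size and the `handoffs` helper are assumptions made only for illustration, and "64KB" and "4MB" are treated as 64 KiB and 4 MiB.

```python
MAX_16BIT_LENGTH = 2 ** 16 - 1   # largest value a 16-bit length field can hold
MAX_32BIT_LENGTH = 2 ** 32 - 1   # largest value a 32-bit length field can hold

KIB = 1024
MIB = 1024 ** 2
GIB = 1024 ** 3


def handoffs(total_bytes: int, buffer_bytes: int) -> int:
    """Buffers the transport must hand to the IP layer to move total_bytes."""
    return -(-total_bytes // buffer_bytes)  # ceiling division


if __name__ == "__main__":
    print("16-bit length limit:", MAX_16BIT_LENGTH)
    print("32-bit length limit:", MAX_32BIT_LENGTH)
    print("1 GiB in 64 KiB buffers:", handoffs(GIB, 64 * KIB))
    print("1 GiB in 4 MiB buffers: ", handoffs(GIB, 4 * MIB))
```

With these assumed sizes, the larger buffer cuts the number of transport-to-IP handoffs by a factor of 64. How much of that turns into measured end-system performance is exactly what the rest of the thread debates.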
> >>>>>>>>>>>>
> >>>>>>>>>>>> So, yes, IP Parcels is very much about improving the performance for end systems in
> >>>>>>>>>>>> comparison with current practice (GSO/GRO and TSO).
> >>>>>>>>>>>
> >>>>>>>>>>> Hi Fred,
> >>>>>>>>>>>
> >>>>>>>>>>> The nice thing about TSO/GSO/GRO is that they don't require any
> >>>>>>>>>>> changes to the protocol as they are just implementation techniques. Also,
> >>>>>>>>>>> they're one-sided optimizations, meaning for instance that TSO can be
> >>>>>>>>>>> used at the sender without requiring GRO to be used at the receiver.
> >>>>>>>>>>> My understanding is that IP parcels requires a new protocol that would
> >>>>>>>>>>> need to be implemented on both endpoints and possibly in some routers.
> >>>>>>>>>>
> >>>>>>>>>> It is not entirely true that the protocol needs to be implemented on both
> >>>>>>>>>> endpoints. Sources that send IP Parcels send them into a Parcel-capable path
> >>>>>>>>>> which ends at either the final destination or a router for which the next hop is
> >>>>>>>>>> not Parcel-capable. If the Parcel-capable path extends all the way to the final
> >>>>>>>>>> destination, then the Parcel is delivered to the destination which knows how
> >>>>>>>>>> to deal with it. If the Parcel-capable path ends at a router somewhere in the
> >>>>>>>>>> middle, the router opens the Parcel and sends each enclosed segment as an
> >>>>>>>>>> independent IP packet. The final destination is then free to apply GRO to the
> >>>>>>>>>> incoming IP packets even if it does not understand Parcels.
> >>>>>>>>>>
> >>>>>>>>>> IP Parcels is about efficient shipping and handling just like the major online
> >>>>>>>>>> retailer service model I described during the talk. The goal is to deliver the
> >>>>>>>>>> fewest and largest possible parcels to the final destination rather than
> >>>>>>>>>> delivering lots of small IP packets.
> >>>>>>>>>> It is good for the network and good for
> >>>>>>>>>> the end systems both. If this were not true, then Amazon would send the
> >>>>>>>>>> consumer 50 small boxes with 1 item each instead of 1 larger box with all
> >>>>>>>>>> 50 items inside. And, we all know what they would choose to do.
> >>>>>>>>>>
> >>>>>>>>>>> Do you have data that shows the benefits of IP Parcels in light of
> >>>>>>>>>>> these requirements?
> >>>>>>>>>>
> >>>>>>>>>> I have data that shows that GSO/GRO is good for packaging sizes up to 64KB
> >>>>>>>>>> even if the enclosed segments will require IP fragmentation upon transmission.
> >>>>>>>>>> The data implies that even larger packaging sizes (up to a maximum of 4MB)
> >>>>>>>>>> would be better still.
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Fred,
> >>>>>>>>>
> >>>>>>>>> You seem to be only looking at the problem from a per packet cost
> >>>>>>>>> point of view. There is also per byte cost, particularly in the
> >>>>>>>>> computation of the TCP/UDP checksum. The cost is hidden in modern
> >>>>>>>>> implementations by checksum offload, and for segmentation offload we
> >>>>>>>>> have methods to preserve the utility of checksum offload. IP parcels
> >>>>>>>>> will have to also leverage checksum offload, because if the checksum
> >>>>>>>>> is not offloaded then the cost of computing the payload checksum in
> >>>>>>>>> CPU would dwarf any benefits we'd get by using segments larger than
> >>>>>>>>> 64K.
> >>>>>>>>
> >>>>>>>> There is plenty of opportunity to apply hardware checksum offload since
> >>>>>>>> the structure of a Parcel will be very standard. My experiments have been
> >>>>>>>> with a protocol called LTP which is layered over UDP/IP as some other
> >>>>>>>> upper layer protocols are.
> >>>>>>>> LTP includes a segment-by-segment checksum
> >>>>>>>> that is used at its level in the absence of lower layer integrity checks, so
> >>>>>>>> for larger Parcels LTP would use that and turn off UDP checksums
> >>>>>>>> altogether.
> >>>>>>>>
> >>>>>>>> You can't turn off UDP checksums for IPv6 (except for the narrow case of encapsulation).
> >>>>>>>>
> >>>>>>>> As far as I am aware, there are currently no hardware
> >>>>>>>> checksum offload implementations available for calculating the
> >>>>>>>> LTP checksums.
> >>>>>>>>
> >>>>>>>> If it's a standard per packet Internet checksum then a lot of HW could do it. If it's something like CRC32 then probably not.
> >>>>>>>>
> >>>>>>>> LTP is a nice experiment, but I'm more interested as to the interaction between IP parcels and TCP or QUIC.
> >>>>>>>>
> >>>>>>>> Speaking of standard, AFAICT GSO/GRO are doing something very
> >>>>>>>> non-standard. GSO seems to be coding the IP ID field in the IPv4
> >>>>>>>> headers of packets with DF=1 which goes against RFC 6864. When
> >>>>>>>> DF=1, GSO cannot simply claim the IP ID and code it as if there were
> >>>>>>>> some sort of protocol. Or, if it does, there would be no way to
> >>>>>>>> standardize it.
> >>>>>>>>
> >>>>>>>> There was quite a bit of work and discussion on this in Linux. I believe the deviation from the standard was motivated by some
> >>>>>>>> deployed devices requiring that the IPID be set on receive, and setting IPID with DF equal to 1 is thought to be innocuous. You may
> >>>>>>>> want to look at Alex Duyck's papers on UDP GSO, he wrote a lot of code in this area.
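
For reference, the "standard per packet Internet checksum" mentioned above is the 16-bit one's-complement sum described in RFC 1071. The sketch below is a plain software version, written for clarity rather than speed, with a hypothetical function name. It is not an offload implementation and omits the pseudo-header that TCP and UDP include. The loop touches every byte of the input, which is the per-byte cost the thread says would dominate if large segments were checksummed on the CPU.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement Internet checksum (RFC 1071) over data."""
    if len(data) % 2:
        data += b"\x00"              # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]   # add each 16-bit big-endian word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


if __name__ == "__main__":
    sample = bytes.fromhex("0001f203f4f5f6f7")   # example words from RFC 1071
    print(hex(internet_checksum(sample)))
```

A receiver can verify a block by running the same function over the data followed by the transmitted checksum; an intact block yields zero.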
> >>>>>>>>
> >>>>>>>> Tom
> >>>>>>>>
> >>>>>>>> Fred
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Tom
> >>>>>>>>>
> >>>>>>>>>> Fred
> >>>>>>>>>>
> >>>>>>>>>>> Thanks,
> >>>>>>>>>>> Tom
> >>>>>>>>>>>
> >>>>>>>>>>>> Thanks - Fred
> >>>>>>>>>>>>
> >>>>>>>>>>>> _______________________________________________
> >>>>>>>>>>>> Int-area mailing list
> >>>>>>>>>>>> Int-area@ietf.org
> >>>>>>>>>>>> https://www.ietf.org/mailman/listinfo/int-area