Re: Fun and surprises with IPv6 fragmentation
Patrick McManus <pmcmanus@mozilla.com> Sat, 03 March 2018 15:23 UTC
In-Reply-To: <CAJ_4DfS=6h9qEQ+uwntLtDZNSODhqc_0pww7c2gK50XKna0BCw@mail.gmail.com>
References: <681fcc96-4cf9-100d-9ad6-b3c7be9189a5@huitema.net> <CAJ_4DfS=6h9qEQ+uwntLtDZNSODhqc_0pww7c2gK50XKna0BCw@mail.gmail.com>
From: Patrick McManus <pmcmanus@mozilla.com>
Date: Sat, 03 Mar 2018 10:23:49 -0500
Message-ID: <CAOdDvNqRD=NqbmDaTDi5t-iPy_sB-bjHcpeVPgEXZnN04DnRSQ@mail.gmail.com>
Subject: Re: Fun and surprises with IPv6 fragmentation
To: Ryan Hamilton <rch=40google.com@dmarc.ietf.org>
Cc: Christian Huitema <huitema@huitema.net>, "quic@ietf.org" <quic@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/t-i0fyq0u6GEtOa-obmqsD8EAE0>

On Sat, Mar 3, 2018 at 12:09 AM, Ryan Hamilton
<rch=40google.com@dmarc.ietf.org> wrote:

> I'm sorry if this is a dumb question, but I understood that in IPv6
> routers could not fragment IPv6 packets, only endpoints.
>

I know! Fun. Honestly the Internet is just so interesting - you wonder how
it works at all. The pcap is in the #picoquic Slack channel. (As always,
anyone reading this can just ask anyone on the Slack, like me, for an
invite.) Search for octopus.pcap.

> Unlike in IPv4, IPv6 routers never fragment IPv6 packets. Packets
> exceeding the size of the maximum transmission unit of the destination link
> are dropped and this condition is signaled by a Packet too Big ICMPv6 type
> 2 message to the originating node, similarly to the IPv4 method when the
> Don't Fragment bit is set.[1]
>
> End nodes in IPv6 are expected to perform path MTU discovery to determine
> the maximum size of packets to send, and the upper-layer protocol is
> expected to limit the payload size. However, if the upper-layer protocol is
> unable to do so, the sending host may use the Fragment extension header in
> order to perform end-to-end fragmentation of IPv6 packets.
>
> https://en.wikipedia.org/wiki/IPv6_packet#Fragmentation
>
> How sure are you that it's a router and not the sending host that's doing
> the fragmentation?
>

It seems unlikely to be in the core. The receiving host does have an MTU of
1500, but it's hard to imagine a receive stack fragmenting (and then
reassembling and reordering!) things. One of the interesting tidbits here is
that it isn't just the small fragment that moves ahead in the queue; it's
both fragments of the big packet.

> Cheers,
>
> Ryan
>
> On Fri, Mar 2, 2018 at 9:02 PM, Christian Huitema <huitema@huitema.net>
> wrote:
>
>> Yesterday, I was mentioning bugs of the interop. This morning, I woke up
>> to find an interesting message from Patrick McManus. Something is weird, he
>> said. The first data message that your server sends, with sequence number
>> N, always arrives before the final handshake message, with sequence number
>> N-1. That inversion appears to happen systematically.
>>
>> It took us the best part of a day to explore blind alleys and finally
>> understand what was happening. The exchange was over IPv6. Upon receiving a
>> connection request from Patrick's implementation, Picoquic was sending back
>> a handshake packet. Immediately after that, Picoquic was sending its first
>> data packet, which happens to be an MTU probe. And it turns out that the
>> probe was 1518 bytes, a bit longer than what the AWS routers could accept.
>> So some router inserted an IPv6 fragmentation header and split the packet
>> in two: a large initial fragment, 1496 bytes long, and a small second
>> fragment 78 bytes long. You could think that this is no big deal, since
>> fragments would just be reassembled at the destination, but you would be
>> wrong.
>>
>> Some routers on the path try to be helpful. They have learned from past
>> experience that short packets often carry important data, and so they try
>> to route them faster than long data packets. And here is what happens in
>> our case:
>>
>> * The server prepares and sends a Handshake packet, 590 bytes long.
>>
>> * The server then prepares the MTU probe, 1518 bytes long.
>>
>> * The MTU probe is split into fragment 1, 1496 bytes, and
>>   fragment 2, 78 bytes.
>>
>> * The handshake and the long fragment are routed on the normal
>>   path, but the small fragment is routed at a higher priority level.
>>
>> * The Linux driver at the destination receives the small
>>   fragment first. It queues everything behind that until it receives the
>>   long fragment.
>>
>> * The Linux driver passes the reassembled packet to the
>>   application, which cannot do anything with it because the encryption keys
>>   can only be obtained from the handshake packet.
>>
>> * The Linux driver then passes the handshake packet to the
>>   application.
>>
>> Which confirms an old opinion. When routers try to be smart and helpful,
>> they end up being dumb and harmful. Please just send the packets in the
>> order you get them!
>>
>> I tried to work around the issue by setting the "don't fragment" bit on
>> the socket, but somehow that doesn't work. So I simply programmed the
>> server to not use payloads larger than 1440 bytes. Still, I can see that
>> pattern happening in other circumstances, such as a long Connection Initial
>> message followed by a short 0-RTT packet. Isn't networking fun?
>>
>> -- Christian Huitema
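
As a side note, the fragment sizes quoted above (1496 and 78 bytes from a 1518-byte probe) match plain IPv6 fragmentation arithmetic, whichever node did the splitting. The sketch below is only illustrative and rests on assumptions the thread does not state: that 1518 counts the whole packet including the 40-byte IPv6 header, that the limiting MTU is 1500, that there are no other extension headers, and that the 1440-byte limit refers to the UDP payload. The function name `fragment_sizes` is made up for this example.

```python
IPV6_HEADER = 40      # fixed IPv6 header size in bytes
FRAGMENT_HEADER = 8   # IPv6 Fragment extension header size in bytes
UDP_HEADER = 8        # UDP header size in bytes


def fragment_sizes(packet_len, mtu):
    """Return the on-the-wire size of each fragment of an IPv6 packet.

    packet_len is the full original packet, including its IPv6 header.
    Assumes no extension headers other than the Fragment header.
    """
    if packet_len <= mtu:
        return [packet_len]  # fits, no fragmentation needed

    payload = packet_len - IPV6_HEADER
    # Every fragment except the last must carry a multiple of 8 bytes.
    chunk = (mtu - IPV6_HEADER - FRAGMENT_HEADER) // 8 * 8
    if chunk <= 0:
        raise ValueError("MTU too small to carry any fragment data")

    sizes = []
    while payload > 0:
        data = min(chunk, payload)
        sizes.append(IPV6_HEADER + FRAGMENT_HEADER + data)
        payload -= data
    return sizes


# The MTU probe from the thread, under the assumptions stated above.
print(fragment_sizes(1518, 1500))   # [1496, 78]

# A 1440-byte UDP payload (assumed meaning of the workaround) stays whole.
print(fragment_sizes(1440 + UDP_HEADER + IPV6_HEADER, 1500))   # [1488]
```

Under those assumptions the first fragment carries 1448 data bytes (the largest multiple of 8 that fits in 1500 - 48) and the second carries the remaining 30, which gives exactly 1496 and 78. The arithmetic cannot tell us where on the path the split happened.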