Fun and surprises with IPv6 fragmentation
Christian Huitema <huitema@huitema.net> Sat, 03 March 2018 05:02 UTC
To: "quic@ietf.org" <quic@ietf.org>
From: Christian Huitema <huitema@huitema.net>
Date: Fri, 02 Mar 2018 21:02:03 -0800
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/MzUjsb5M5H5R3By5i6TeMkDCtCo>
Yesterday, I mentioned bugs found during the interop. This morning, I woke up to find an interesting message from Patrick McManus. Something is weird, he said. The first data message that your server sends, with sequence number N, always arrives before the final handshake message, with sequence number N-1. That inversion appears to happen systematically.

It took us the best part of a day to explore blind alleys and finally understand what was happening. The exchange was over IPv6. Upon receiving a connection request from Patrick’s implementation, Picoquic was sending back a handshake packet. Immediately after that, Picoquic was sending its first data packet, which happens to be an MTU probe. And it turns out that the probe was 1518 bytes, a bit longer than what the AWS routers could accept. So some router inserted an IPv6 fragmentation header and split the packet in two: a large initial fragment, 1496 bytes long, and a small second fragment, 78 bytes long.

You could think that this is no big deal, since fragments would just be reassembled at the destination, but you would be wrong. Some routers on the path try to be helpful. They have learned from past experience that short packets often carry important data, and so they try to route them faster than long data packets. And here is what happens in our case:

* The server prepares and sends a Handshake packet, 590 bytes long.
* The server then prepares the MTU probe, 1518 bytes long.
* The MTU probe is split into fragment 1, 1496 bytes, and fragment 2, 78 bytes.
* The handshake and the long fragment are routed on the normal path, but the small fragment is routed at a higher priority level.
* The Linux driver at the destination receives the small fragment first. It queues everything behind that until it receives the long fragment.
* The Linux driver passes the reassembled packet to the application, which cannot do anything with it because the encryption keys can only be obtained from the handshake packet.
* The Linux driver then passes the handshake packet to the application.

Which confirms an old opinion. When routers try to be smart and helpful, they end up being dumb and harmful. Please just send the packets in the order you get them!

I tried to work around the issue by setting the "don't fragment" bit on the socket, but somehow that doesn't work. So I simply programmed the server to not use payloads larger than 1440 bytes. Still, I can see that pattern happening in other circumstances, such as a long Connection Initial message followed by a short 0-RTT packet.

Isn't networking fun?

-- Christian Huitema
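
The fragment sizes reported above can be checked with a little arithmetic. The sketch below is only an illustration, not anything taken from Picoquic: it assumes the 1518 bytes count the whole IPv6 packet, that the limiting link MTU is 1500 bytes (the message only says the probe was "a bit longer" than what the routers accept), and that no other extension headers are present. The function name fragment_sizes is hypothetical.

```python
IPV6_HEADER = 40      # fixed IPv6 header, in bytes
FRAGMENT_HEADER = 8   # IPv6 Fragment extension header, in bytes


def fragment_sizes(packet_len, path_mtu):
    """Return the on-the-wire size of each fragment of one IPv6 packet.

    Assumes the packet has no extension headers other than the
    Fragment header that gets added to each fragment.
    """
    if packet_len <= path_mtu:
        return [packet_len]  # fits as-is, no fragmentation needed

    payload = packet_len - IPV6_HEADER
    # Every fragment except the last must carry a multiple of 8 payload bytes.
    per_fragment = (path_mtu - IPV6_HEADER - FRAGMENT_HEADER) // 8 * 8

    sizes = []
    while payload > 0:
        chunk = min(per_fragment, payload)
        sizes.append(IPV6_HEADER + FRAGMENT_HEADER + chunk)
        payload -= chunk
    return sizes


# Assumed MTU of 1500 bytes; the 1518-byte probe comes from the message.
print(fragment_sizes(1518, 1500))  # prints [1496, 78]
```

Under those assumptions the result matches the 1496-byte and 78-byte fragments observed in the trace, while the 590-byte handshake packet goes through unfragmented.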