Re: Fun and surprises with IPv6 fragmentation

Ryan Hamilton <rch@google.com> Tue, 13 March 2018 18:16 UTC

I missed this summary when you sent it out earlier. Thanks for getting to
the bottom of this, and for the detailed explanation. Your conclusion to
always set the DF option when sending makes a ton of sense. IP fragments
are ... not awesome!

On Sun, Mar 4, 2018 at 10:45 AM, Christian Huitema <huitema@huitema.net>
wrote:

> Thanks to all of you for the comments. I updated my blog entry on the
> subject
> (https://huitema.wordpress.com/2018/03/03/having-fun-and-surprises-with-ipv6/)
> with the following new conclusion:
>
> *It seems clear now that the fragmentation happened at the source, in the
> Linux kernel. This leaves one remaining issue, the out of order delivery.*
>
> *There are in fact two separate out of order delivery issues. One is
> having the second fragment arrive before the first one, and the second is
> having the MTU probe arrive before the previously sent Handshake packet.
> The inversion between the segments may be due to code in the Linux kernel
> that believes that sending the last segment first speeds up reassembly at
> the receiver. The inversion between the fragmented MTU probe and the
> Handshake packet has two plausible causes:*
>
>    - *Some router on the path may be forwarding the small fragment at a
>    higher priority level.*
>    - *Some routers may be implementing equal cost multipath, and then
>    placing fragmented packets and regular packets into separate hash buckets.*
>
>
>
> *The summary for developers, and for QUIC in particular, is that we should
> really avoid triggering IPv6 fragmentation. It can lead to packet losses
> when NATs and firewalls cannot find the UDP payload type and the port
> numbers in the fragments. And it can also lead to out of order delivery as
> we just saw. And for my own code, the lesson is simple. I really need to
> set up the IPv6 Don’t Fragment option when sending MTU probes, per section
> 11.2 of RFC 3542 <https://tools.ietf.org/html/rfc3542#section-11.2>.*
>
> -- Christian Huitema
>
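
For anyone who wants to try the "set the DF option" advice, here is a minimal sketch of what it could look like with Python's socket module. It is not taken from Christian's code. The helper names open_probe_socket and send_probe are made up for illustration, and the fallback value 62 is the Linux definition of IPV6_DONTFRAG, used only if the Python build does not export the constant. The assumption is that, with the option set, the kernel refuses an oversized datagram with EMSGSIZE rather than fragmenting it at the source.

```python
import errno
import socket

# RFC 3542 section 11.2 names this option IPV6_DONTFRAG. Assumption: on builds
# where Python does not export it, fall back to the Linux value (62).
IPV6_DONTFRAG = getattr(socket, "IPV6_DONTFRAG", 62)


def open_probe_socket():
    """Return a UDP/IPv6 socket that asks the stack not to fragment at the source."""
    sock = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IPV6, IPV6_DONTFRAG, 1)
    return sock


def send_probe(sock, payload, addr):
    """Send one MTU probe; return False if the stack rejects it as too large."""
    try:
        sock.sendto(payload, addr)
        return True
    except OSError as exc:
        if exc.errno == errno.EMSGSIZE:
            # Probe exceeds the known path MTU; it was dropped locally, not fragmented.
            return False
        raise
```

A False return from send_probe would then be treated like a lost probe, i.e. the sender keeps its current packet size instead of moving up.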