Never fragment: getting PMTU info transmitted reliably

Michael Richardson <mcr+ietf@sandelman.ca> Wed, 16 January 2019 23:36 UTC

From: Michael Richardson <mcr+ietf@sandelman.ca>
To: IPv6 List <ipv6@ietf.org>
Subject: Never fragment: getting PMTU info transmitted reliably
Date: Wed, 16 Jan 2019 18:36:00 -0500
Message-ID: <14135.1547681760@localhost>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ipv6/C8C5hdW2pTkQD7qH_Y4CGFGrkTw>

Warren writes about putting MTU info into the Flow Label:
    >> to signal up to a 9K MTU, we would need 13bits (LN(9K-1280)/LN(2) =
    >> 12.9).
    >> 20 - 13 gives 7 bits (128) for the hash entropy. "7 bits of entropy,
    >> evenly distributed, should be enough for anyone", he said, hoping
    >> no-one points at
    >> the obvious correlation to 640K of RAM...
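
The bit budget quoted above can be checked with a short calculation. This is
only a sketch: it assumes "9K" means 9000 bytes and that every MTU from the
IPv6 minimum (1280) up to that value gets its own code point. The function
name flow_label_budget is made up for illustration.

```python
import math

FLOW_LABEL_BITS = 20   # width of the IPv6 Flow Label field
IPV6_MIN_MTU = 1280    # minimum IPv6 link MTU


def flow_label_budget(max_mtu=9000):
    """Return (bits needed for the MTU, bits left over for entropy).

    Assumption: one code point per MTU value in IPV6_MIN_MTU..max_mtu.
    """
    distinct_values = max_mtu - IPV6_MIN_MTU + 1
    mtu_bits = math.ceil(math.log2(distinct_values))
    return mtu_bits, FLOW_LABEL_BITS - mtu_bits


mtu_bits, entropy_bits = flow_label_budget()
print(mtu_bits, entropy_bits, 2 ** entropy_bits)  # 13 7 128
```

A coarser encoding (for example, MTU in units of 8 bytes) would free up a few
more bits for entropy, at the cost of rounding the signalled MTU down.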

Brian E Carpenter <brian.e.carpenter@gmail.com> wrote:
    > I guess it all depends what you expect from the entropy. 20 bits gives
    > you a 1-in-a-million chance of a clash. 7 bits gives you a 1-in-128 chance
    > of a clash. This probably doesn't matter for a simple ECMP or LAG kind
    > of load sharing, but who's to say it doesn't matter for some more
    > fancy kind of sharing across a large array of servers?

You'd have to have more than 128 servers and/or paths.
Maybe one of our SPRING people in this WG can tell us if that's a real
problem today.  Obviously, we can't guess if it will be bad in the future,
but if it's a problem today, then we can know that immediately.
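
To put rough numbers on that, here is a minimal sketch. It assumes the load
balancer picks a target using nothing but the entropy bits of the Flow Label,
which is a simplification (real ECMP/LAG hashes usually mix in the addresses
as well). Both function names are hypothetical.

```python
def pairwise_clash_probability(entropy_bits):
    """Chance that two independent, uniformly chosen labels are equal."""
    return 1 / (2 ** entropy_bits)


def reachable_targets(entropy_bits, targets):
    """Upper bound on how many targets can ever be selected when the
    entropy bits are the only input to the selection."""
    return min(targets, 2 ** entropy_bits)


print(pairwise_clash_probability(20))  # about 1 in a million
print(pairwise_clash_probability(7))   # 1 in 128
print(reachable_targets(7, 200))       # only 128 of 200 targets usable
```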

Now when I pushed for draft-ietf-6man-rfc6434-bis-09 and RFC8200 to say
that PLPMTUD be made a MUST, I got pushback that amounted to:
  1) we don't have enough evidence yet.
  2) it doesn't work for UDP and other traffic.

(1) turned out to be a real issue. I thought that some of the big players
    could easily get, or already would have, that kind of evidence.
    (Linux does not ship with PLPMTUD on by default.  If you want it, btw,
    sysctl -w net.ipv4.tcp_mtu_probing=2.  Yes, ipv4: it affects both.)
    It turns out, I was told, that they always set their TCP segment size
    such that they likely will never fragment for v4 or v6, because, with
    hardware Transmit Offload, the cost of missing a tx-op exceeds the
    benefit of making the packet slightly bigger.  I suspect that this is
    true in general of UDP and QUIC traffic too.
    I imagine next-generation 10G NICs might offer QUIC offload, including
    doing the crypto.  That's what I'd be coding if I worked in that space.

(2) I will admit that I personally don't care that much about UDP traffic,
    except that it lets me run IPsec through NAT44s.   I know that I use
    QUIC and WebRTC regularly, and that corporate enterprise users get
    screwed by lack of UDP regularly.
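
For anyone who wants to see what a given Linux box is doing with the
tcp_mtu_probing sysctl mentioned under (1), here is a small sketch that reads
it from /proc. The helper names are made up; the meanings of the values 0, 1
and 2 follow the kernel's ip-sysctl documentation. On a system without that
file (non-Linux, or a restricted container) it just returns None.

```python
from pathlib import Path

SYSCTL_PATH = Path("/proc/sys/net/ipv4/tcp_mtu_probing")

MODES = {
    0: "disabled",
    1: "disabled by default, enabled when an ICMP black hole is detected",
    2: "always enabled",
}


def describe_mtu_probing(raw):
    """Map the raw sysctl text (e.g. "2\n") to a description."""
    return MODES.get(int(raw.strip()), "unknown value")


def current_mtu_probing(path=SYSCTL_PATH):
    """Return a description of the current setting, or None if unreadable."""
    try:
        return describe_mtu_probing(path.read_text())
    except (OSError, ValueError):
        return None


print(current_mtu_probing())
```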

So I think the question is: is there really a problem that needs to be solved?
Maybe I will have to go back and read the beginning of this thread again to
recall what the issue was.   Does this belong in SPUD?

I wish that SCTP had flown higher...

--
Michael Richardson <mcr+IETF@sandelman.ca>, Sandelman Software Works
 -= IPv6 IoT consulting =-