Re: [tcpm] PLPMTUD for all protocols

William Herrin <bill@herrin.us> Thu, 29 March 2018 16:58 UTC

In-Reply-To: <alpine.DEB.2.20.1803291727100.20609@uplift.swm.pp.se>
References: <alpine.DEB.2.20.1803281034310.20609@uplift.swm.pp.se> <CAP-guGVEcnk09yi8sz+fmghpeb91Y8tQb+LsEmSF+0e+f6oGhw@mail.gmail.com> <alpine.DEB.2.20.1803281747020.20609@uplift.swm.pp.se> <B323BA47-32DF-498B-9D1F-5A378E7ABB98@strayalpha.com> <CAP-guGXfds60dwJXgnSYdbtQ-qOCJGpiw1kcYUkJR8L3jwsFSA@mail.gmail.com> <alpine.DEB.2.20.1803291727100.20609@uplift.swm.pp.se>
From: William Herrin <bill@herrin.us>
Date: Thu, 29 Mar 2018 12:58:07 -0400
Message-ID: <CAP-guGWhyS0UW9YwsQyA10iN_M=yc4H8NqKAgWH=zPBFhscJxQ@mail.gmail.com>
To: Mikael Abrahamsson <swmike@swm.pp.se>
Cc: Joe Touch <touch@strayalpha.com>, tcpm@ietf.org
Content-Type: text/plain; charset="UTF-8"
X-Spam-Checker: magic.dirtside.com
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/DCzNv4zUlsQlgUrkCElgvbeUFUQ>
Subject: Re: [tcpm] PLPMTUD for all protocols

On Thu, Mar 29, 2018 at 11:31 AM, Mikael Abrahamsson <swmike@swm.pp.se> wrote:
> On Thu, 29 Mar 2018, William Herrin wrote:
>> It's also fair to wonder why router and firewall vendors don't offer
>> configurations designed to mitigate the PMTUD black hole problem. Firewall
>> vendors could warn on attempts to block ICMP or require explicit overrides
>> to block ICMP destination unreachable messages. Router vendors could offer a
>> configuration option to issue ICMP destination unreachables from a
>> configured public IP address rather than from the interface's (sometimes
>> private) IP address. They don't do these things.
>
> These are some reasons for PMTUD blackhole, but far from all. Some are
> caused by anycast configurations where the ICMP packet simply doesn't make
> it to the node that needs it. Load balancers doing the wrong thing.

Hi Mikael,

Load balancers doing the wrong thing seems to me like some combination
of implementation error and configuration error.

How would PLPMTUD detect a path MTU black hole for DNS response
packets from an anycasted node? It's fire and forget; the server is
not expecting another packet from the querent.

Where something more complicated is done with anycast such as TCP
without suitable modifications (e.g.
https://bill.herrin.us/network/anycasttcp.html) then they've already
accepted that a portion of their prospective users will be
unreachable. I have no sympathy when folks decide to intentionally
implement breakage because it will, statistically, only be broken some
of the time.

You make an interesting point here about ICMP unreachables in the TCP
anycast scenario that I haven't accounted for in that paper. I'll have
to think that through.


> My take on this is that operators should do the right thing (make PMTUD
> work), and protocols should handle when things go wrong.

Operators "should" have the tools which:

A) make it practical to make PMTUD work in all commonly deployed
architectures and

B) do a respectable job preventing "design induced operator error"
with respect to PMTUD.

Currently shipping routers and firewalls mostly fail both criteria.


> I have very little problem with TCP using an extremely stupid algorithm as
> in "I'll use PMTUD, but if it doesn't work I'm going to try 576/1280, and if
> that works I sometimes go back to PMTUD value to see if things have
> improved". This still means it'll work, somewhat.

I have no problem with falling back after a smart enough delay/retry
cycle concludes, with very high probability, that there is a PMTUD
black hole. I have
little problem with those connections suffering the performance hit of
small packets for the duration. I have a bigger problem if routine
packet loss or a brief path outage gets commonly misdetected as a
PMTUD black hole. I worry most that allowing the algorithm to hunt
for the MTU will justify allowing a higher false positive rate for
black hole detection so that everybody suffers a performance hit even
when there are no black holes. There are no obvious ways to implement
that hunt which don't harm performance. I'd rather see it fall back
and stay back with an OS tunable for how confident it has to be of a
black hole before it falls back.
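
To make "fall back and stay back" concrete, here is a rough Python
sketch of the kind of detector I mean. It is only an illustration, not
any stack's actual implementation. The class name, the callbacks and
the required_large_losses knob (standing in for the OS tunable) are all
made up for the example. The 576/1280 values are the ones from the
quoted proposal. The assumption is that a black hole looks like
repeated loss of large packets while small packets on the same
connection still get through.

```python
from dataclasses import dataclass

IPV4_FALLBACK_MTU = 576    # fallback sizes named in the quoted proposal
IPV6_FALLBACK_MTU = 1280


@dataclass
class BlackHoleDetector:
    """Hypothetical per-connection PMTUD black hole detector (sketch)."""
    path_mtu: int                        # MTU currently believed for the path
    fallback_mtu: int = IPV6_FALLBACK_MTU
    required_large_losses: int = 4       # hypothetical tunable: higher = more certain
    large_losses: int = 0                # consecutive timeouts of large packets
    small_acked: bool = False            # small packet delivered since losses began
    fell_back: bool = False

    def on_ack(self, size: int) -> None:
        """Record that a packet of `size` bytes was acknowledged."""
        if self.fell_back:
            return                       # stay back: never hunt upward again
        if size > self.fallback_mtu:
            # A large packet got through, so this is not a black hole.
            self.large_losses = 0
            self.small_acked = False
        elif self.large_losses > 0:
            # Path is alive for small packets while large ones are dying.
            self.small_acked = True
        self._evaluate()

    def on_timeout(self, size: int) -> None:
        """Record that a packet of `size` bytes timed out."""
        if self.fell_back:
            return
        if size > self.fallback_mtu:
            self.large_losses += 1
        else:
            # Small packets are being lost too: this looks like ordinary
            # loss or a path outage, so discard the evidence.
            self.large_losses = 0
            self.small_acked = False
        self._evaluate()

    def _evaluate(self) -> None:
        # Fall back only when both kinds of evidence are present.
        if self.small_acked and self.large_losses >= self.required_large_losses:
            self.path_mtu = self.fallback_mtu
            self.fell_back = True


det = BlackHoleDetector(path_mtu=1500)
for _ in range(4):
    det.on_timeout(1500)     # large packets keep dying
det.on_ack(100)              # but a small one is delivered
print(det.path_mtu, det.fell_back)
```

Raising required_large_losses trades slower detection for fewer false
positives, which is the trade I'd want the operator to control. A brief
outage that also drops small packets resets the count instead of
triggering the fallback.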

Regards,
Bill Herrin



-- 
William Herrin ................ herrin@dirtside.com  bill@herrin.us
Dirtside Systems ......... Web: <http://www.dirtside.com/>