Re: Never fragment: getting PMTU info transmitted reliably

Brian E Carpenter <brian.e.carpenter@gmail.com> Thu, 17 January 2019 02:56 UTC

To: Michael Richardson <mcr+ietf@sandelman.ca>
Cc: "Joel M. Halpern" <jmh@joelhalpern.com>, IPv6 List <ipv6@ietf.org>
References: <CAOSSMjV0Vazum5OKztWhAhJrjLjXc5w5YGxdzHgbzi7YVSk7rg@mail.gmail.com> <6aae7888-46a4-342d-1d76-10f8b50cebc4@gmail.com> <EC9CC5FE-5215-4105-8A34-B3F123D574B9@employees.org> <4c56f504-7cd7-6323-b14a-d34050d13f4e@foobar.org> <9E6D4A6E-8ABA-4BAB-BEC5-969078323C96@employees.org> <CAAedzxpdF+yhBXfnwUcaQb-HkgdaqXRU3L+S7v8sS1F0OkwM9A@mail.gmail.com> <78a8a0e0-8808-364c-41f7-f81f90362432@gont.com.ar> <CAAedzxpjxhP0nOZVU0CTwA1u3fsPFthrJASjDEfnLcRNvr2gBQ@mail.gmail.com> <c9be798e-5a32-7c3e-a948-9ca2fab30411@si6networks.com> <CAHw9_i+M2-420pykp99LcgMNSG=eeDqsZK8+hN20t_uUdANHfA@mail.gmail.com> <d6e52c30-bbd1-1ee7-144c-fa13a9df5f38@gmail.com> <0f4a6c88-1def-6766-235b-1bcd2cc5e33b@si6networks.com> <CAHw9_i+FB-tb8c+G22FCUxNg9BDpMfwqur8gSn5QaXteBcABZA@mail.gmail.com> <3eead7ba-dcb4-ed52-05bb-a41a5602f251@gmail.com> <14135.1547681760@localhost> <a044c327-d9ce-573e-a158-6c4b157f2d6c@joelhalpern.com> <d3ee03ad-bd24-f353-ddc9-c3cf8a4eb89b@gmail.com> <24583.1547692781@localhost>
From: Brian E Carpenter <brian.e.carpenter@gmail.com>
Message-ID: <116fbbeb-c191-cd57-5998-1d80db1c9917@gmail.com>
Date: Thu, 17 Jan 2019 15:56:28 +1300
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:60.0) Gecko/20100101 Thunderbird/60.4.0
MIME-Version: 1.0
In-Reply-To: <24583.1547692781@localhost>
Content-Type: text/plain; charset="utf-8"
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/ipv6/Zj1R65LJLN9EqCHMLPeodEtujUM>
List-Id: "IPv6 Maintenance Working Group \(6man\)" <ipv6.ietf.org>

On 2019-01-17 15:39, Michael Richardson wrote:
> 
> Brian E Carpenter <brian.e.carpenter@gmail.com> wrote:
>     > On 2019-01-17 13:12, Joel M. Halpern wrote:
>     >> Just to clarify one aspect of the way entropy is used in path selection, I want
>     >> to point out a complication.
>     >>
>     >> It is not anywhere near enough to have as much entropy data as the
>     >> number of choices.  The problem is that you need enough randomness so
>     >> that you can expect a good distribution of flows, and so that even the
>     >> smaller number of larger flows will likely get distributed across the
>     >> choices.  Reducing the amount of available entropy can be quite
>     >> problematic.
> 
>     > Right. And for the server farm case, I don't think it's science fiction
>     > these days to think about hundreds or thousands of servers. Also, if the
>     > load sharing algorithm attempts to ensure that a given server has only
>     > one big job at a time, then a high collision rate in the hash can
>     > defeat it. A form of the birthday paradox applies: not "what is the
>     > chance of a clash per flow" but "what is the chance that out of a
>     > thousand servers, one of them gets two big jobs at the same time"?
> 
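
As a rough illustration of the birthday-paradox question quoted above,
here is a minimal sketch. It assumes each big job is hashed uniformly
and independently onto one of N servers, which is an idealisation; the
function name and the job counts are invented for the example and do
not come from any measurement.

```python
def collision_probability(servers: int, big_jobs: int) -> float:
    """Chance that at least one server receives two or more big jobs.

    Assumes every big job lands on a server chosen uniformly at random,
    independently of the others (an idealised hash).
    """
    if big_jobs > servers:
        # Pigeonhole: more jobs than servers guarantees a clash.
        return 1.0
    p_no_clash = 1.0
    for already_placed in range(big_jobs):
        # The next job must avoid every server that already holds one.
        p_no_clash *= (servers - already_placed) / servers
    return 1.0 - p_no_clash


if __name__ == "__main__":
    # Hypothetical numbers of simultaneous big jobs on a 1000-server farm.
    for jobs in (10, 40, 100):
        print(jobs, round(collision_probability(1000, jobs), 3))
```

The point of the sketch is that the clash probability grows with the
square of the number of big jobs, so it becomes likely long before the
number of jobs approaches the number of servers.
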
> Based upon my reading of the Netflix blogs, they have experimented
> extensively with load sharing, and they really don't care about
> flow labels in their decision process. (Of course, because IPv4 has
> no such thing.)

Indeed, but that's exactly why we brought in a load sharing expert
to help us with RFC 7098. And there are residual problems even in the
ideal world where the flow label is perfect. We played with some ideas
in https://tools.ietf.org/html/draft-tarreau-extend-flow-label-balancing
but it didn't really go anywhere. In a nutshell, what's really needed
is a bidirectional session ID, not a unidirectional flow ID. And
that's not a layer 3 concept.
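
To make the distinction concrete, here is a minimal sketch of a
unidirectional flow key versus a bidirectional session key. It is not
the mechanism in the draft; the Packet type, the key formats and the
bucket function are all assumptions made up for illustration. Note
that the session key needs the ports, which is one way of seeing why
it is not a layer 3 concept.

```python
import hashlib
from typing import NamedTuple


class Packet(NamedTuple):
    # Hypothetical, simplified view of the fields a balancer might see.
    src: str
    dst: str
    sport: int
    dport: int
    flow_label: int


def flow_key(p: Packet) -> str:
    # Unidirectional: source, destination and the sender's flow label.
    # Each end picks its own label, so the two directions differ.
    return f"{p.src}|{p.dst}|{p.flow_label:05x}"


def session_key(p: Packet) -> str:
    # Bidirectional: order the two endpoints canonically so that both
    # directions of the same session produce the same key.
    lo, hi = sorted([(p.src, p.sport), (p.dst, p.dport)])
    return f"{lo[0]}|{lo[1]}|{hi[0]}|{hi[1]}"


def bucket(key: str, n_servers: int) -> int:
    # Map a key onto one of n_servers using a stable hash.
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % n_servers


if __name__ == "__main__":
    fwd = Packet("2001:db8::1", "2001:db8::2", 40000, 443, 0x12345)
    rev = Packet("2001:db8::2", "2001:db8::1", 443, 40000, 0xABCDE)
    print(flow_key(fwd) == flow_key(rev))        # directions differ
    print(session_key(fwd) == session_key(rev))  # directions agree
```
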

   Brian

> 
> It's about how fresh the (disk read) caches on the servers are, what content
> is being streamed, and other things that have nothing to do with the
> packets themselves.
> 
> Architecturally with IPv6, if you have an entire /64 (or more) to play with
> and you can statelessly forward packets at wire speed, then there are
> other interesting off-path choices one can make.  (For instance, assign a
> new server /128 for each client connection, and then when the connection
> arrives, dynamically map it to a particular server.  This pushes the state
> storage from layer 4 to the neighbour cache, which might not be a win.)
> 
> So I seriously question whether any of this matters to server farms.
> 
>     > I am strongly against breaking the flow label just at the time when
>     > the major o/s are starting to set it correctly.
> 
> :-)
> 
>     > I'm all for fixing the fragmentation problem;
>     > draft-ietf-intarea-frag-fragile exists for a reason. But not by
>     > breaking something else.
> 
> My quick read says that it looks great to me.
> 
> Again, I don't really think that using the flow label to seed PLPMTUD
> is much of a win, but if it did provide something useful, I think it could be
> done without too much harm.
> 
> To reiterate: I don't think the benefit is high enough to warrant the
> risk, despite the fact that I don't think the risk is as high as you
> are suggesting.
> 
> --
> ]               Never tell me the odds!                 | ipv6 mesh networks [
> ]   Michael Richardson, Sandelman Software Works        |    IoT architect   [
> ]     mcr@sandelman.ca  http://www.sandelman.ca/        |   ruby on rails    [
> 