Re: [v6ops] IPv6 MTU Flow-label.... (related to draft-v6ops-pmtud-ecmp-problem-01)

Mark ZZZ Smith <markzzzsmith@yahoo.com.au> Tue, 11 November 2014 22:47 UTC

References: <54609418.5030503@massar.ch> <54609A3E.3030106@massar.ch> <54610AA7.6070100@gmail.com> <54610BD4.2070600@massar.ch>
Message-ID: <1415745889.58663.YahooMailNeo@web162201.mail.bf1.yahoo.com>
Date: Tue, 11 Nov 2014 14:44:49 -0800
From: Mark ZZZ Smith <markzzzsmith@yahoo.com.au>
To: Jeroen Massar <jeroen@massar.ch>, Brian E Carpenter <brian.e.carpenter@gmail.com>
In-Reply-To: <54610BD4.2070600@massar.ch>
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Archived-At: http://mailarchive.ietf.org/arch/msg/v6ops/z1ktTUAlyU2uQetrRoOHHmjnlk0
Cc: IPv6 Operations <v6ops@ietf.org>, 6man <ipv6@ietf.org>
Subject: Re: [v6ops] IPv6 MTU Flow-label.... (related to draft-v6ops-pmtud-ecmp-problem-01)

>________________________________
> From: Jeroen Massar <jeroen@massar.ch>
>To: Brian E Carpenter <brian.e.carpenter@gmail.com> 
>Cc: IPv6 Operations <v6ops@ietf.org>; 6man <ipv6@ietf.org> 
>Sent: Tuesday, 11 November 2014, 6:02
>Subject: Re: [v6ops] IPv6 MTU Flow-label.... (related to draft-v6ops-pmtud-ecmp-problem-01)
> 
>
>On 2014-11-10 19:57, Brian E Carpenter wrote:
>> Er, no, you really can't use the flow label like that. RFC6437.
>
>Hence, redefining the bits :) Few implementations use them anyway, thus
>we still have time to actually make good use of them.
>

So the "flow label" field name would then no longer be accurate, because an MTU value doesn't identify, or contribute to identifying or labelling, a flow very well. There's a lot of troubleshooting and other value in "truth in advertising", or, less clichéd, "accuracy in naming". Encoding non-flow-labelling MTU information in the existing flow label field would therefore be a hack for this one special case of stateless ECMP-based load balancing (MTU might be an attribute of a flow, but it isn't very specific or unique to a flow). It wouldn't be useful in most or perhaps all other IP use cases. (So if this idea gained more traction, I'd start lobbying for the field to be renamed.)

It seems to me that the fundamental issue is that, given how the IP protocols work (this problem also occurs in IPv4), stateless ECMP isn't a very effective method of performing load balancing. While it was probably chosen for pure performance/throughput, it isn't distributing load to the LB hosts based on their current load (so it isn't actually 'balancing'), and because it isn't maintaining per-host state, it isn't able to perform per-host specific processing/forwarding, which is what is needed for PMTUD to function in this scenario.

These methods of load balancing have annoyed me for a while, primarily because I have had to troubleshoot or deal with these sorts of issues in the past. I think the fundamental issue is that, to the network, the unicast LB address looks like a single unicast destination/host, whereas in reality it is being shared by a number of hosts. While that might sound like the anycast problem, I think the difference is that even though with anycast addresses are also shared between a set of hosts, for any particular source host there is no variation in which of the anycast hosts is used over time, unless it fails (i.e., the routing system would normally consistently pick one of the anycast destinations for a particular source, as would IPv6's 'on-link' anycast method (see RFC7094 for more discussion)). In this LB case the destination LB host changes based on changes in TCP or UDP ports, which is occurring all of the time.
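
To make the port dependence concrete, here is a minimal sketch of a stateless 5-tuple ECMP pick. It is only an illustration: the SHA-256 hash, the function name ecmp_next_hop, the addresses and the port range are all assumptions of mine, not what any particular router or load balancer implements.

```python
import hashlib
import ipaddress


def ecmp_next_hop(src, dst, proto, sport, dport, n_hosts):
    """Pick an LB host index from a hash of the 5-tuple, keeping no state.

    Hypothetical helper: real ECMP implementations use their own hash
    functions and field selections.
    """
    key = b"".join([
        ipaddress.IPv6Address(src).packed,
        ipaddress.IPv6Address(dst).packed,
        proto.to_bytes(1, "big"),
        sport.to_bytes(2, "big"),
        dport.to_bytes(2, "big"),
    ])
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_hosts


# Documentation-prefix addresses, chosen only for illustration.
client, shared_lb_addr = "2001:db8::1", "2001:db8:ffff::80"

# Same client, same destination, 64 different ephemeral source ports.
picks = {
    ecmp_next_hop(client, shared_lb_addr, 6, sport, 443, 4)
    for sport in range(49152, 49152 + 64)
}
print(sorted(picks))  # the set of LB host indexes this one client reached
```

The point of the sketch is only that the chosen LB host is a function of the ports, so one source host is spread over several LB hosts as its connections come and go.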

If high-performance stateless, and therefore load-insensitive, LB distribution is needed or wanted, then perhaps something simpler that doesn't use TCP or UDP port information might be better. For example, use SA+DA based forwarding in the LB, and statically map portions of the source address space to a particular LB host (e.g., 1/4 to LB host a, 1/4 to LB host b, etc. for four LB hosts), with all LB hosts configured with the LB 'shared' unicast address (i.e., no NAT). If you want anything smarter than that, then I think the LB needs to become stateful and start maintaining per-connection state.
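
A rough sketch of the static source-address mapping, under stated assumptions: the four /5 prefixes below simply cut 2000::/3 into quarters, and the host names and the function lb_host_for_source are hypothetical. A real deployment would pick the split to suit where its sources actually sit, since allocated space is not evenly spread.

```python
import ipaddress

# Hypothetical static table: each quarter of 2000::/3 goes to one LB host.
SOURCE_MAP = [
    (ipaddress.ip_network("2000::/5"), "lb-a"),
    (ipaddress.ip_network("2800::/5"), "lb-b"),
    (ipaddress.ip_network("3000::/5"), "lb-c"),
    (ipaddress.ip_network("3800::/5"), "lb-d"),
]


def lb_host_for_source(src):
    """Return the LB host for a source address; ports are never consulted."""
    addr = ipaddress.IPv6Address(src)
    for prefix, host in SOURCE_MAP:
        if addr in prefix:
            return host
    raise ValueError(f"no LB host mapped for source {src}")


# Every connection from one source lands on the same LB host,
# whatever its TCP or UDP ports are.
print(lb_host_for_source("2001:db8::1"))
print(lb_host_for_source("2a00::1"))
```

Because the lookup uses only the source address, the choice of LB host stays fixed for a given source until the table itself is changed.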


>Note that I am specifically proposing setting the first four bits to 1
>aka 0xf so that the space can still be used for its original intent too.
>
>> What you can do is use the flow label as intended, for ECMP or any
>> other kind of load balancing. RFC6438, RFC7098. They don't fix the
>> ICMPv6 problem though.
>
>It won't fix the 1-RTT issue that worries content providers either.
>
>That is the biggest problem for them: an ICMPv6 PTB is an extra round
>trip and they want speed. Hence including the MTU already in the packet.
>(though as noted in my reply, it won't work fully for async networks)
>
>> The key sentence in Joel's draft is
>> 
>>>    Because the PTB message is not identifiable as part of the
>>>    original flow by the packet header the results of the ICMPv6 ECMP
>>>    hash are unlikely to be hashed to the same nexthop as packets
>>>    matching TCP or UDP ECMP hash.
>> 
>> To fix that, I suspect that flow label reflection is needed.
>
>Won't work as that packet will not be matching the load balancer's setup
>either.
>
>What would partially work is if the ICMPv6 PTB would copy the flow-label
>from the original flow that caused the PTB.
>
>But then still, large content providers will not be happy, as it does
>not satisfy the 1-RTT argument, which now causes them to just ignore
>PTBs and set the MSS to something tiny (thus in the end causing more
>packets, but at least all packets will go through).
>
>Greets,
>Jeroen
>