Re: [tcpm] [v6ops] Flow Label Load Balancing

Michael Tuexen <Michael.Tuexen@lurchi.franken.de> Sat, 28 November 2020 18:09 UTC

From: Michael Tuexen <Michael.Tuexen@lurchi.franken.de>
In-Reply-To: <CALx6S34uCrA1QdvLV8fpRKaJGLWMgtCmBCnrsBjU3TS+kXUs3Q@mail.gmail.com>
Date: Sat, 28 Nov 2020 19:09:08 +0100
Cc: Fernando Gont <fernando@gont.com.ar>, IPv6 Operations <v6ops@ietf.org>, tcpm <tcpm@ietf.org>
Content-Transfer-Encoding: quoted-printable
Message-Id: <80E049A2-534C-4D89-B67D-4384A6CFA8B6@lurchi.franken.de>
References: <CAEGSd=DY8t8Skor+b6LSopzecoUUzUZhti9s0kdooLZGxPEt+w@mail.gmail.com> <d29042a7-742b-a445-cf60-2773e5515ae5@gont.com.ar> <CALx6S37+1duoNGR3dZWesHsZvx15kX9wCWufPMh=esvMaSMF_g@mail.gmail.com> <63e7aad3-7094-7492-dbe4-3eefb5236de3@gont.com.ar> <CALx6S37t4jump6S-R5_xdo5DF+RnHtT4rU5-RuiC-2GQ0PXxkQ@mail.gmail.com> <96b6d04b-e5bb-ba79-0281-e9599109be95@gont.com.ar> <CALx6S34uCrA1QdvLV8fpRKaJGLWMgtCmBCnrsBjU3TS+kXUs3Q@mail.gmail.com>
To: Tom Herbert <tom@herbertland.com>
X-Mailer: Apple Mail (2.3654.20.0.2.21)
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/C5XIqni6Zn6sCOwJZB4ms-Tlzuk>
Subject: Re: [tcpm] [v6ops] Flow Label Load Balancing


> On 28. Nov 2020, at 00:31, Tom Herbert <tom@herbertland.com> wrote:
> 
> On Fri, Nov 27, 2020 at 1:41 PM Fernando Gont <fernando@gont.com.ar> wrote:
>> 
>> Hello, Tom,
>> 
>> On 25/11/20 16:35, Tom Herbert wrote:
>>> Hi Fernando, comments in line...
>>> 
>>> On Wed, Nov 25, 2020 at 12:13 AM Fernando Gont <fernando@gont.com.ar> wrote:
>>>> 
>>>> Hi, Tom,
>>>> 
>>>> On 24/11/20 16:43, Tom Herbert wrote:
>>>> [....]
>>>>> Modulating the flow label is a means to affect the routing of packets
>>>>> through the network that uses flow labels as input to the ECMP hash.
>>>> 
>>>> What's the point?
>>>> 
>>>> 1) You cannot tell *if* the FL is being used.
>>>> 
>>> Generally true, but in a limited domain this information could be
>>> discerned. I'd note that it's also generally true that we don't know
>>> if there is a load balancer or stateful firewall in the path that
>>> requires consistent routing, but in a limited domain we could know
>>> that also.
>> 
>> Exactly. So my take is that the drawbacks for the general case outweigh
>> the benefits of the specific case.
>> 
>> 
>> 
>>>> 2) Changing the FL does not necessarily mean that packets will employ a
>>>> different link.
>>> 
>>> It's an opportunistic mechanism. If a connection is failing and we get
>>> a better path that fixes it by simply changing the flow label then
>>> what's the harm?
>> 
>> Complexity, and the possible negative implications for the general case.
>> 
>> That said, why don't you fix the problem where you should (i.e., routing)?
>> 
>> 
>> 
>>>> 3) If the network is failing, shouldn't you handle this via routing?
>>>> 
>>> Sure, but then that requires an out of band feedback loop from a TCP
>>> implementation to the network infrastructure to indicate there is a
>>> problem and then the network needs to respond.
>> 
>> Not really. TCP does not deal with paths (nor should it). If the network
>> detects a link is failing, it should route around the problem.
>> 
> 
> Fernando,
> 
> All we know in this case is that *TCP* has detected a connection is
> failing and there may be a potential problem with a path, we don't
> know that the network has detected any problems like some link is
> failing or will ever detect any problem. So in such a scenario, what
> recourse does the host have to try to salvage its connections? If the
> answer is that the host should always patiently wait for the network
> to figure out what is happening, then that's going to be a hard sell
> to application and host stack developers (remember the network is a
> black box from their perspective that they really don't trust). If the
> answer is that the host is supposed to inform the network of issues
> with its connections and then host and network work together to solve
> them, then that's great; but then what is the general protocol that has
> been established to facilitate that and solved the problem of how hosts
> and network work together in concert to solve users' problems?

Hi Tom,

It is perfectly fine that if a TCP connection fails and the application
triggers the setup of a new TCP connection, the new TCP connection
uses a flow label different from the one used for the old TCP connection.
However, a single RTO does not indicate a connection failure.
Normally you would need several consecutive RTOs to happen before TCP
declares the connection failed.

Best regards
Michael

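
On the point that a single RTO is not yet a connection failure, the
distinction can be sketched roughly as follows. This is only an
illustration under stated assumptions: the class and function names and
the threshold of 5 consecutive RTOs are hypothetical and not taken from
any TCP stack. The only fixed facts used are that the IPv6 flow label is
a 20-bit field and that zero means "unlabeled".

```python
import secrets

FLOW_LABEL_BITS = 20        # width of the IPv6 flow label field
MAX_CONSECUTIVE_RTOS = 5    # hypothetical threshold, chosen for illustration


def new_flow_label() -> int:
    """Pick a non-zero 20-bit flow label uniformly at random."""
    return secrets.randbelow((1 << FLOW_LABEL_BITS) - 1) + 1


class Connection:
    """Toy model of a TCP connection; not a real stack."""

    def __init__(self) -> None:
        self.flow_label = new_flow_label()
        self.consecutive_rtos = 0
        self.failed = False

    def on_ack(self) -> None:
        # Forward progress resets the count of consecutive timeouts.
        self.consecutive_rtos = 0

    def on_rto(self) -> None:
        # A single RTO leaves the flow label untouched; the connection is
        # declared failed only after several RTOs in a row.
        self.consecutive_rtos += 1
        if self.consecutive_rtos >= MAX_CONSECUTIVE_RTOS:
            self.failed = True


def reconnect(old: Connection) -> Connection:
    """Application-triggered new connection with a different flow label."""
    if not old.failed:
        raise ValueError("old connection has not failed")
    conn = Connection()
    while conn.flow_label == old.flow_label:
        conn.flow_label = new_flow_label()
    return conn
```

In this sketch the flow label stays constant for the whole life of a
connection, and only a new connection set up after a declared failure
gets a different label.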
> 
> Tom
> 
>> 
>> 
>>> That's significant
>>> infrastructure and higher reaction time than doing something in TCP
>>> and IP. Think of modulating the flow label as an inexpensive form of
>>> source routing within a limited domain that doesn't need any
>>> infrastructure or heavyweight protocols or something like segment
>>> routing.
>> 
>> Downsides:
>> 
>> * It breaks load balancing
>> * It depends on the source doing something -- but you don't necessarily
>> control the source -- and you cannot tell if the source implements it
>> * It doesn't solve the problem for nodes that do not behave as expected
>> 
>> My take is that if you really care about this, you're already (or
>> should be!) solving the problem in a different (and more general) way.
>> 
>> 
>> 
>>>>> The basic idea is that the flow label associated with a connection is
>>>>> randomly changed when the stack observes that the connection is
>>>>> failing (e.g., on an RTO). There is nothing in the specs that prevents
>>>>> this since the source is at liberty to set the flow label as it sees
>>>>> fit.
>>>> 
>>>> The FL is expected to remain constant for the life of a flow. A
>>>> retransmitted packet is part of the same flow as the
>>>> originally-transmitted packet. So this seems to be contradicting the
>>>> very specification of the FL.
>>>> 
>>>> For instance, if an RTO for a flow causes the FL to change, then one may
>>>> possibly argue that the FL is not naming/labeling what it is said/expected
>>>> to be naming/labeling.
>>> 
>>> Specifically, RFC6437 states:
>>> 
>>> "It is therefore RECOMMENDED that source hosts support the flow label
>>> by setting the flow label field for all packets of a given flow to the
>>> same value chosen from an approximation to a discrete uniform
>>> distribution."
>>> 
>>> So that is clearly just a recommendation, and not a requirement (and
>>> definitely not a MUST). Furthermore, RFC6437 states:
>>> 
>>> "A forwarding node MUST either leave a non-zero flow label value
>>> unchanged or change it only for compelling operational security
>>> reasons as described in Section 6.1."
>> 
>> My concern is that we may leave the FL unusable.
>> 
>> 
>> 
>> 
>>> So there's no guarantee in the protocol specs that flow labels are
>>> consistent for the life of the connection, which means that the
>>> network cannot assume that and thus it would be incorrect if the
>>> network tried to enforce flow label consistency as a protocol
>>> requirement.
>> 
>> You're probably right in that respect. And, in retrospect, I think it
>> would have been better if we had specified the FL to be immutable.
>> 
>> 
>> 
>>> As I said, it is prudent to try to be consistent with
>>> flow labels and the default behavior in Linux should be changed,
>> 
>> Sorry: is the current Linux behavior to modulate the FL?
>> 
>> 
>>> however I do not believe there's a valid claim of non-conformance that
>>> motivates removal of the feature that is already deployed.
>> 
>> I personally consider changing the FL for a flow to be a bug. That
>> said, if the default setting is the correct/compliant behavior, at the
>> end of the day we're all adults, and whoever overrides the default
>> behavior is supposed to know what he/she does -- and if they don't, they
>> get what they deserve  ;-)
>> 
>> So, in that light, I wouldn't push to remove the optional behavior,
>> *provided* the default behavior is correct.
>> 
>> Thanks,
>> --
>> Fernando Gont
>> e-mail: fernando@gont.com.ar || fgont@si6networks.com
>> PGP Fingerprint: 7809 84F5 322E 45C7 F1C9 3945 96EE A9EF D076 FFF1
>> 
>> 
>> 
> 