Re: [v6ops] Flow Label Load Balancing

Tom Herbert <tom@herbertland.com> Fri, 27 November 2020 23:31 UTC

References: <CAEGSd=DY8t8Skor+b6LSopzecoUUzUZhti9s0kdooLZGxPEt+w@mail.gmail.com> <d29042a7-742b-a445-cf60-2773e5515ae5@gont.com.ar> <CALx6S37+1duoNGR3dZWesHsZvx15kX9wCWufPMh=esvMaSMF_g@mail.gmail.com> <63e7aad3-7094-7492-dbe4-3eefb5236de3@gont.com.ar> <CALx6S37t4jump6S-R5_xdo5DF+RnHtT4rU5-RuiC-2GQ0PXxkQ@mail.gmail.com> <96b6d04b-e5bb-ba79-0281-e9599109be95@gont.com.ar>
In-Reply-To: <96b6d04b-e5bb-ba79-0281-e9599109be95@gont.com.ar>
From: Tom Herbert <tom@herbertland.com>
Date: Fri, 27 Nov 2020 16:31:38 -0700
Message-ID: <CALx6S34uCrA1QdvLV8fpRKaJGLWMgtCmBCnrsBjU3TS+kXUs3Q@mail.gmail.com>
To: Fernando Gont <fernando@gont.com.ar>
Cc: Alexander Azimov <a.e.azimov@gmail.com>, tcpm <tcpm@ietf.org>, IPv6 Operations <v6ops@ietf.org>
Content-Type: text/plain; charset="UTF-8"
Archived-At: <https://mailarchive.ietf.org/arch/msg/v6ops/LrnjJj0CzufvQqgdNcefSSu4QF0>
Subject: Re: [v6ops] Flow Label Load Balancing
List-Id: v6ops discussion list <v6ops.ietf.org>

On Fri, Nov 27, 2020 at 1:41 PM Fernando Gont <fernando@gont.com.ar> wrote:
>
> Hello, Tom,
>
> On 25/11/20 16:35, Tom Herbert wrote:
> > Hi Fernando, comments in line...
> >
> > On Wed, Nov 25, 2020 at 12:13 AM Fernando Gont <fernando@gont.com.ar> wrote:
> >>
> >> Hi, Tom,
> >>
> >> On 24/11/20 16:43, Tom Herbert wrote:
> >> [....]
> >>> Modulating the flow label is a means to affect the routing of packets
> >>> through a network that uses flow labels as input to the ECMP hash.
> >>
> >> What's the point?
> >>
> >> 1) You cannot tell *if* the FL is being used.
> >>
> > Generally true, but in a limited domain this information could be
> > discerned. I'd note that it's also generally true that we don't know
> > if there is a load balancer or stateful firewall in the path that
> > requires consistent routing, but in a limited domain we could know
> > that also.
>
> Exactly. So my take is that the drawbacks for the general case outweigh
> the benefits of the specific case.
>
>
>
> >> 2) Changing the FL does not necessarily mean that packets will employ a
> >> different link.
> >
> > It's an opportunistic mechanism. If a connection is failing and we get
> > a better path that fixes it by simply changing the flow label then
> > what's the harm?
>
> Complexity, and the possible negative implications for the general case.
>
> That said, why don't you fix the problem where you should (i.e., routing)?
>
>
>
> >> 3) If the network is failing, shouldn't you handle this via routing?
> >>
> > Sure, but then that requires an out of band feedback loop from a TCP
> > implementation to the network infrastructure to indicate there is a
> > problem and then the network needs to respond.
>
> Not really. TCP does not deal with paths (nor should it). If the network
> detects a link is failing, it should route around the problem.
>

Fernando,

All we know in this case is that *TCP* has detected that a connection
is failing and that there may be a problem with the path. We don't
know that the network has detected any problem, such as a failing
link, or that it ever will. In such a scenario, what recourse does the
host have to try to salvage its connections? If the answer is that the
host should always wait patiently for the network to figure out what
is happening, then that's going to be a hard sell to application and
host stack developers (remember, from their perspective the network is
a black box that they really don't trust). If the answer is that the
host is supposed to inform the network of issues with its connections,
and then host and network work together to solve them, then that's
great; but then what is the general protocol that has been established
to facilitate that, and that has solved the problem of how hosts and
network work in concert to solve users' problems?
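
To make the mechanism under discussion concrete, here is a minimal
Python sketch. It is an illustration only, not how Linux or any real
stack or router implements this. The names new_flow_label, ecmp_link
and Connection are hypothetical, and the SHA-256 based hash merely
stands in for whatever ECMP hash a given network uses. It assumes the
network hashes on source address, destination address and flow label.

```python
import hashlib
import secrets

FLOW_LABEL_BITS = 20  # the IPv6 flow label field is 20 bits wide


def new_flow_label() -> int:
    """Draw a non-zero 20-bit label, approximately uniformly."""
    while True:
        label = secrets.randbits(FLOW_LABEL_BITS)
        if label != 0:  # zero means "no flow label set"
            return label


def ecmp_link(src: str, dst: str, flow_label: int, n_links: int) -> int:
    """Hypothetical ECMP choice: hash (src, dst, flow label) onto a link."""
    key = f"{src}|{dst}|{flow_label:05x}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_links


class Connection:
    """Hypothetical per-connection state holding the current flow label."""

    def __init__(self, src: str, dst: str) -> None:
        self.src = src
        self.dst = dst
        self.flow_label = new_flow_label()

    def on_rto(self) -> None:
        """On a retransmission timeout, pick a different random label."""
        old = self.flow_label
        while self.flow_label == old:
            self.flow_label = new_flow_label()


if __name__ == "__main__":
    conn = Connection("2001:db8::1", "2001:db8::2")
    before = ecmp_link(conn.src, conn.dst, conn.flow_label, n_links=4)
    conn.on_rto()
    after = ecmp_link(conn.src, conn.dst, conn.flow_label, n_links=4)
    # The new label may still hash to the same link.
    print(f"link before RTO: {before}, link after RTO: {after}")
```

With n_links equal-cost links and a well-mixed hash, a new label lands
on the same link roughly one time in n_links, which is why this is
opportunistic rather than a guarantee of a different path.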

Tom

>
>
> > That's significant
> > infrastructure and higher reaction time than doing something in TCP
> > and IP. Think of modulating the flow label as an inexpensive form of
> > source routing within a limited domain that doesn't need any
> > infrastructure, heavyweight protocols, or something like segment
> > routing.
>
> Downsides:
>
> * It breaks load balancing
> * It depends on the source doing something -- but you don't necessarily
> control the source -- and you cannot tell if the source implements it
> * It doesn't solve the problem for nodes that do not behave as expected
>
> My take is that if you really care about this, you already are (or
> should be!) solving the problem in a different (and more general) way.
>
>
>
> >>> The basic idea is that the flow label associated with a connection is
> >>> randomly changed when the stack observes that the connection is
> >>> failing (e.g., on an RTO). There is nothing in the specs that prevents
> >>> this since the source is at liberty to set the flow label as it sees
> >>> fit.
> >>
> >> The FL is expected to remain constant for the life of a flow. A
> >> retransmitted packet is part of the same flow as the
> >> originally-transmitted packet. So this seems to be contradicting the
> >> very specification of the FL.
> >>
> >> For instance, if an RTO for a flow causes the FL to change, then one may
> >> possibly argue that the FL is not naming/labeling what it is said/expected
> >> to be naming/labeling.
> >
> > Specifically, RFC6437 states:
> >
> > "It is therefore RECOMMENDED that source hosts support the flow label
> > by setting the flow label field for all packets of a given flow to the
> > same value chosen from an approximation to a discrete uniform
> > distribution."
> >
> > So that is clearly just a recommendation, and not a requirement (and
> > definitely not a MUST). Furthermore, RFC6437 states:
> >
> > "A forwarding node MUST either leave a non-zero flow label value
> > unchanged or change it only for compelling operational security
> > reasons as described in Section 6.1."
>
> My concern is that we may leave the FL unusable.
>
>
>
>
> > So there's no guarantee in the protocol specs that flow labels are
> > consistent for the life of the connection, which means that the
> > network cannot assume that and thus it would be incorrect if the
> > network tried to enforce flow label consistency as a protocol
> > requirement.
>
> You're probably right in that respect. And, in retrospect, I think it
> would have been better if we had specified the FL to be immutable.
>
>
>
> > As I said, it is prudent to try to be consistent with
> > flow labels and the default behavior in Linux should be changed,
>
> Sorry: is the current Linux behavior to modulate the FL?
>
>
> > however I do not believe there's a valid claim of non-conformance that
> > motivates removal of the feature that is already deployed.
>
> I personally consider changing the FL for a flow to be a bug. That
> said, if the default setting is the correct/compliant behavior, at the
> end of the day we're all adults, and whoever overrides the default
> behavior is supposed to know what he/she does -- and if they don't, they
> get what they deserve  ;-)
>
> So, in that light, I wouldn't push to remove the optional behavior,
> *provided* the default behavior is correct.
>
> Thanks,
> --
> Fernando Gont
> e-mail: fernando@gont.com.ar || fgont@si6networks.com
> PGP Fingerprint: 7809 84F5 322E 45C7 F1C9 3945 96EE A9EF D076 FFF1
>
>
>