Re: [v6ops] Flow Label Load Balancing

Alexander Azimov <a.e.azimov@gmail.com> Thu, 26 November 2020 09:34 UTC

From: Alexander Azimov <a.e.azimov@gmail.com>
Date: Thu, 26 Nov 2020 12:34:04 +0300
Message-ID: <CAEGSd=BGqFTygTiAt1v-71W3RTpdyVqyYzD1vi9uKebPPMoE5Q@mail.gmail.com>
To: Tom Herbert <tom@herbertland.com>
Cc: Brian E Carpenter <brian.e.carpenter@gmail.com>, tcpm <tcpm@ietf.org>, IPv6 Operations <v6ops@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/v6ops/sUrSrp2mPGQBO_GMAz8sQFA5OKo>
Subject: Re: [v6ops] Flow Label Load Balancing

Dear colleagues,

We started discussing an incorrect default behavior, and Tom has already
confirmed that it will be fixed.

Later the thread turned into an argument about whether the flow label can be
changed during the connection/flow lifetime according to the current RFCs,
though these documents can be updated. This looks a bit odd to me, because I
always thought that it is the IETF community's responsibility to document
proper solutions. If something undocumented but worthwhile happens in the
industry, the IETF should catch up. So I would like to get back to
discussing the reasons for changing the flow label and whether doing so is safe.

The general idea of changing the routing path upon a network
outage or degradation looks obvious. Getting a kind of source-based routing
can significantly reduce the reaction time and improve the end-user
experience. The flow label is a perfect match here: it is transparent to the
application, set by the source, and not part of the 5-tuple, yet it can be
used in load balancing. IMO this use is poorly documented. I would like your
feedback on the following wording:

Let us say that the flow label is a hash of the values from the IP packet's
5-tuple and a random number. Then:

1. In the case of a SYN or SYN-ACK retransmission, the flow label SHOULD be
recalculated.
2. In the case of RTO expiration in an established TCP session, the
flow label MAY be recalculated. This behavior MUST be switched off by
default.
3. Otherwise, the flow label SHOULD be preserved unchanged.

The first point is about redirecting a connection away from a degraded path
before the connection is established, and it looks safe. The second one can
also improve performance if it is enabled on the server side. Please comment if
you see security flaws in such a design.
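
To make the three rules above concrete, here is a minimal sketch of how a
host stack might apply them. It is an illustration under stated assumptions,
not the Linux implementation: the names Connection, Event,
compute_flow_label and rehash_on_rto are hypothetical, SHA-256 is an
arbitrary choice of hash, and the SHOULD in rule 1 is modelled as "always".

```python
import hashlib
import secrets
from dataclasses import dataclass, field
from enum import Enum, auto

FLOW_LABEL_BITS = 20  # width of the IPv6 flow label field


class Event(Enum):  # hypothetical event names, for illustration only
    SYN_RETRANSMIT = auto()   # SYN or SYN-ACK retransmission (rule 1)
    RTO_ESTABLISHED = auto()  # RTO expiry on an established session (rule 2)
    OTHER = auto()            # everything else (rule 3)


def compute_flow_label(five_tuple: tuple, nonce: bytes) -> int:
    """Hash the 5-tuple plus a random number into a non-zero 20-bit label."""
    digest = hashlib.sha256(repr(five_tuple).encode() + nonce).digest()
    label = int.from_bytes(digest[:4], "big") & ((1 << FLOW_LABEL_BITS) - 1)
    return label or 1  # zero means "not labelled", so avoid it


@dataclass
class Connection:
    five_tuple: tuple
    rehash_on_rto: bool = False  # rule 2: switched off by default
    nonce: bytes = field(default_factory=lambda: secrets.token_bytes(8))
    flow_label: int = field(init=False)

    def __post_init__(self) -> None:
        self.flow_label = compute_flow_label(self.five_tuple, self.nonce)

    def on_event(self, event: Event) -> int:
        """Return the flow label to use after the given event."""
        rehash = event is Event.SYN_RETRANSMIT or (
            event is Event.RTO_ESTABLISHED and self.rehash_on_rto
        )
        if rehash:
            old = self.flow_label
            # Draw fresh random numbers until the label actually changes.
            while self.flow_label == old:
                self.nonce = secrets.token_bytes(8)
                self.flow_label = compute_flow_label(self.five_tuple, self.nonce)
        return self.flow_label  # rule 3: otherwise unchanged
```

With the defaults, Connection(...).on_event(Event.RTO_ESTABLISHED) returns
the original label; a server that opts in with rehash_on_rto=True gets a
new one. Whether a new label actually moves the flow to a different path
still depends on the network using the flow label in its ECMP hash.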

On Thu, 26 Nov 2020 at 03:16, Tom Herbert <tom@herbertland.com> wrote:

> On Wed, Nov 25, 2020 at 3:33 PM Brian E Carpenter
> <brian.e.carpenter@gmail.com> wrote:
> >
> > I'm not Joel, but I did once spend some time grepping RFCs to find out
> whether
> > "flow" or "microflow" was the preferred term. In RFC2474, which is
> normative,
> > we have:
> >
> >    Microflow: a single instance of an application-to-application flow of
> >    packets which is identified by source address, destination address,
> >    protocol id, and source port, destination port (where applicable).
> >
> > But in the flow label work we explicitly avoided being that precise, and
> > did not use the term "microflow". There might be some load balancing
> > scenarios where you want a broader definition, even including
> bidirectional
> > flows. There are expired drafts on that topic:
> > draft-tarreau-extend-flow-label-balancing
> > draft-wang-6man-flow-label-reflection
> >
> Brian,
>
> Random Packet Spraying
> (
> https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.297.529&rep=rep1&type=pdf
> )
> is an interesting idea where packets for a single connection are
> purposely distributed across multiple paths for load distribution. Per
> packet randomized flow labels with flow label aware ECMP makes this
> quite easy to do without requiring any special support in switches
> like you'd need with IPv4. I'm not necessarily advocating this, but it
> does highlight one potential use case of having a flow label that
> doesn't have rigidly defined requirements on the host.
>
> Tom
>
> > Regards
> >    Brian
> >
> > On 26-Nov-20 11:13, Tom Herbert wrote:
> > > Joel, is there a normative definition of a flow?
> > >
> > > On Wed, Nov 25, 2020, 1:27 PM Joel M. Halpern <jmh@joelhalpern.com
> <mailto:jmh@joelhalpern.com>> wrote:
> > >
> > >     No, as Brian says, there are escape clauses in the flow
> definitions.
> > >
> > >     But changing the flow label due to traffic problems does not
> correspond
> > >     to packets being in actually different flows.
> > >     If one were using UDP, and mixing loss sensitive packets with loss
> > >     insensitive packets for a special application, sure, one could use
> two
> > >     flow labels.  But that is not what you are describing.
> > >
> > >     Yours,
> > >     Joel
> > >
> > >     On 11/25/2020 3:05 PM, Tom Herbert wrote:
> > >     > Joel,
> > >     >
> > >     > Is there an RFC that clearly and unambiguously states that a
> host MUST
> > >     > use the same flow label for the lifetime _and_ clearly defines
> exactly
> > >     > what a flow is with respect to such a requirement (for instance,
> how
> > >     > would you define a flow and enforce such a requirement in UDP?
> IPsec?
> > >     > other encapsulations?). If there is such a requirement then we'll
> > >     > change the code to be conformant.
> > >     >
> > >     > Tom
> > >     >
> > >     > On Wed, Nov 25, 2020 at 12:39 PM Joel M. Halpern <
> jmh@joelhalpern.com <mailto:jmh@joelhalpern.com>> wrote:
> > >     >>
> > >     >> This kind of thing is why, as I understand it, MPTCP has
> discovery
> > >     >> mechanisms to know if both sides use it, and can select
> alternative
> > >     >> addresses for communication.
> > >     >>
> > >     >> Trying to guess flow labels that might avoid a problem because
> it might
> > >     >> be an ECMP problem, is just flailing about.  Not a good design
> for
> > >     >> operational protocols.
> > >     >>
> > >     >> And in general, designing protocols around "I know exactly what
> is going
> > >     >> on"  (the requirement for what you describe that goes well
> beyond just
> > >     >> "limited domains") is also a recipe for failure.
> > >     >>
> > >     >> The Flow Label RFCs are actually very explicit that a flow
> label is
> > >     >> supposed to be stable for the life of the flow.  Otherwise, it
> isn't a
> > >     >> flow label.
> > >     >>
> > >     >> Yours,
> > >     >> Joel
> > >     >>
> > >     >> On 11/25/2020 2:35 PM, Tom Herbert wrote:
> > >     >>> Hi Fernando, comments in line...
> > >     >>>
> > >     >>> On Wed, Nov 25, 2020 at 12:13 AM Fernando Gont <
> fernando@gont.com.ar <mailto:fernando@gont.com.ar>> wrote:
> > >     >>>>
> > >     >>>> Hi, Tom,
> > >     >>>>
> > >     >>>> On 24/11/20 16:43, Tom Herbert wrote:
> > >     >>>> [....]
> > >     >>>>> Modulating the flow label is a means to affect the routing
> of packets
> > >     >>>>> through the network that uses flow labels as input to the
> ECMP hash.
> > >     >>>>
> > >     >>>> What's the point?
> > >     >>>>
> > >     >>>> 1) You cannot tell *if* the FL is being used.
> > >     >>>>
> > >     >>> Generally true, but in a limited domain this information could
> be
> > >     >>> discerned. I'd note that it's also generally true that we
> don't know
> > >     >>> if there is a load balancer or stateful firewall in the path
> that
> > >     >>> requires consistent routing, but in a limited domain we could
> know
> > >     >>> that also.
> > >     >>>
> > >     >>>> 2) Changing the FL does not necessarily mean that packets
> will employ a
> > >     >>>> different link.
> > >     >>>
> > >     >>> It's an opportunistic mechanism. If a connection is failing
> and we get
> > >     >>> a better path that fixes it by simply changing the flow label
> then
> > >     >>> what's the harm?
> > >     >>>
> > >     >>>>
> > >     >>>> 3) If the network is failing, shouldn't you handle this via
> routing?
> > >     >>>>
> > >     >>> Sure, but then that requires an out of band feedback loop from
> a TCP
> > >     >>> implementation to the network infrastructure to indicate there
> is a
> > >     >>> problem and then the network needs to respond. That's
> significant
> > >     >>> infrastructure and higher reaction time than doing something
> in TCP
> > >     >>> and IP. Think of modulating the flow label as an inexpensive
> form of
> > >     >>> source routing within a limited domain that doesn't need any
> > >     >>> infrastructure or heavyweight protocols or something like
> segment
> > >     >>> routing.
> > >     >>>
> > >     >>>>
> > >     >>>>
> > >     >>>>> The basic idea is that the flow label associated with a
> connection is
> > >     >>>>> randomly changed when the stack observes that the connection
> is
> > >     >>>>> failing (e.g. on an RTO). There is nothing in the specs
> that prevents
> > >     >>>>> this since the source is at liberty to set the flow label as
> it sees
> > >     >>>>> fit.
> > >     >>>>
> > >     >>>> The FL is expected to remain constant for the life of a flow.
> A
> > >     >>>> retransmitted packet is part of the same flow as the
> > >     >>>> originally-transmitted packet. So this seems to be
> contradicting the
> > >     >>>> very specification of the FL.
> > >     >>>>
> > >     >>>> For instance, If a RTO for a flow causes the FL to change,
> then one may
> > >     >>>> possibly argue that the FL is not naming/labeling what is
> said/expected
> > >     >>>> to be naming/labeling.
> > >     >>>
> > >     >>> Specifically, RFC6437 states:
> > >     >>>
> > >     >>> "It is therefore RECOMMENDED that source hosts support the
> flow label
> > >     >>> by setting the flow label field for all packets of a given
> flow to the
> > >     >>> same value chosen from an approximation to a discrete uniform
> > >     >>> distribution."
> > >     >>>
> > >     >>> So that is clearly just a recommendation, and not a
> requirement (and
> > >     >>> definitely not a MUST). Furthermore, RFC6437 states:
> > >     >>>
> > >     >>> "A forwarding node MUST either leave a non-zero flow label
> value
> > >     >>> unchanged or change it only for compelling operational security
> > >     >>> reasons as described in Section 6.1."
> > >     >>>
> > >     >>> So there's no guarantee in the protocol specs that flow labels
> are
> > >     >>> consistent for the life of the connection, which means that the
> > >     >>> network cannot assume that and thus it would be incorrect if
> the
> > >     >>> network tried to enforce flow label consistency as a protocol
> > >     >>> requirement. As I said, it is prudent to try to be consistent
> with
> > >     >>> flow labels and the default behavior in Linux should be
> changed,
> > >     >>> however I do not believe there's a valid claim of
> non-conformance that
> > >     >>> motivates removal of the feature that is already deployed.
> > >     >>>
> > >     >>> Tom
> > >     >>>
> > >     >>>
> > >     >>>
> > >     >>>
> > >     >>>>
> > >     >>>>
> > >     >>>>
> > >     >>>>> The feature is useful in large datacenter networks, like
> > >     >>>>> apparently Facebook where the patches originate, since
> information
> > >     >>>>> discerned by TCP can opportunistically be applied to route
> selection.
> > >     >>>>> The practical issue is that there are stateful devices like
> firewalls
> > >     >>>>> that require consistent routing in the network in which case
> changing
> > >     >>>>> the flow label can confuse them. As I mentioned, the
> original intent
> > >     >>>>> was that the flow label randomization feature should be
> opt-in instead
> > >     >>>>> of on by default.
> > >     >>>>
> > >     >>>> So... where is the "source" of the packet that would be
> "modulating" the FL?
> > >     >>>>
> > >     >>>> Thanks,
> > >     >>>> --
> > >     >>>> Fernando Gont
> > >     >>>> e-mail: fernando@gont.com.ar <mailto:fernando@gont.com.ar>
> || fgont@si6networks.com <mailto:fgont@si6networks.com>
> > >     >>>> PGP Fingerprint: 7809 84F5 322E 45C7 F1C9 3945 96EE A9EF D076
> FFF1
> > >     >>>>
> > >     >>>>
> > >     >>>>
> > >     >>>
> > >     >>> _______________________________________________
> > >     >>> v6ops mailing list
> > >     >>> v6ops@ietf.org <mailto:v6ops@ietf.org>
> > >     >>> https://www.ietf.org/mailman/listinfo/v6ops
> > >     >>>
> > >
> > >
> > >
> >
>
>


-- 
Best regards,
Alexander Azimov