Re: [v6ops] [tcpm] Flow Label Load Balancing

Yuchung Cheng <ycheng@google.com> Thu, 26 November 2020 16:39 UTC

From: Yuchung Cheng <ycheng@google.com>
Date: Thu, 26 Nov 2020 08:38:57 -0800
Message-ID: <CAK6E8=eB-aCxehme5z9zvvx68dZKgSp=P_psh=78Dy6PBAjq6Q@mail.gmail.com>
To: Alexander Azimov <a.e.azimov@gmail.com>
Cc: Tom Herbert <tom@herbertland.com>, IPv6 Operations <v6ops@ietf.org>, tcpm <tcpm@ietf.org>, Brian E Carpenter <brian.e.carpenter@gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/v6ops/qfARpZk2vjfRG3skvVXaBjBfdBQ>
Subject: Re: [v6ops] [tcpm] Flow Label Load Balancing

On Thu, Nov 26, 2020 at 1:34 AM Alexander Azimov <a.e.azimov@gmail.com>
wrote:

> Dear colleagues,
>
> We started discussing an incorrect default behavior, and Tom has already
> confirmed that it will be fixed.
>
> Later the thread turned into an argument about whether the flow label can be
> changed during the lifetime of a connection/flow according to the current RFCs,
> though these documents can be updated. This looks a bit odd to me because I
> always thought that it is the IETF community's responsibility to document
> proper solutions. If something undocumented but worthwhile happens in the
> industry, the IETF should catch up. So, I would like to get back to the
> discussion of the reasons and their safety.
>
> The general idea of changing the routing path upon network
> outage/degradation looks obvious. Getting a kind of source-based routing can
> significantly reduce the reaction time and improve end-user experience. The
> flow label is a perfect match here: it is transparent to the application, set
> by the source, and not part of the 5-tuple, yet it can be used in load
> balancing. IMO it is poorly documented. I would like your feedback on
> the following wording:
>
> Let us say that the flow label is a hash of values from the IP packet's
> 5-tuple and a random number. Then:
>
> 1. In the case of a SYN or SYN-ACK retransmission, the flow label SHOULD be
> recalculated.
> 2. In the case of RTO expiration in an established TCP session,
> the flow label MAY be recalculated. This setting MUST be switched off by
> default.
> 3. Otherwise, the flow label SHOULD be preserved unchanged.
>
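
To make the proposed wording concrete, here is a minimal sketch in Python of
rules 1-3. It is an illustration under stated assumptions, not how any real
stack implements this: the Connection class, the rehash_on_rto flag, the
8-byte nonce and the use of SHA-256 are all hypothetical choices made for the
example. Only the 20-bit width of the flow label field comes from the IPv6
specification.

```python
import hashlib
import os

FLOW_LABEL_BITS = 20  # the IPv6 flow label field is 20 bits wide


def make_flow_label(five_tuple, nonce):
    """Hash the 5-tuple together with a random nonce into a 20-bit label."""
    material = repr(five_tuple).encode() + nonce
    digest = hashlib.sha256(material).digest()
    return int.from_bytes(digest[:4], "big") & ((1 << FLOW_LABEL_BITS) - 1)


class Connection:
    """Hypothetical per-connection state; not modelled on any real stack."""

    def __init__(self, five_tuple, rehash_on_rto=False):
        self.five_tuple = five_tuple
        self.rehash_on_rto = rehash_on_rto  # rule 2: switched off by default
        self.established = False
        self.nonce = os.urandom(8)
        self.flow_label = make_flow_label(five_tuple, self.nonce)

    def _rehash(self):
        # Draw a fresh random number and recompute the label from it.
        self.nonce = os.urandom(8)
        self.flow_label = make_flow_label(self.five_tuple, self.nonce)

    def on_retransmit_timeout(self):
        if not self.established:
            self._rehash()  # rule 1: SYN or SYN-ACK retransmission
        elif self.rehash_on_rto:
            self._rehash()  # rule 2: opt-in rehash after an RTO
        # rule 3: otherwise the label is left unchanged
```

The opt-in flag defaults to False, which corresponds to the MUST in rule 2; an
established connection keeps its label across timeouts unless the flag is set.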

For completeness, we might also want to discuss reverse-path failures, which
the RTO-triggered rehash in (1) and (2) alone cannot repair, in either the
established or the handshake phase.

https://patchwork.ozlabs.org/project/netdev/patch/20180829215356.235336-1-ycheng@google.com/
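
To illustrate the reverse-path case, here is a toy simulation in Python.
Everything in it is an assumption made for the example: the modulo-based
ecmp_path() stands in for a real ECMP hash, each endpoint is assumed to choose
the flow label only for the packets it sends, and the receiver-side trigger
(rehash after seeing repeated duplicate data) is one possible approach rather
than a description of the patch linked above.

```python
import random

N_PATHS = 4  # assumed number of equal-cost paths in each direction


def ecmp_path(flow_label, n_paths=N_PATHS):
    """Toy ECMP: pick a path from the flow label alone.

    Real routers also hash addresses and ports; those are omitted here
    because they stay constant for the life of the connection.
    """
    return flow_label % n_paths


class Endpoint:
    """Hypothetical host that labels only the packets it sends."""

    def __init__(self, rng):
        self.rng = rng
        self.flow_label = rng.getrandbits(20)

    def rehash(self):
        self.flow_label = self.rng.getrandbits(20)


def round_trip_ok(sender, receiver, broken_fwd, broken_rev):
    """Data needs a working forward path and its ACK a working reverse path."""
    return (ecmp_path(sender.flow_label) not in broken_fwd
            and ecmp_path(receiver.flow_label) not in broken_rev)


def demo(seed=1):
    rng = random.Random(seed)
    sender, receiver = Endpoint(rng), Endpoint(rng)
    # Break exactly the reverse path that the receiver's ACKs currently use.
    broken_rev = {ecmp_path(receiver.flow_label)}

    # Sender-side RTO rehash only moves the forward path, so it cannot help.
    sender_fixed = False
    for _ in range(100):
        sender.rehash()
        if round_trip_ok(sender, receiver, set(), broken_rev):
            sender_fixed = True

    # Receiver-side rehash (assumed trigger: repeated duplicate data).
    while not round_trip_ok(sender, receiver, set(), broken_rev):
        receiver.rehash()
    return sender_fixed, round_trip_ok(sender, receiver, set(), broken_rev)
```

In this model demo() reports that no amount of sender-side rehashing restores
the round trip, while the receiver changing its own label does.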

I also agree with Tom that the general feature is very useful for keeping
TCP connections alive through partial route failures, but it should be
opt-in rather than on by default.


>
> The first point covers redirecting a connection away from a degraded path
> before the connection is established, and it looks safe. The second one can
> also improve performance if it is enabled on the server side. Please comment if
> you see security flaws in such a design.
>
> On Thu, 26 Nov 2020 at 03:16, Tom Herbert <tom@herbertland.com> wrote:
>
>> On Wed, Nov 25, 2020 at 3:33 PM Brian E Carpenter
>> <brian.e.carpenter@gmail.com> wrote:
>> >
>> > I'm not Joel, but I did once spend some time grepping RFCs to find out
>> whether
>> > "flow" or "microflow" was the preferred term. In RFC2474, which is
>> normative,
>> > we have:
>> >
>> >    Microflow: a single instance of an application-to-application flow of
>> >    packets which is identified by source address, destination address,
>> >    protocol id, and source port, destination port (where applicable).
>> >
>> > But in the flow label work we explicitly avoided being that precise, and
>> > did not use the term "microflow". There might be some load balancing
>> > scenarios where you want a broader definition, even including
>> bidirectional
>> > flows. There are expired drafts on that topic:
>> > draft-tarreau-extend-flow-label-balancing
>> > draft-wang-6man-flow-label-reflection
>> >
>> Brian,
>>
>> Random Packet Spraying
>> (
>> https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.297.529&rep=rep1&type=pdf
>> )
>> is an interesting idea where packets for a single connection are
>> purposely distributed across multiple paths for load distribution. Per-packet
>> randomized flow labels with flow-label-aware ECMP make this
>> quite easy to do without requiring any special support in switches
>> like you'd need with IPv4. I'm not necessarily advocating this, but it
>> does highlight one potential use case of having a flow label that
>> doesn't have rigidly defined requirements on the host.
>>
>> Tom
>>
>> > Regards
>> >    Brian
>> >
>> > On 26-Nov-20 11:13, Tom Herbert wrote:
>> > > Joel, is there a normative definition of a flow?
>> > >
>> > > On Wed, Nov 25, 2020, 1:27 PM Joel M. Halpern <jmh@joelhalpern.com> wrote:
>> > >
>> > >     No, as Brian says, there are escape clauses in the flow
>> definitions.
>> > >
>> > >     But changing the flow label due to traffic problems does not
>> correspond
>> > >     to packets being in actually different flows.
>> > >     If one were using UDP, and mixing loss sensitive packets with loss
>> > >     insensitive packets for a special application, sure, one could
>> use two
>> > >     flow labels.  But that is not what you are describing.
>> > >
>> > >     Yours,
>> > >     Joel
>> > >
>> > >     On 11/25/2020 3:05 PM, Tom Herbert wrote:
>> > >     > Joel,
>> > >     >
>> > >     > Is there an RFC that clearly and unambiguously states that a
>> host MUST
>> > >     > use the same flow label for the lifetime _and_ clearly defines
>> exactly
>> > >     > what a flow is with respect to such a requirement (for
>> instance, how
>> > >     > would you define a flow and enforce such a requirement in UDP?
>> IPsec?
>> > >     > other encapsulations?). If there is such a requirement then
>> we'll
>> > >     > change the code to be conformant.
>> > >     >
>> > >     > Tom
>> > >     >
>> > >     > On Wed, Nov 25, 2020 at 12:39 PM Joel M. Halpern <jmh@joelhalpern.com> wrote:
>> > >     >>
>> > >     >> This kind of thing is why, as I understand it, MPTCP has
>> discovery
>> > >     >> mechanisms to know if both sides use it, and can select
>> alternative
>> > >     >> addresses for communication.
>> > >     >>
>> > >     >> Trying to guess flow labels that might avoid a problem because
>> it might
>> > >     >> be an ECMP problem, is just flailing about.  Not a good design
>> for
>> > >     >> operational protocols.
>> > >     >>
>> > >     >> And in general, designing protocols around "I know exactly
>> what is going
>> > >     >> on"  (the requirement for what you describe that goes well
>> beyond just
>> > >     >> "limited domains") is also a recipe for failure.
>> > >     >>
>> > >     >> The Flow Label RFCs are actually very explicit that a flow
>> label is
>> > >     >> supposed to be stable for the life of the flow.  Otherwise, it
>> isn't a
>> > >     >> flow label.
>> > >     >>
>> > >     >> Yours,
>> > >     >> Joel
>> > >     >>
>> > >     >> On 11/25/2020 2:35 PM, Tom Herbert wrote:
>> > >     >>> Hi Fernando, comments in line...
>> > >     >>>
>> > >     >>> On Wed, Nov 25, 2020 at 12:13 AM Fernando Gont <fernando@gont.com.ar> wrote:
>> > >     >>>>
>> > >     >>>> Hi, Tom,
>> > >     >>>>
>> > >     >>>> On 24/11/20 16:43, Tom Herbert wrote:
>> > >     >>>> [....]
>> > >     >>>>> Modulating the flow label is a means to affect the routing
>> of packets
>> > >     >>>>> through the network that uses flow labels as input to the
>> ECMP hash.
>> > >     >>>>
>> > >     >>>> What's the point?
>> > >     >>>>
>> > >     >>>> 1) You cannot tell *if* the FL is being used.
>> > >     >>>>
>> > >     >>> Generally true, but in a limited domain this information
>> could be
>> > >     >>> discerned. I'd note that it's also generally true that we
>> don't know
>> > >     >>> if there is a load balancer or stateful firewall in the path
>> that
>> > >     >>> requires consistent routing, but in a limited domain we could
>> know
>> > >     >>> that also.
>> > >     >>>
>> > >     >>>> 2) Changing the FL does not necessarily mean that packets
>> will employ a
>> > >     >>>> different link.
>> > >     >>>
>> > >     >>> It's an opportunistic mechanism. If a connection is failing
>> and we get
>> > >     >>> a better path that fixes it by simply changing the flow label
>> then
>> > >     >>> what's the harm?
>> > >     >>>
>> > >     >>>>
>> > >     >>>> 3) If the network is failing, shouldn't you handle this via
>> routing?
>> > >     >>>>
>> > >     >>> Sure, but then that requires an out of band feedback loop
>> from a TCP
>> > >     >>> implementation to the network infrastructure to indicate
>> there is a
>> > >     >>> problem and then the network needs to respond. That's
>> significant
>> > >     >>> infrastructure and higher reaction time than doing something
>> in TCP
>> > >     >>> and IP. Think of modulating the flow label as an inexpensive
>> form of
>> > >     >>> source routing within a limited domain that doesn't need any
>> > >     >>> infrastructure or heavyweight protocols or something like
>> segment
>> > >     >>> routing.
>> > >     >>>
>> > >     >>>>
>> > >     >>>>
>> > >     >>>>> The basic idea is that the flow label associated with a
>> connection is
>> > >     >>>>> randomly changed when the stack observes that the
>> connection is
>> > >     >>>>> failing (e.g., an RTO). There is nothing in the specs
>> that prevents
>> > >     >>>>> this since the source is at liberty to set the flow label
>> as it sees
>> > >     >>>>> fit.
>> > >     >>>>
>> > >     >>>> The FL is expected to remain constant for the life of a
>> flow. A
>> > >     >>>> retransmitted packet is part of the same flow as the
>> > >     >>>> originally-transmitted packet. So this seems to be
>> contradicting the
>> > >     >>>> very specification of the FL.
>> > >     >>>>
>> > >     >>>> For instance, if an RTO for a flow causes the FL to change,
>> then one may
>> > >     >>>> possibly argue that the FL is not naming/labeling what it is
>> said/expected
>> > >     >>>> to be naming/labeling.
>> > >     >>>
>> > >     >>> Specifically, RFC6437 states:
>> > >     >>>
>> > >     >>> "It is therefore RECOMMENDED that source hosts support the
>> flow label
>> > >     >>> by setting the flow label field for all packets of a given
>> flow to the
>> > >     >>> same value chosen from an approximation to a discrete uniform
>> > >     >>> distribution."
>> > >     >>>
>> > >     >>> So that is clearly just a recommendation, and not a
>> requirement (and
>> > >     >>> definitely not a MUST). Furthermore, RFC6437 states:
>> > >     >>>
>> > >     >>> "A forwarding node MUST either leave a non-zero flow label
>> value
>> > >     >>> unchanged or change it only for compelling operational
>> security
>> > >     >>> reasons as described in Section 6.1."
>> > >     >>>
>> > >     >>> So there's no guarantee in the protocol specs that flow
>> labels are
>> > >     >>> consistent for the life of the connection, which means that
>> the
>> > >     >>> network cannot assume that and thus it would be incorrect if
>> the
>> > >     >>> network tried to enforce flow label consistency as a protocol
>> > >     >>> requirement. As I said, it is prudent to try to be consistent
>> with
>> > >     >>> flow labels and the default behavior in Linux should be
>> changed,
>> > >     >>> however I do not believe there's a valid claim of
>> non-conformance that
>> > >     >>> motivates removal of the feature that is already deployed.
>> > >     >>>
>> > >     >>> Tom
>> > >     >>>
>> > >     >>>
>> > >     >>>
>> > >     >>>
>> > >     >>>>
>> > >     >>>>
>> > >     >>>>
>> > >     >>>>> The feature is useful in large datacenter networks, like
>> > >     >>>>> apparently Facebook where the patches originate, since
>> information
>> > >     >>>>> discerned by TCP can opportunistically be applied to route
>> selection.
>> > >     >>>>> The practical issue is that there are stateful devices like
>> firewalls
>> > >     >>>>> that require consistent routing in the network in which
>> case changing
>> > >     >>>>> the flow label can confuse them. As I mentioned, the
>> original intent
>> > >     >>>>> was that the flow label randomization feature should be
>> opt-in instead
>> > >     >>>>> of on by default.
>> > >     >>>>
>> > >     >>>> So... where is the "source" of the packet that would be
>> "modulating" the FL?
>> > >     >>>>
>> > >     >>>> Thanks,
>> > >     >>>> --
>> > >     >>>> Fernando Gont
>> > >     >>>> e-mail: fernando@gont.com.ar || fgont@si6networks.com
>> > >     >>>> PGP Fingerprint: 7809 84F5 322E 45C7 F1C9 3945 96EE A9EF
>> D076 FFF1
>> > >     >>>>
>> > >     >>>>
>> > >     >>>>
>> > >     >>>
>> > >     >>> _______________________________________________
>> > >     >>> v6ops mailing list
>> > >     >>> v6ops@ietf.org
>> > >     >>> https://www.ietf.org/mailman/listinfo/v6ops
>> > >     >>>
>> > >
>> > >
>> > >
>> >
>>
>>
>
>
> --
> Best regards,
> Alexander Azimov
> _______________________________________________
> tcpm mailing list
> tcpm@ietf.org
> https://www.ietf.org/mailman/listinfo/tcpm
>