Re: [v6ops] [tcpm] Flow Label Load Balancing

Tom Herbert <> Sat, 28 November 2020 18:27 UTC

To: Alexander Azimov <>
Cc: Fernando Gont <>, Mark Smith <>, IPv6 Operations <>, tcpm <>

On Sat, Nov 28, 2020 at 3:15 AM Alexander Azimov <> wrote:
> Gentlemen,
>> At the very least, one should analyze the case where in #1, a packet
>> gets delayed, you change the FL and retransmit, and finally you get a
>> response to the original packet (and/or to both packets).
> As the suggestion is strictly about changing the FL upon RTO, I don't see how it differs from getting two responses after an RTO without an FL change. Nothing special here.
>> If you have two links, then, upon change of the FL, you have 50% chance
>> that the path will be different -- not good.  What would you do if the
>> failure continues? -- keep changing the FL?
> Yes, and if only one path is affected, the chance of hitting a good path after n changes is 1 - (1/2)^n. And that's much better than nothing.
>> > If a retransmitted SYN goes to another server/anycast point this won't bring much harm. At least we haven't experienced such problems in our datacenters.
>> I think that depends on details that are definitely out of the IETF's scope. When exactly does the load balancing system assign a session to a given server, how much state does that create, what resources does it reserve, and when are the state and the reserved resources released?
>> Also, in some cases, I could imagine that the data centre will not see a problem, but the client might see one that they would not otherwise have seen.
> I can't say that these details look convincing to me. I also can't agree that operational considerations are out of IETF's scope. The protocols are designed with real operations in mind, aren't they?
>> Steering packets is the network's job. People like me who run the network expect to be in charge of that. We have fancy mechanisms that mean alternate paths are switched to very quickly.
> While not questioning innovative technologies that take care of traffic, a 'good' repair time is counted in seconds. A 'bad' repair time, when the control plane breaks or the famous silent drop happens, is counted in minutes and requires a network engineer to be summoned. If TCP can fix itself in milliseconds, that's great, and it is needed by real operations. You are also missing the case where something happens outside of your domain, where you can't instantly summon an engineer. If you tell me that you have never been affected by a network outage at your peering partner, I will be greatly surprised.

I would also point out that this feature has been deployed for at
least five years in at least two hyperscalers, and there has been no
major meltdown. We'll make the feature opt-in because we may be
exposed to non-conformant devices in the network that make incorrect
assumptions, but I have yet to see any convincing argument that we
should remove the feature because it's fundamentally harmful or not
providing benefits.
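
To put a rough number on the benefit, here is a minimal sketch of the
1 - (1/2)^n estimate quoted above, generalized to `failed` broken paths
out of `paths` equal-cost paths. The function name and parameters are
made up for illustration, and it assumes each new flow label is hashed
onto a path uniformly and independently of the previous ones, which
real ECMP implementations may not guarantee.

```python
def p_good_path(n, paths=2, failed=1):
    """Chance that at least one of n independent flow label re-draws
    lands on a working path (idealized uniform ECMP hashing assumed)."""
    if paths < 1 or not 0 <= failed <= paths or n < 0:
        raise ValueError("need paths >= 1, 0 <= failed <= paths, n >= 0")
    # Every re-draw misses with probability failed/paths; all n must
    # miss for the connection to still be stuck.
    return 1.0 - (failed / paths) ** n


if __name__ == "__main__":
    for n in range(1, 6):
        print(f"after {n} RTO(s): {p_good_path(n):.4f}")
```

With two paths and one of them broken, this is the quoted formula as
is; with more paths and a single failure the odds improve faster.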

As for the mentality of "the network will fix the problem", that's
obviously an easy statement to make from a router vendor's point of
view, but that luxury simply doesn't match the realities of how host
stacks are implemented and deployed at global scale. This starts with
the fact that there is no such thing as "the network"; the Linux
stack, for instance, needs to run on every network on Earth, and I
have no idea which of these networks provide adequate security, fault
detection and repair, are well managed, or even have devices that
conform to the standards (non-conformant network devices are common).
The only recourse we, i.e. host-side developers, have is to stick with
the least common denominator in the general case, and allow optional,
but still protocol-conformant, features in limited use cases such as
limited domains (which is why the flow label modulation feature should
be optional, as we've already established in this thread).
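
For anyone who wants to play with the idea, below is a toy model of
flow label modulation on RTO. It is a sketch under stated assumptions,
not how any particular stack or router implements it: the ECMP hash
(SHA-256 over source, destination and flow label), the helper names,
and the retry limit are all hypothetical. The only protocol fact it
relies on is that the IPv6 flow label is a 20-bit field.

```python
import hashlib
import random

FLOW_LABEL_BITS = 20  # width of the IPv6 flow label field


def ecmp_path(src, dst, flow_label, n_paths):
    """Hypothetical router hash: map (src, dst, flow label) to a path."""
    key = f"{src}|{dst}|{flow_label}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_paths


def new_flow_label(rng):
    """Draw a random non-zero 20-bit flow label."""
    return rng.randrange(1, 1 << FLOW_LABEL_BITS)


def send_with_rehash(src, dst, n_paths, failed_paths, rng, max_rto=16):
    """Re-draw the flow label on every simulated RTO until the packet
    hashes onto a working path. Returns (rto_count, label, path), or
    None if max_rto timeouts pass without finding one."""
    label = new_flow_label(rng)
    for rto_count in range(max_rto + 1):
        path = ecmp_path(src, dst, label, n_paths)
        if path not in failed_paths:
            return rto_count, label, path
        label = new_flow_label(rng)  # RTO fired: modulate the label
    return None


if __name__ == "__main__":
    rng = random.Random(2020)  # seeded only to make runs repeatable
    print(send_with_rehash("2001:db8::1", "2001:db8::2", 2, {0}, rng))
```

Note that nothing here requires the host to know anything about the
topology; it only reacts to its own retransmission timer, which is the
point of doing this on the host side.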


> On Sat, 28 Nov 2020 at 11:42, Fernando Gont <> wrote:
>> On 28/11/20 03:42, Mark Smith wrote:
>> >
>> >
>> > On Sat, 28 Nov 2020, 14:52 Fernando Gont <> wrote:
>> >
>> >     On 27/11/20 21:32, Mark Smith wrote:
>> >      >
>> >      > On Sat, 28 Nov 2020, 10:32 Tom Herbert <> wrote:
>> >     [....]
>> >      >
>> >      > If hosts start trying to take control of steering traffic by
>> >      > varying the flow label value within a flow, making the
>> >      > network's job harder, and the network is blamed for the bad
>> >      > network performance that may result, the response will be
>> >      > both quick and easy: network operators will switch off
>> >      > looking at the Flow Label.
>> >
>> >     That is *exactly* why I noted that I'm concerned that we seem to be on
>> >     the path of making the Flow Label unusable.
>> >
>> >
>> > I see the Flow Label as a hint, but only a hint, of what the host wants
>> > from the network. It's useful as an expression of what the host would
>> > like to have.
>> As you correctly noted, it's supposed to be an identifier of the flow,
>> because it's hard to get to the port numbers.
>> I don't think it can be seen as an expression of what the host wants to
>> have, because the Flow ID has no semantics, and the host doesn't even
>> know what's in the network.
>> > However, if that hint causes me any trouble, I'll ignore it and then
>> > devolve to trying to deliver the packet based just on its destination
>> > address.
>> If you *needed* (or could make use of) the FL, then, if it causes
>> trouble, you either resort to finding the five-tuple, or quite likely
>> end up dropping packets.
>> Thanks,
>> --
>> Fernando Gont
>> PGP Fingerprint: 7809 84F5 322E 45C7 F1C9 3945 96EE A9EF D076 FFF1
> --
> Best regards,
> Alexander Azimov