Re: [v6ops] Flow Label Load Balancing

Tom Herbert <> Thu, 19 November 2020 15:36 UTC

From: Tom Herbert <>
Date: Thu, 19 Nov 2020 08:36:29 -0700
To: Alexander Azimov <>
Cc: tcpm <>, IPv6 Operations <>

On Thu, Nov 19, 2020 at 3:49 AM Alexander Azimov <> wrote:
> Dear colleagues,
> I have cc'd both v6ops and tcpm for a reason; let me explain why.
> It's clear that we are moving forward with load balancing that uses the flow label (FL), and the pressure will increase with SRv6 adoption. But at the moment, wide adoption of FL-based load balancing may create significant issues for TCP anycast services.
> RFC 6437 suggests putting a hash of the 5-tuple into the FL value. As far as I know, no document updates this behavior. This description is perfectly fine, but what is implemented in the Linux kernel is different: the FL carries a hash of the 5-tuple with an additional seed, and this seed is randomly changed after each RTO/SYN_RTO event. Here are related patches:

The initial intent of the Linux flow label code was that the hash would
be persistent for the lifetime of the connection; in fact, we fixed a
nasty bug where the hash would change when a connection moved to the
TIME_WAIT (TW) state, so some firewalls never saw the final ACK that
lets them remove state. Recalculating the flow hash for a failing
connection should be optional, not the default behavior. If that isn't
what the implementation is doing, I can take it up on Linux netdev to
fix it.
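
To make the intended behavior concrete, here is a minimal sketch in Python of a flow label that stays fixed for the life of a connection unless rehashing on RTO is explicitly enabled. It is not the kernel code: SHA-256 stands in for whatever hash the stack really uses, and the `Connection` class, the `rehash_on_rto` knob, and the `on_rto` hook are hypothetical names chosen for illustration.

```python
import hashlib
import os
from dataclasses import dataclass, field

FLOW_LABEL_MASK = 0xFFFFF  # the IPv6 flow label field is 20 bits wide


def flow_label(five_tuple, seed):
    """Hash the 5-tuple together with a seed and keep the low 20 bits.

    SHA-256 is only a stand-in here; it is not the hash Linux uses.
    """
    data = seed + repr(five_tuple).encode()
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest[:4], "big") & FLOW_LABEL_MASK


@dataclass
class Connection:
    """Hypothetical per-connection state, for illustration only."""
    five_tuple: tuple
    rehash_on_rto: bool = False  # hypothetical knob, off by default
    seed: bytes = field(default_factory=lambda: os.urandom(8))

    def label(self):
        return flow_label(self.five_tuple, self.seed)

    def on_rto(self):
        # Pick a new seed (and so, very likely, a new label) only when
        # the knob is on; otherwise the label stays persistent.
        if self.rehash_on_rto:
            self.seed = os.urandom(8)
```

With the knob off, `label()` returns the same value before and after `on_rto()`, which is the property a stateful anycast service or firewall depends on. With the knob on, the seed changes on each timeout, which gives the path-jumping behavior described in the quoted message.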

Have you been able to show this behavior is actually happening on a
running system?
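
One way to check this on captured traffic is to read the flow label out of each IPv6 header and flag any connection whose label changes mid-stream. The sketch below is a minimal Python example under stated assumptions: the caller supplies raw IPv6 packets (starting at the IPv6 header) and a flow key such as the address/port tuple, and `FlowLabelWatcher` is a hypothetical helper, not part of any existing tool. The field layout (4-bit version, 8-bit traffic class, 20-bit flow label in the first 32 bits) is the standard IPv6 header layout.

```python
import struct


def parse_flow_label(packet: bytes) -> int:
    """Return the 20-bit flow label from a raw IPv6 packet."""
    if len(packet) < 40:
        raise ValueError("truncated IPv6 header")
    # First 32 bits: version (4), traffic class (8), flow label (20).
    (first_word,) = struct.unpack("!I", packet[:4])
    if first_word >> 28 != 6:
        raise ValueError("not an IPv6 packet")
    return first_word & 0xFFFFF


class FlowLabelWatcher:
    """Hypothetical helper that remembers the last label seen per flow."""

    def __init__(self):
        self.last_label = {}

    def observe(self, flow_key, packet):
        """Return (old, new) if the label changed for this flow, else None."""
        label = parse_flow_label(packet)
        old = self.last_label.get(flow_key)
        self.last_label[flow_key] = label
        if old is not None and old != label:
            return (old, label)
        return None
```

Feeding the watcher the packets of one TCP connection, keyed by its addresses and ports, and seeing a non-`None` result after a retransmission timeout would be evidence that the label is being recalculated.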


> By the way, this is a great thing: in a data center environment with multiple equal paths it gives a kind of pseudo-multipath TCP that jumps between paths in case of an outage. There might be interest in writing an informational document about this.
> But the problem happens when TCP flows start jumping between anycast points. If an anycast service does connection tracking (L7/TCP proxy, stateful L4 balancer), the 'jump' may redirect traffic to another point and, once transit providers adopt FL load balancing, to another location where no TCP state is available. In other words, an established TCP session breaks after an RTO event. And it's not just theory: we have already faced this kind of issue in our network, and it is really hard to debug.
> I wonder what you think is a proper solution:
> - Making the FL change on RTO a knob instead of the default behavior;
> - Adding negotiation behavior to TCP;
> - Something else?
> I'm looking forward to your advice. If there is a document that describes the above problem, please give me a reference.
> --
> Best regards,
> Alexander Azimov