Re: [tcpm] [v6ops] Flow Label Load Balancing

Tom Herbert <tom@herbertland.com> Thu, 19 November 2020 15:36 UTC

From: Tom Herbert <tom@herbertland.com>
Date: Thu, 19 Nov 2020 08:36:29 -0700
Message-ID: <CALx6S35uEeWQK5-b4JTjE27T0CCwdaUESJJ585cC=OgR-D3j1w@mail.gmail.com>
To: Alexander Azimov <a.e.azimov@gmail.com>
Cc: tcpm <tcpm@ietf.org>, IPv6 Operations <v6ops@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/9n1RzmVmbXVoiHdfw1FgkiIs68E>
Subject: Re: [tcpm] [v6ops] Flow Label Load Balancing

On Thu, Nov 19, 2020 at 3:49 AM Alexander Azimov <a.e.azimov@gmail.com> wrote:
>
> Dear colleagues,
>
> I have cc'd both v6ops and tcpm for a reason; let me explain why.
>
> It's clear that we are moving forward with load balancing that uses the flow label (FL), and the pressure will increase with SRv6 adoption. At the moment, however, wide adoption of FL-based load balancing may create significant issues for TCP anycast services.
>
> RFC 6437 suggests putting a hash of the 5-tuple into the FL value, and as far as I know, no document updates this behavior. That description is perfectly fine, but what is implemented in the Linux kernel is different: the FL carries a hash of the 5-tuple with an additional seed, and this seed is randomly changed after each RTO/SYN_RTO event. Here are the related patches:
>
> https://lore.kernel.org/netdev/alpine.DEB.2.02.1407012100290.20628@tomh.mtv.corp.google.com/
> https://lore.kernel.org/netdev/1438124526-2129341-1-git-send-email-tom@herbertland.com/
> https://lore.kernel.org/netdev/20160928020337.3057238-1-brakmo@fb.com/
>
Alexander,

The initial intent of the Linux flow label code was for the hash to be
persistent for the lifetime of the connection. In fact, we fixed a
nasty bug where the hash would change when a connection moved to
TIME_WAIT (TW) state, so some firewalls never saw the final ACK that
lets them remove their state. The feature where the flow hash is
recalculated for a failing connection should be optional, not the
default behavior. If this isn't what the implementation is doing, I
can take it up on Linux netdev to get it fixed.

Have you been able to show this behavior is actually happening on a
running system?

Tom
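
One way to check a capture for this behavior is to pull the flow label
out of each IPv6 header and see whether it changes within a single TCP
connection. The following is a minimal sketch, not a tested tool: it
assumes raw IPv6 packets (no link-layer header, no extension headers)
are already available as bytes, and the function names are
hypothetical.

```python
import struct
from collections import defaultdict


def flow_label(ipv6_packet: bytes) -> int:
    """Return the 20-bit flow label from a raw IPv6 packet."""
    if len(ipv6_packet) < 40:
        raise ValueError("shorter than the fixed 40-byte IPv6 header")
    # First 32 bits: version (4), traffic class (8), flow label (20).
    (first_word,) = struct.unpack("!I", ipv6_packet[:4])
    if first_word >> 28 != 6:
        raise ValueError("not an IPv6 packet")
    return first_word & 0xFFFFF


def tcp_flow_key(ipv6_packet: bytes):
    """Return (src, dst, sport, dport), or None if TCP does not follow
    the fixed header directly (assumption: no extension headers)."""
    if len(ipv6_packet) < 44 or ipv6_packet[6] != 6:  # Next Header 6 = TCP
        return None
    src = ipv6_packet[8:24]
    dst = ipv6_packet[24:40]
    sport, dport = struct.unpack("!HH", ipv6_packet[40:44])
    return (src, dst, sport, dport)


def labels_per_flow(packets):
    """Map each one-directional TCP flow to the sequence of distinct
    flow labels seen on it, in order of appearance."""
    seen = defaultdict(list)
    for pkt in packets:
        key = tcp_flow_key(pkt)
        if key is None:
            continue
        label = flow_label(pkt)
        if not seen[key] or seen[key][-1] != label:
            seen[key].append(label)
    return dict(seen)
```

Any flow whose list holds more than one label changed its flow label
mid-connection; correlating those changes with retransmissions would
still have to be done separately.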

>
> This is a great thing, by the way: in a data center environment with multiple equal-cost paths, it gives a kind of pseudo-multipath TCP that jumps between paths in case of an outage. There might be interest in writing an informational document about this.
>
> The problem arises when TCP flows start jumping between anycast points. If an anycast service does connection tracking (L7/TCP proxy, stateful L4 balancer), the 'jump' may redirect traffic to another point and, once transit providers adopt FL load balancing, to another location where no TCP state is available. In other words, an established TCP session breaks after an RTO event. This is not just theory: we have already faced this kind of issue in our network, and it is really hard to debug.
>
> I wonder what you think is the proper solution:
>
> - Making the FL-related RTO change a knob instead of the default behavior;
> - Adding negotiation behavior in TCP;
> - Something else?
>
> I'm looking forward to your advice. If there is a document that describes the above problem, please give me a reference.
>
> --
> Best regards,
> Alexander Azimov
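
The failure mode described in the quoted message can be illustrated
with a small simulation. This is only a sketch under stated
assumptions: SHA-256 stands in for whatever hash the sender and the
routers really use, the addresses and site names are made up, and
seeded_flow_label and ecmp_next_hop are hypothetical helpers, not the
Linux implementation.

```python
import hashlib


def _hash32(*fields) -> int:
    """Deterministic 32-bit hash of the given fields (SHA-256 stand-in)."""
    digest = hashlib.sha256(repr(fields).encode()).digest()
    return int.from_bytes(digest[:4], "big")


def seeded_flow_label(five_tuple, seed: int) -> int:
    """20-bit flow label derived from the 5-tuple plus a per-connection seed."""
    return _hash32(five_tuple, seed) & 0xFFFFF


def ecmp_next_hop(src, dst, label: int, next_hops):
    """Pick a next hop by hashing (src, dst, flow label), as a
    flow-label-aware load balancer might."""
    return next_hops[_hash32(src, dst, label) % len(next_hops)]


if __name__ == "__main__":
    # Hypothetical connection to an anycast address served from four sites.
    conn = ("2001:db8::1", "2001:db8:ffff::53", 6, 40000, 443)
    sites = ["site-A", "site-B", "site-C", "site-D"]
    for seed in range(4):  # each new seed models one RTO-triggered rehash
        label = seeded_flow_label(conn, seed)
        site = ecmp_next_hop(conn[0], conn[1], label, sites)
        print(f"seed={seed} label={label:#07x} -> {site}")
```

With a fixed seed the label, and therefore the chosen site, stays
stable for the whole connection. Changing the seed changes the label,
and the same 5-tuple can then land on a site that holds no TCP state
for it.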