Re: [v6ops] Flow Label Load Balancing

Tom Herbert <tom@herbertland.com> Tue, 24 November 2020 20:00 UTC

References: <CAEGSd=DY8t8Skor+b6LSopzecoUUzUZhti9s0kdooLZGxPEt+w@mail.gmail.com> <d29042a7-742b-a445-cf60-2773e5515ae5@gont.com.ar> <CAEGSd=AB5DMopq5Hc0ydZwP+xQuwxNBHuFSpCPcZvnaZbJfRoQ@mail.gmail.com> <fc7693e8-a57b-004d-a019-159060c6feef@gmail.com>
In-Reply-To: <fc7693e8-a57b-004d-a019-159060c6feef@gmail.com>
From: Tom Herbert <tom@herbertland.com>
Date: Tue, 24 Nov 2020 12:59:46 -0700
Message-ID: <CALx6S36pxMi1mWia4bJ-kWa_K58XAxjOGLiR_NF-oCtQzLSbBw@mail.gmail.com>
To: Brian E Carpenter <brian.e.carpenter@gmail.com>
Cc: Alexander Azimov <a.e.azimov@gmail.com>, Fernando Gont <fernando@gont.com.ar>, IPv6 Operations <v6ops@ietf.org>, tcpm <tcpm@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/v6ops/Ecc2aYKdKNGth59Sot-IQFlGDuM>
Subject: Re: [v6ops] Flow Label Load Balancing

On Tue, Nov 24, 2020 at 12:46 PM Brian E Carpenter
<brian.e.carpenter@gmail.com> wrote:
>
> On 25-Nov-20 08:28, Alexander Azimov wrote:
> > Hi Fernando,
> >
> > Calling an FL change during the lifetime of a TCP session a bug is a bit harsh.
>
> It's a bug. It definitely breaks server load balancing, which in practice is more important than en route load balancing, because it will probably switch your session to a different server. (OK, I know that there are mechanisms sometimes for rescuing a session that has unexpectedly moved from one server to another, but in general it's a fail.)
>
Brian,

IMO it's not that this breaks server load balancing, it's really that
server load balancing and other stateful mechanisms break multipath,
multihoming, and adaptive routing! In any case, I would agree that in
the current Internet it is pragmatic to aim for consistent routing as
the default.
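
As a rough illustration (a sketch, not any real balancer's code): suppose
a stateless load balancer picks a backend by hashing the source address,
destination address, and flow label, which is roughly the model RFC 7098
discusses. The backend pool, the use of SHA-256, and the function names
below are all hypothetical.

```python
import hashlib

BACKENDS = ["server-a", "server-b", "server-c", "server-d"]  # hypothetical pool


def pick_backend(src, dst, flow_label, backends=BACKENDS):
    """Stateless choice: hash (src, dst, flow label) and index into the pool."""
    key = f"{src}|{dst}|{flow_label:05x}".encode()
    bucket = int.from_bytes(hashlib.sha256(key).digest()[:4], "big")
    return backends[bucket % len(backends)]


def labels_that_move(src, dst, old_label, candidates):
    """Return the candidate labels that would land on a different backend."""
    before = pick_backend(src, dst, old_label)
    return [lbl for lbl in candidates if pick_backend(src, dst, lbl) != before]


if __name__ == "__main__":
    src, dst = "2001:db8::1", "2001:db8::100"
    print(pick_backend(src, dst, 0x12345))
    print(len(labels_that_move(src, dst, 0x12345, range(1, 257))))
```

Assuming a uniform hash over N backends, a fresh random label lands on a
different backend roughly (N-1)/N of the time, so a mid-connection label
change usually moves the session unless some state pins it.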

Tom

> I assume you have read https://tools.ietf.org/html/rfc7098
>
> Regards
>     Brian
>
> >
> > It is a fantastic idea to change the FL value if RTO or SYN_RTO happens in a controlled environment.
> > These are very specific TCP timeouts that give enough of a guarantee that there will be no out-of-order packets, while your packets still reach the destination even during an outage inside your network. Zero impact on your services during a network outage doesn't sound like a bug to me.
> >
> > The problem is with the current Linux defaults (I haven't checked other OSes), specifically the default behavior after an RTO event.
> >
> > On Tue, 24 Nov 2020 at 22:06, Fernando Gont <fernando@gont.com.ar> wrote:
> >
> >     On 19/11/20 07:48, Alexander Azimov wrote:
> >     > Dear colleagues,
> >     >
> >     > I have cc'd both v6ops and tcpm for a reason; let me
> >     > explain why.
> >     >
> >     > It's clear that we are moving forward with load balancing that uses flow
> >     > label (FL). And the pressure will increase with SRv6 adoption. But at
> >     > the moment wide adoption of FL-based load-balancing may create
> >     > significant issues for TCP Anycast services.
> >     >
> >     > RFC 6437 suggests putting a hash of the 5-tuple into the FL value. As far
> >     > as I know, there is no document that updates this behavior. This
> >     > description is perfectly fine, but what is implemented in the Linux
> >     > kernel is different: the FL carries a hash of the 5-tuple with an
> >     > additional seed, and this seed is randomly changed after each
> >     > RTO/SYN_RTO event.
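
To make the difference concrete, here is a minimal Python sketch of the
two behaviors described in the quoted paragraph: a label derived from the
5-tuple alone, and a label derived from the 5-tuple plus a seed that is
replaced on a timeout. SHA-256, the Connection class, and the on_rto hook
are assumptions made for illustration; this is not the kernel code.

```python
import hashlib
import os

FLOW_LABEL_MASK = 0xFFFFF  # the IPv6 flow label is 20 bits wide


def flow_label(src, dst, sport, dport, proto, seed=b""):
    """Hash the 5-tuple (plus an optional seed) down to a non-zero 20-bit label."""
    fields = "|".join([src, dst, str(sport), str(dport), str(proto)])
    digest = hashlib.sha256(seed + fields.encode()).digest()
    label = int.from_bytes(digest[:4], "big") & FLOW_LABEL_MASK
    return label or 1  # zero means "no label set", so avoid it


class Connection:
    """Toy connection whose flow label is re-seeded on a retransmission timeout."""

    def __init__(self, src, dst, sport, dport, proto=6):
        self.tuple5 = (src, dst, sport, dport, proto)
        self.seed = os.urandom(8)

    @property
    def label(self):
        return flow_label(*self.tuple5, seed=self.seed)

    def on_rto(self):
        # Hypothetical hook: a fresh seed very likely yields a different label.
        self.seed = os.urandom(8)


if __name__ == "__main__":
    conn = Connection("2001:db8::1", "2001:db8::2", 40000, 443)
    print(f"before RTO: {conn.label:#07x}")
    conn.on_rto()
    print(f"after RTO:  {conn.label:#07x}")
```

Without a seed the label is a pure function of the 5-tuple and stays
constant for the life of the connection; with the seed it stays constant
only between timeouts.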
> >
> >     Changing the FL upon RTO is a bug.
> >
> >     I guess/assume that when you say SYN-RTO, you really mean "user
> >     timeout", rather than RTO. If you don't, then that's also a bug.
> >     If you do, I fail to understand what's the reason for wanting the FL to
> >     change in that case, because as a result of port randomization, it is
> >     unlikely that the same four-tuple is employed for the next connection retry.
> >
> >
> >     > Here are related patches:
> >     >
> >     >   * https://lore.kernel.org/netdev/alpine.DEB.2.02.1407012100290.20628@tomh.mtv.corp.google.com/
> >     >   * https://lore.kernel.org/netdev/1438124526-2129341-1-git-send-email-tom@herbertland.com/
> >     >   * https://lore.kernel.org/netdev/20160928020337.3057238-1-brakmo@fb.com/
> >     >
> >     >
> >     > This is a great thing by the way because in the data center
> >     > environment with multiple equal paths it gives a way to have
> >     > pseudo-multipath TCP which jumps between paths in case of an outage.
> >     > There might be interest in writing an informational document about this.
> >
> >     That's a bad idea, since specs-wise the Flow-Label is not guaranteed to
> >     remain unchanged from source to destination. If you want to have
> >     multiple paths, then you should implement that in routing.
> >
> >
> >
> >     > I wonder what you think is a proper solution:
> >     >
> >     >   * Making the FL change on RTO a knob instead of the default behavior;
> >     >   * Adding negotiation behavior in TCP;
> >     >   * Something else?
> >
> >     Just make the FL a function of the connection "identifier". And keep it
> >     constant for the lifetime of that connection.
> >
> >     Thanks,
> >     --
> >     Fernando Gont
> >     e-mail: fernando@gont.com.ar || fgont@si6networks.com
> >     PGP Fingerprint: 7809 84F5 322E 45C7 F1C9 3945 96EE A9EF D076 FFF1
> >
> >
> >
> >
> >
> > --
> > Best regards,
> > Alexander Azimov
> >
> > _______________________________________________
> > v6ops mailing list
> > v6ops@ietf.org
> > https://www.ietf.org/mailman/listinfo/v6ops
> >
>