Re: [v6ops] Flow Label Load Balancing

Tom,

As far as I know, the auto_flowlabels default is 1, which stands for:

> 1: automatic flow labels are enabled by default, they can be disabled on a
> per socket basis using the IPV6_AUTOFLOWLABEL socket option
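
For reference, opting a single socket out could look like the minimal Python sketch below. Python's socket module does not export IPV6_AUTOFLOWLABEL, so the value 70 is copied from the Linux header include/uapi/linux/in6.h and is an assumption to verify against your own kernel headers. The helper name disable_auto_flowlabel is hypothetical.

```python
import socket

# Assumption: numeric value of IPV6_AUTOFLOWLABEL taken from Linux
# <linux/in6.h>; Python's socket module does not export this constant.
IPV6_AUTOFLOWLABEL = 70


def disable_auto_flowlabel(sock):
    """Ask the kernel not to generate automatic flow labels for this socket."""
    sock.setsockopt(socket.IPPROTO_IPV6, IPV6_AUTOFLOWLABEL, 0)
```

On a Linux host this would be called on an AF_INET6 socket before connect(); elsewhere the setsockopt call is expected to fail with OSError.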

With this default you get both behaviors at once: the FL changes after
both SYN_RTO and RTO events. The change after SYN_RTO looks safe and can
benefit the end-user experience, while the FL change after an RTO outside
of a controlled environment causes problems.
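
To illustrate why a reseed matters, here is a toy Python model. It is not the kernel's algorithm: SHA-256, the function names flow_label and pick_path, and the documentation-prefix addresses are all assumptions chosen for the sketch. It only shows that a label derived from the 5-tuple plus a seed stays stable while the seed is fixed and may select a different path once the seed changes.

```python
import hashlib

FLOW_LABEL_MASK = 0xFFFFF  # the IPv6 flow label field is 20 bits wide


def flow_label(five_tuple, seed):
    """Toy label: hash of the 5-tuple mixed with a per-connection seed."""
    data = repr((five_tuple, seed)).encode()
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest[:4], "big") & FLOW_LABEL_MASK


def pick_path(src, dst, label, n_paths):
    """Toy ECMP: choose a next hop from (src, dst, flow label)."""
    data = repr((src, dst, label)).encode()
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest[:4], "big") % n_paths


# Hypothetical connection using documentation-prefix addresses.
conn = ("2001:db8::1", 40000, "2001:db8:ffff::1", 443, "tcp")

# Each new seed stands for one rehash after an RTO or SYN_RTO event.
paths = []
for seed in range(8):
    label = flow_label(conn, seed)
    paths.append(pick_path(conn[0], conn[2], label, n_paths=4))

# The path index may differ from one seed to the next.
print(paths)
```

Inside a data center every path leads to the same server, so the jump is harmless; with anycast each path index may be a different site.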

And yes, we were able to detect it on a running system. One of our
developers reported to the NOC team that he was seeing constant connection
failures while pushing a significant amount of data, and that they began
when he started working from home. It turned out that an RTO occurred
during the lifetime of the TCP session, the session jumped to another
stateful L4 balancer and, as a result, timed out. For now we have had to
disable FL load balancing at our border routers and DCI, which resolved
the issue for the time being.
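
The failure mode can be reproduced in a few lines. The Python sketch below is a deliberately simplified model, not our production setup: the class StatefulBalancer, the function deliver, the names lb-a and lb-b, and the rule that the flow label alone selects the balancer are all assumptions made for illustration.

```python
class StatefulBalancer:
    """Toy L4 balancer that only knows connections whose SYN it has seen."""

    def __init__(self, name):
        self.name = name
        self.conntrack = set()

    def handle(self, conn, syn=False):
        if syn:
            self.conntrack.add(conn)
            return "accepted"
        return "forwarded" if conn in self.conntrack else "no state"


def deliver(balancers, conn, flow_label, syn=False):
    """Toy anycast routing: the flow label alone selects the balancer."""
    target = balancers[flow_label % len(balancers)]
    return target.name, target.handle(conn, syn)


balancers = [StatefulBalancer("lb-a"), StatefulBalancer("lb-b")]
conn = ("2001:db8::1", 40000, "2001:db8:ffff::1", 443)

print(deliver(balancers, conn, flow_label=0x10, syn=True))  # ('lb-a', 'accepted')
print(deliver(balancers, conn, flow_label=0x10))            # ('lb-a', 'forwarded')
# After an RTO the sender picks a new label; here it maps to the other balancer.
print(deliver(balancers, conn, flow_label=0x11))            # ('lb-b', 'no state')
```

The last packet belongs to an established session, yet it reaches a balancer that never saw the SYN, which matches the timeouts we observed.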

It would be great if the flow label change after an RTO event could be
removed from the default behavior, while keeping it for SYN_RTO.

On Thu, Nov 19, 2020 at 18:36, Tom Herbert <tom@herbertland.com> wrote:

> On Thu, Nov 19, 2020 at 3:49 AM Alexander Azimov <a.e.azimov@gmail.com>
> wrote:
> >
> > Dear colleagues,
> >
> > I have added both v6ops and tcpm in the cc for a reason; let me
> > explain why.
> >
> > It's clear that we are moving forward with load balancing that uses flow
> > label (FL). And the pressure will increase with SRv6 adoption. But at the
> > moment wide adoption of FL-based load-balancing may create significant
> > issues for TCP Anycast services.
> >
> > RFC6437 suggests putting a hash of the 5-tuple into the FL value. And as
> > far as I know, there is no document that updates this behavior. This
> > description is perfectly fine, but what is implemented in the Linux
> > kernel is different: the FL carries a hash of the 5-tuple with an
> > additional seed, and this seed is randomly changed after each
> > RTO/SYN_RTO event. Here are the related patches:
> >
> > https://lore.kernel.org/netdev/alpine.DEB.2.02.1407012100290.20628@tomh.mtv.corp.google.com/
> > https://lore.kernel.org/netdev/1438124526-2129341-1-git-send-email-tom@herbertland.com/
> > https://lore.kernel.org/netdev/20160928020337.3057238-1-brakmo@fb.com/
> >
> Alexander,
>
> The initial intent of the Linux flow label code was that the hash was
> persistent for the lifetime of the connection; in fact we fixed a
> nasty bug where the hash would change when a connection moved to TW
> state causing some firewalls not to see the final ACK so they can
> remove state. The feature where the flow hash is recalculated for a
> failing connection should be optional and not the default behavior. If
> this isn't what the implementation is doing I can take it up on Linux
> netdev to fix it.
>
> Have you been able to show this behavior is actually happening on a
> running system?
>
> Tom
>
> >
> > This is a great thing, by the way, because in a data center environment
> > with multiple equal paths it gives a way to have pseudo-multipath TCP
> > which jumps between paths in case of an outage. There might be interest
> > in writing an informational document for this.
> >
> > But the problem happens when TCP flows start jumping between anycast
> > points. If an anycast service provides connection tracking (L7/TCP proxy,
> > stateful L4 balancer), the 'jump' may redirect traffic to another point,
> > and with the adoption of FL load balancing by the transit providers - to
> > another location, where no TCP state is available. In other words, an
> > established TCP session breaks after an RTO event. And it's not just a
> > theory - we have already faced this kind of issue in our network, and it
> > is really hard to debug.
> >
> > I wonder what you think is a proper solution:
> >
> > - Making the FL-related RTO change a knob instead of default behavior;
> > - Adding negotiation behavior in TCP;
> > - Something else?
> >
> > I'm looking forward to your advice. If there is a document that
> > describes the above problem - please give me a reference.
> >
> > --
> > Best regards,
> > Alexander Azimov
> > _______________________________________________
> > v6ops mailing list
> > v6ops@ietf.org
> > https://www.ietf.org/mailman/listinfo/v6ops
>

-- 
Best regards,
Alexander Azimov