Re: Pseudorandom Flow Labels

Thomas Narten <narten@us.ibm.com> Tue, 05 April 2011 19:57 UTC

Message-Id: <201104051958.p35JwaJP019044@cichlid.raleigh.ibm.com>
To: Shane Amante <shane@castlepoint.net>
Subject: Re: Pseudorandom Flow Labels
In-reply-to: <BD901061-96AC-4915-B7CE-2BC1F70861A5@castlepoint.net>
References: <BD901061-96AC-4915-B7CE-2BC1F70861A5@castlepoint.net>
Comments: In-reply-to Shane Amante <shane@castlepoint.net> message dated "Tue, 05 Apr 2011 13:22:16 -0600."
Date: Tue, 05 Apr 2011 15:58:36 -0400
From: Thomas Narten <narten@us.ibm.com>
Cc: 6man List <ipv6@ietf.org>

At the core, my concern with "pseudo random" is that I (as an
implementor) do not know what I have to do to satisfy that requirement
and I think in some cases that cost is higher than justified by the
benefit. The current documents do not provide enough concrete
guidance, IMO. If we want folk to actually set the Flow Label to a
good value, tell them what to do. Don't make them think too
hard. Provide them with a suggested algorithm. Don't just say "make it
pseudo random".

> 1) RFC 6056: Recommendations for Transport Protocol Randomization,
>   in particular Appendix A.  In short, a lot of [deployed]
>   implementations are already computing a pseudo-random value for a
>   transport protocol port (within the ephemeral port range), today,
>   so IMHO the implementation "burden" of choosing a pseudo-random
>   value must be quite low and, in theory, there should be good
>   re-use of that code/logic for selecting flow-labels.

This is good stuff. For originating hosts (not TEPs) then, the
documents should reference RFC 6056 (they currently do not), point out
that ports following the recommendation have the necessary property
already, and suggest a way of mapping the port numbers (and whatever
other info would be good) into the Flow Label to be used.

More to the point, if this could be done in a "stateless" manner,
e.g., by the routine that sends IP datagrams, that would be
nice. I.e., if the IP "send" routine could easily set the Flow Label
whenever it was zero, in a "stateless" manner, that would almost
guarantee we'd get widespread implementation.
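
To make the "stateless" idea concrete, here is a minimal sketch of what
such a send-path helper might look like. It is only an illustration
under assumptions of my own: the function name flow_label(), the use of
SHA-256 as the mixing function, and the choice to map a zero result to 1
are all hypothetical, and a real stack would likely use something
cheaper. The point is only that the label can be recomputed from the
5-tuple on every packet, with no per-flow state.

```python
import hashlib
import ipaddress

FLOW_LABEL_BITS = 20  # the IPv6 Flow Label field is 20 bits wide


def flow_label(src, dst, proto, sport, dport):
    """Derive a 20-bit Flow Label from the 5-tuple, keeping no state.

    Hypothetical helper: the hash and the field layout are assumptions,
    not something the current documents specify.
    """
    key = (
        ipaddress.IPv6Address(src).packed
        + ipaddress.IPv6Address(dst).packed
        + bytes([proto])
        + sport.to_bytes(2, "big")
        + dport.to_bytes(2, "big")
    )
    digest = hashlib.sha256(key).digest()
    label = int.from_bytes(digest[:4], "big") & ((1 << FLOW_LABEL_BITS) - 1)
    # Zero means "no label set", so map it to a nonzero value.
    return label or 1


if __name__ == "__main__":
    print(hex(flow_label("2001:db8::1", "2001:db8::2", 6, 49152, 443)))
```

Because the same inputs always give the same output, every packet of a
flow gets the same label without the sender remembering anything.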

The problem I have with "pseudo random" is that it has properties that
I am not convinced the Flow Label necessarily needs to have. And the
complexity of getting real pseudo randomness in a simple
implementation may be too high.

I take it as a given that doing ECMP on the src/dst address gets you
80% of what you need today. Adding in the Flow Label (if set) will
take you much further. I am not convinced you need real "pseudo
randomness" in the Flow Label to get the benefits being called for.
	    
> 2) If we expect that if a intermediate router or switch is using
> *just* the 3-tuple of {src_ip, dst_ip + flow_label} as input-keys to
> compute a load-balancing hash algorithm, then the more random the
> flow-label, the better load-distribution of /all/ traffic will be
> across the LAG and/or ECMP paths.

Understood. But what you need for this is not (necessarily) pseudo
randomness. Just sufficient variability (when combined with the
src/dst) to get good distribution on the output side of the hash.

> a) As a network operator I have no control over the number of IP
>      addresses used by content farms or residential networks.
>      Furthermore, it's not clear that as more & more machines (and,
>      smartphones!) ship that support multiple CPU's and/or multiple
>      cores that each of those discrete computing units will (or,
>      should?) receive their own IP address.  With IPv6 that's at
>      least a possibly, but perhaps due to "ease of development",
>      that won't happen often, or at all.  If we strongly recommend a
>      pseudo-random flow-label and a device is only capable of
>      load-balancing using a 3-tuple, then I've at least got a
>      [relatively] unique flow-label in the packet header to provide
>      load distribution.

I think we can assume that if we use both the src/dst, we will get a
good degree of distribution in the values. Adding the Flow Label gives
more. I am just not convinced that to get good distribution we need to
*require* (or strongly suggest) pseudo randomness in the Flow
Label. We know that by simply incrementing the Flow Label by 1 for
each flow, we get sufficient distribution. That is *way* easier to
implement than something else.
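
For anyone who wants to check this against their own hash, the sketch
below is one way to measure it. Everything in it is an assumption for
illustration: counter_labels() is a hypothetical per-host counter that
skips zero, and path_for() stands in for a router's load-balancing hash
using CRC32 over the 3-tuple, which may not match what any real device
does. spread() just counts how many flows land on each path.

```python
import itertools
import zlib
from collections import Counter

FLOW_LABEL_MAX = 0xFFFFF  # largest value that fits in the 20-bit field


def counter_labels(start=1):
    """Yield start, start+1, ... wrapping at 20 bits and skipping zero."""
    label = start
    while True:
        yield label
        label = label + 1 if label < FLOW_LABEL_MAX else 1


def path_for(src, dst, label, n_paths):
    """Hypothetical router hash: CRC32 over {src, dst, label} mod paths."""
    key = src + dst + label.to_bytes(3, "big")
    return zlib.crc32(key) % n_paths


def spread(src, dst, n_flows, n_paths):
    """Count flows per path when labels come from a simple counter."""
    labels = itertools.islice(counter_labels(), n_flows)
    return Counter(path_for(src, dst, lbl, n_paths) for lbl in labels)


if __name__ == "__main__":
    src = bytes(15) + b"\x01"  # packed 16-byte addresses
    dst = bytes(15) + b"\x02"
    print(sorted(spread(src, dst, 1000, 4).items()))
```

Swapping a different function into path_for() lets you see how a given
hash treats sequential labels between a single src/dst pair.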

> b) The reason that we observe in today's production networks very
>      good load-distribution over LAG and/or ECMP paths is most
>      likely the result of, first, hosts selecting a pseudo-random
>      value for their ephemeral ports (based on RFC 6056) and,
>      second, the ability for intermediate routers/switches to use
>      the traditional 5-tuple for input-keys for load-distribution on
>      those paths.  Since we would like to move in a direction of
>      LAG/ECMP load-balancing based on just the 3-tuple of {src_ip,
>      dst_ip, flow_label}, then we should not take a step backwards
>      from where we are today wrt pseudorandom-ness of individual
>      flows.

This takes me back to an earlier point. If the port numbers are
already pseudo random, then provide me with a simple algorithm for
mapping them into a good Flow Label value.

> 3) Finally, if we expect that the flow-label may (or, hopefully
>  will) get used as a lightweight method of detecting and, possibly,
>  preventing 3rd-party DoS or traffic injection attacks, (i.e.:
>  draft-gont-6man-flowlabel-security), then it depends on generation
>  of pseudo-random values for flow-labels in order that off-path
>  attackers have a reasonably low chance of guessing a valid
>  flow-label.

I'm less convinced that there are real, significant DoS attacks that
using a pseudo-random flow label can address.

Now. What about TEPs? They have no way of knowing whether the port
numbers being used provide proper randomness. That implies if they are
going to generate pseudo-random Flow Labels, they have a *lot* more
work to do. And to be sure that all packets from a given "flow" are
given the same Flow Label, they may have to maintain state, i.e., so
that subsequent packets from the same flow get assigned the same Flow
Label value. I doubt you are suggesting that. But the current
documents leave this all to the reader. Again, I think some concrete
recommendations are in order. If you think TEPs will, in fact, just
produce a Flow Label value based on whatever ports the packets being
tunneled contain, then just say so and be done with it. But don't
require that they produce "pseudo random" values if in fact, we are
pretty sure TEPs won't actually implement this.
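
As a rough sketch of the "just say so" option, a TEP could compute the
outer Flow Label directly from the inner packet it is about to
encapsulate. The code below assumes an inner IPv6 packet with no
extension headers, reads ports only for TCP and UDP, and uses CRC32 as
the mixing function. The name tep_flow_label() and all of those choices
are mine, for illustration only. It keeps no state and makes no claim
that the result is pseudo random.

```python
import zlib

TCP, UDP = 6, 17
IPV6_HEADER_LEN = 40  # fixed IPv6 header; extension headers are ignored


def tep_flow_label(inner):
    """Outer Flow Label for a tunneled IPv6 packet, computed statelessly.

    Hypothetical helper: assumes 'inner' is raw IPv6 bytes and that the
    transport header, if any, directly follows the fixed header.
    """
    if len(inner) < IPV6_HEADER_LEN or inner[0] >> 4 != 6:
        raise ValueError("not an IPv6 packet")
    next_header = inner[6]
    # Source address, destination address, and protocol of the inner flow.
    key = inner[8:40] + bytes([next_header])
    if next_header in (TCP, UDP) and len(inner) >= IPV6_HEADER_LEN + 4:
        key += inner[40:44]  # source and destination ports
    label = zlib.crc32(key) & 0xFFFFF
    # Zero means "no label set", so map it to a nonzero value.
    return label or 1


if __name__ == "__main__":
    header = bytes([0x60, 0, 0, 0, 0, 8, UDP, 64])
    header += bytes(15) + b"\x01" + bytes(15) + b"\x02"
    ports = (5000).to_bytes(2, "big") + (53).to_bytes(2, "big")
    print(hex(tep_flow_label(header + ports + b"data")))
```

Two packets of the same inner flow give the same outer label because
only the addresses, protocol, and ports feed the hash, not the payload.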

Thomas