Re: [Int-area] [Softwires] Is it feasible to perform fragmentation on UDP encapsulated packets.

otroan@employees.org Fri, 27 May 2016 10:50 UTC

From: otroan@employees.org
In-Reply-To: <1FEE3F8F5CCDE64C9A8E8F4AD27C19EE0D5596E0@NKGEML515-MBS.china.huawei.com>
Date: Fri, 27 May 2016 12:50:02 +0200
Message-Id: <8790AF6F-CCD6-43AC-A50E-957B037643F1@employees.org>
References: <E83B905A-FF6D-4996-B71A-7921DE4B133B@ericsson.com> <BFC09F5C-D6DF-4B6B-AA95-03919B9F09FB@cisco.com> <573E2A0E.1060609@isi.edu> <1FEE3F8F5CCDE64C9A8E8F4AD27C19EE0D54EB60@NKGEML515-MBX.china.huawei.com> <573F453C.5060908@isi.edu> <1FEE3F8F5CCDE64C9A8E8F4AD27C19EE0D554B73@NKGEML515-MBS.china.huawei.com> <5743303C.5040109@isi.edu> <1FEE3F8F5CCDE64C9A8E8F4AD27C19EE0D55514C@NKGEML515-MBS.china.huawei.com> <5743DD16.3050506@isi.edu> <1FEE3F8F5CCDE64C9A8E8F4AD27C19EE0D555482@NKGEML515-MBS.china.huawei.com> <57448C14.2060203@isi.edu> <1FEE3F8F5CCDE64C9A8E8F4AD27C19EE0D5557DE@NKGEML515-MBS.china.huawei.com> <9c462520-eb8e-fcd0-0a08-228f80fbc779@isi.edu> <1FEE3F8F5CCDE64C9A8E8F4AD27C19EE0D5596E0@NKGEML515-MBS.china.huawei.com>
To: Xuxiaohu <xuxiaohu@huawei.com>
X-Mailer: Apple Mail (2.3124)
Archived-At: <http://mailarchive.ietf.org/arch/msg/int-area/pFzp5w8ZQ7Fg-nUjZvnIy1vYJE4>
Cc: Softwires WG <softwires@ietf.org>, "nvo3@ietf.org" <nvo3@ietf.org>, "int-area@ietf.org" <int-area@ietf.org>, "lisp@ietf.org" <lisp@ietf.org>, "tsvwg@ietf.org" <tsvwg@ietf.org>
Subject: Re: [Int-area] [Softwires] Is it feasible to perform fragmentation on UDP encapsulated packets.

> <Note that I have changed the subject of the email hence it has nothing to do with the WG adoption call now. It's just a discussion on a particular issue which is related to those WGs which are working on UDP tunnels. The reason for containing the old email is to use it as a background which may be useful for better understanding of this particular issue>
> 
> The possible side-effect of performing fragmentation on UDP encapsulated packets is to worsen the reassembly burden on tunnel egress since fragments of UDP encapsulated packets are more likely to be forwarded across different paths towards the tunnel egress than those of IP or GRE encapsulated packets.
> 
> It seems that most X-over-UDP proposals choose to prohibit the tunnel ingress from performing fragmentation on UDP encapsulated packets. See the following quoted text regarding fragmentation from those X-over-UDP drafts:
> 
> LISP:
> 
> When an ITR receives a packet from a site-facing interface and adds H
>   octets worth of encapsulation to yield a packet size greater than L
>   octets, it resolves the MTU issue by first splitting the original
>   packet into 2 equal-sized fragments.  A LISP header is then prepended
>   to each fragment.
> 
> VXLAN:
> 
> VTEPs MUST NOT fragment VXLAN packets.  Intermediate routers may
>   fragment encapsulated VXLAN packets due to the larger frame size.
>   The destination VTEP MAY silently discard such VXLAN fragments.
> 
> VXLAN-GPE:
> 
> VTEPs MUST never fragment an encapsulated VXLAN GPE packet, and when
>   the outer IP header is IPv4, VTEPs MUST set the DF bit in the outer
>   IPv4 header.
> 
> Geneve:
> 
>   To prevent fragmentation and maximize performance, the best practice
>   when using Geneve is to ensure that the MTU of the physical network
>   is greater than or equal to the MTU of the encapsulated network plus
>   tunnel headers.
> 
> GUE:
> 
>    If a packet is fragmented before encapsulation in GUE, all the
>    related fragments must be encapsulated using the same source port
>    (inner flow identifier). An operator may set MTU to account for
>    encapsulation overhead and reduce the likelihood of fragmentation.
> 
> GRE/UDP:
> 
> Regarding packet fragmentation, an encapsulator/decapsulator SHOULD
>   be compliant with [RFC7588] and perform fragmentation before the
>   encapsulation.
> 
> However, the above choice seems to conflict with the requirements described in https://tools.ietf.org/html/draft-ietf-intarea-tunnels-02
> 
> 
> I wonder whether the IETF should reach a consensus on whether fragmentation of UDP encapsulated packets should be allowed.

Having just implemented fragmentation and reassembly support in a MAP BR...

MAP runs IPv4-over-IPv6 tunnels with shared IPv4 addresses (i.e. routing on ports).

- Inner IPv4 fragmentation is relatively easy, because you don't need to reassemble; you can do virtual reassembly.
- Outer IPv6 reassembly is much harder, because you have to do it on the tunnel egress.
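
To illustrate what virtual reassembly means here, a minimal Python sketch follows. It is not the VPP code; the class name, method name and key layout are hypothetical. The idea it assumes: only the first fragment carries the L4 port, so the port is cached per datagram and later fragments are forwarded with the cached value, without ever rebuilding the packet.

```python
from typing import Dict, Optional, Tuple

# Hypothetical key: (src, dst, protocol, IPv4 identification)
FlowKey = Tuple[str, str, int, int]


class VirtualReassembler:
    """Remembers the L4 port seen in the first fragment of each datagram."""

    def __init__(self) -> None:
        self._ports: Dict[FlowKey, int] = {}

    def port_for(self, key: FlowKey, frag_offset: int, more_fragments: bool,
                 l4_port: Optional[int] = None) -> Optional[int]:
        """Return the port to route this fragment on, or None if unknown."""
        if frag_offset == 0:
            # First (or only) fragment: the L4 header is present.
            if more_fragments:
                self._ports[key] = l4_port
            return l4_port
        port = self._ports.get(key)
        # Assumes in-order arrival: the last fragment releases the state.
        # A real implementation would also age entries out on a timer.
        if port is not None and not more_fragments:
            del self._ports[key]
        return port
```

A fragment that arrives before its first fragment gets None back, and the caller has to drop or queue it. No payload is buffered, which is why this side is the easy one.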

It is not possible to implement reassembly that complies with the IETF RFCs.
The reassembly timeout is specified as 15 seconds for IPv4 and 60 seconds for IPv6.
My implementation does upwards of 100 Mpps; you do the numbers.
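
For anyone who wants the numbers spelled out, here is a back-of-the-envelope sketch in Python. The 100 Mpps rate and the 15 and 60 second timeouts are from the text above; the 1280 bytes held per fragment is purely an assumption for illustration (the IPv6 minimum MTU), and the worst case assumed is that every arriving packet is a fragment whose siblings never show up.

```python
def worst_case_reassembly_state(pps: int, timeout_s: int,
                                bytes_per_fragment: int):
    """Return (pending fragments, bytes buffered) if nothing ever completes."""
    pending = pps * timeout_s
    return pending, pending * bytes_per_fragment


PPS = 100_000_000          # "upwards of 100 Mpps"
ASSUMED_FRAGMENT = 1280    # assumption: bytes held per fragment

for timeout in (15, 60):
    pending, nbytes = worst_case_reassembly_state(PPS, timeout, ASSUMED_FRAGMENT)
    print(f"{timeout:2d} s timeout: {pending:,} fragments, {nbytes / 1e12:.2f} TB")
```

Under those assumptions the 60 second timer means holding state for 6 billion fragments, several terabytes of buffer, at a single tunnel egress.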

If you can:
 - Guarantee that fragments always arrive in sequence
 - Limit the fragment chain to a maximum of 2
 - Keep the maximum time and packet gap between first and last fragment _very_small_

Then sure, it is implementable; if not, you're just setting yourself up for a DoS attack.
And if you have that much control over the environment, just increase the MTU in the tunnel domain.
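
A rough Python sketch of what reassembly under those three constraints could look like. This is an illustration only, not the ip6_map.c code; the class, the `max_gap` parameter and the use of byte offsets (the real header counts 8-octet units) are assumptions made to keep it short.

```python
from collections import OrderedDict
from typing import Optional, Tuple

# Hypothetical key: (src, dst, fragment identification)
Key = Tuple[str, str, int]


class TwoFragmentReassembler:
    """Accepts only in-order, two-fragment chains within a small packet gap."""

    def __init__(self, max_gap: int = 4) -> None:
        self.max_gap = max_gap   # packets allowed between first and last
        self._seen = 0           # running packet counter
        self._pending: "OrderedDict[Key, Tuple[int, bytes]]" = OrderedDict()

    def push(self, key: Key, offset: int, more: bool,
             payload: bytes) -> Optional[bytes]:
        self._seen += 1
        # Evict first fragments that have waited longer than max_gap packets.
        while self._pending:
            oldest, (arrived, _) = next(iter(self._pending.items()))
            if self._seen - arrived <= self.max_gap:
                break
            del self._pending[oldest]
        if offset == 0 and not more:
            return payload                      # not fragmented at all
        if offset == 0:
            self._pending.pop(key, None)        # keep arrival order in the dict
            self._pending[key] = (self._seen, payload)
            return None
        first = self._pending.pop(key, None)
        # Drop anything out of order, overlapping, or longer than two fragments.
        if first is None or more or offset != len(first[1]):
            return None
        return first[1] + payload
```

State is bounded by `max_gap` entries no matter what the sender does, which is the whole point: anything that does not fit the constraints is dropped instead of being held for 60 seconds.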

Code is here:
https://git.fd.io/cgit/vpp/tree/vnet/vnet/map/ip6_map.c#n530

So in short, the IETF can say whatever it likes; that's not going to change reality.

Best regards,
Ole