Re: [Int-area] [tsvwg] Is it feasible to perform fragmentation on UDP encapsulated packets.

Joe Touch <touch@isi.edu> Wed, 15 June 2016 20:13 UTC

To: Lloyd Wood <lloyd.wood@yahoo.co.uk>, Softwires WG <softwires@ietf.org>, "int-area@ietf.org" <int-area@ietf.org>, "lisp@ietf.org" <lisp@ietf.org>, "tsvwg@ietf.org" <tsvwg@ietf.org>, Xuxiaohu <xuxiaohu@huawei.com>, "nvo3@ietf.org" <nvo3@ietf.org>
From: Joe Touch <touch@isi.edu>
Message-ID: <5761B6AF.8030107@isi.edu>
Date: Wed, 15 Jun 2016 13:12:31 -0700
Archived-At: <https://mailarchive.ietf.org/arch/msg/int-area/f7dhLaynuTbnZWdJhLw9TWAIngE>


On 6/11/2016 10:11 PM, Lloyd Wood wrote:
> Fragmentation should be strongly discouraged.

Avoided where possible for sure.
>
> if you've designed a tunnelling solution and you have fragmentation
> happening as a matter of course, you've designed it wrong.

If you have a solution that recurses (X over X, with some intervening
layers or not), and X has a required minimum message size, then you
either have a solution that - in at least some non-trivial situations -
will fragment as a matter of course or a fantasy that cannot exist.

The only reason we avoid this in typical operation is by avoiding X over
X. Once you have a tunnel, that might not be avoidable (or even
detectable, depending on encryption).
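
To make the recursion argument concrete, here is a small arithmetic sketch. It assumes X is IPv6, which requires every link to carry 1280-byte packets and has a 40-byte fixed header; the function names and the uniform per-level overhead are illustrative assumptions, not anything specified in this thread.

```python
IPV6_MIN_MTU = 1280  # minimum MTU every IPv6 link must support
IPV6_HEADER = 40     # fixed IPv6 header size in bytes


def outer_packet_size(inner_size: int, depth: int, overhead: int = IPV6_HEADER) -> int:
    """Size of a packet after `depth` levels of X-over-X encapsulation."""
    return inner_size + depth * overhead


def must_fragment(path_mtu: int, depth: int) -> bool:
    """True if an inner packet of the required minimum size no longer
    fits the path MTU once the encapsulation headers are added."""
    return outer_packet_size(IPV6_MIN_MTU, depth) > path_mtu


if __name__ == "__main__":
    for depth in range(0, 7):
        print(depth, outer_packet_size(IPV6_MIN_MTU, depth), must_fragment(1500, depth))
```

With a path that offers only the 1280-byte minimum, a single level of IPv6-in-IPv6 already produces 1320 bytes, so the tunnel either fragments or cannot present the required minimum to its inner layer.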

Joe

>
> Lloyd Wood
> lloyd.wood@yahoo.co.uk
>
> On Friday, May 27, 2016, 8:10 PM, Xuxiaohu <xuxiaohu@huawei.com> wrote:
>
>     <Note that I have changed the subject of the email, so it no longer
>     has anything to do with the WG adoption call. It is just a
>     discussion of a particular issue relevant to those WGs
>     that are working on UDP tunnels. The old email is included
>     as background that may help in understanding this
>     particular issue>
>
>     The possible side-effect of performing fragmentation on UDP
>     encapsulated packets is to worsen the reassembly burden on tunnel
>     egress since fragments of UDP encapsulated packets are more likely
>     to be forwarded across different paths towards the tunnel egress
>     than those of IP or GRE encapsulated packets.
>
>     It seems that most X-over-UDP proposals choose to prohibit the
>     tunnel ingress from performing fragmentation on UDP encapsulated
>     packets. See the following quoted text regarding fragmentation
>     from those X-over-UDP drafts:
>
>     LISP:
>
>     When an ITR receives a packet from a site-facing interface and adds H
>       octets worth of encapsulation to yield a packet size greater than L
>       octets, it resolves the MTU issue by first splitting the original
>       packet into 2 equal-sized fragments.  A LISP header is then
>       prepended to each fragment.
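
The LISP rule quoted above can be sketched at the level of sizes only. This is a simplified model: it treats the packet as opaque bytes and ignores the IP fragment headers and 8-byte offset alignment that real fragmentation needs. The function name, the placeholder header bytes, and the 36-byte overhead in the usage note are assumptions for illustration.

```python
from typing import List


def encapsulate_with_split(packet: bytes, encap_header: bytes, limit: int) -> List[bytes]:
    """Prepend `encap_header` (H octets) to `packet`; if the result would
    exceed `limit` (L octets), split the packet into two halves first and
    prepend the header to each half."""
    if len(packet) + len(encap_header) <= limit:
        return [encap_header + packet]
    half = (len(packet) + 1) // 2  # first half takes the extra byte if odd
    return [encap_header + packet[:half], encap_header + packet[half:]]


if __name__ == "__main__":
    pieces = encapsulate_with_split(b"x" * 1500, b"h" * 36, 1500)
    print([len(p) for p in pieces])
```

For a 1500-byte packet, an assumed 36 bytes of encapsulation, and L = 1500, this yields two 786-byte packets instead of one 1536-byte packet.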
>
>     VXLAN:
>
>     VTEPs MUST NOT fragment VXLAN packets.  Intermediate routers may
>       fragment encapsulated VXLAN packets due to the larger frame size.
>       The destination VTEP MAY silently discard such VXLAN fragments.
>
>     VXLAN-GPE:
>
>     VTEPs MUST never fragment an encapsulated VXLAN GPE packet, and when
>       the outer IP header is IPv4, VTEPs MUST set the DF bit in the outer
>       IPv4 header.
>
>     GENEVE:
>
>       To prevent fragmentation and maximize performance, the best practice
>       when using Geneve is to ensure that the MTU of the physical network
>       is greater than or equal to the MTU of the encapsulated network plus
>       tunnel headers.
>
>     GUE:
>
>         If a packet is fragmented before encapsulation in GUE, all the
>         related fragments must be encapsulated using the same source port
>         (inner flow identifier). An operator may set MTU to account for
>         encapsulation overhead and reduce the likelihood of fragmentation.
>
>     GRE/UDP:
>
>     Regarding packet fragmentation, an encapsulator/decapsulator SHOULD
>       be compliant with [RFC7588] and perform fragmentation before the
>       encapsulation.
>
>     However, the above choice seems to conflict with the requirements
>     described in https://tools.ietf.org/html/draft-ietf-intarea-tunnels-02
>
>
>     I wonder whether the IETF should reach a consensus on whether or
>     not the fragmentation on UDP encapsulated packets should be allowed.
>      
>     Best regards,
>     Xiaohu
>
>     > -----Original Message-----
>     > From: Xuxiaohu
>     > Sent: Thursday, May 26, 2016 4:35 PM
>     > To: 'Joe Touch'; Fred Baker (fred); Wassim Haddad
>     > Cc: int-area@ietf.org
>     > Subject: RE: [Int-area] Call for adoption of draft-xu-intarea-ip-in-udp-03
>     >
>     >
>     >
>     > > -----Original Message-----
>     > > From: Joe Touch [mailto:touch@isi.edu]
>     > > Sent: Thursday, May 26, 2016 2:11 AM
>     > > To: Xuxiaohu; Fred Baker (fred); Wassim Haddad
>     > > Cc: int-area@ietf.org
>     > > Subject: Re: [Int-area] Call for adoption of draft-xu-intarea-ip-in-udp-03
>     > >
>     > >
>     > >
>     > > On 5/24/2016 7:24 PM, Xuxiaohu wrote:
>     > > > Hi Joe,
>     > > >
>     > > > Is the point of your example that, no matter what improvements
>     > > > we make to this draft, those who instinctively dislike it will
>     > > > still dislike it in the end?
>     > >
>     > > The only improvements that would make this doc useful would be to add
>     > > capabilities already in GRE/UDP or GUE/UDP, which we already have.
>     >
>     > Let's go over the four things you mentioned earlier in GRE/UDP and GUE/UDP:
>     >
>     >     - stronger checksums
>     >
>     > In GRE/UDP, in order to use UDP-zero-checksum, it gave the following
>     > restrictions:
>     > " 6. UDP Checksum Handling
>     >
>     >    6.1. UDP Checksum with IPv4
>     >
>     >    For UDP in IPv4, the UDP checksum MUST be processed as specified in
>     >    [RFC768] and [RFC1122] for both transmit and receive. The IPv4
>     >
>     > Yong, Crabber, Xu, Herbert                                    [Page 12]
>     > --------------------------------------------------------------------------------
>     >
>     > Internet-Draft          GRE-in-UDP Encapsulation            March 2016
>     >
>     >    header includes a checksum which protects against mis-delivery of
>     >    the packet due to corruption of IP addresses. The UDP checksum
>     >    potentially provides protection against corruption of the UDP header,
>     >    GRE header, and GRE payload. Disabling the use of checksums is a
>     >    deployment consideration that should take into account the risk and
>     >    effects of packet corruption.
>     >
>     >    When a decapsulator receives a packet, the UDP checksum field MUST
>     >    be processed. If the UDP checksum is non-zero, the decapsulator MUST
>     >    verify the checksum before accepting the packet. By default a
>     >    decapsulator SHOULD accept UDP packets with a zero checksum. A node
>     >    MAY be configured to disallow zero checksums per [RFC1122]; this may
>     >    be done selectively, for instance disallowing zero checksums from
>     >    certain hosts that are known to be sending over paths subject to
>     >    packet corruption. If verification of a non-zero checksum fails, a
>     >    decapsulator lacks the capability to verify a non-zero checksum, or
>     >    a packet with a zero-checksum was received and the decapsulator is
>     >    configured to disallow, the packet MUST be dropped and an event MAY
>     >    be logged.
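
The IPv4 decapsulator rules quoted in 6.1 reduce to a short decision function. The sketch below is one reading of that text, not an implementation from the draft: the function name, the parameters, and the idea that the caller supplies the verification result are all assumptions.

```python
def accept_gre_udp_ipv4(checksum: int,
                        checksum_ok: bool,
                        can_verify: bool = True,
                        allow_zero: bool = True) -> bool:
    """Return True to accept the packet, False to drop it.

    checksum    -- value of the UDP checksum field as received
    checksum_ok -- result of verifying a non-zero checksum (computed by the caller)
    can_verify  -- whether this decapsulator is able to verify checksums
    allow_zero  -- local policy; the quoted default is to accept zero checksums
    """
    if checksum == 0:
        # Zero checksum: accepted by default, dropped if configured to disallow.
        return allow_zero
    if not can_verify:
        # Non-zero checksum that cannot be verified: MUST be dropped.
        return False
    # Non-zero checksum: MUST be verified before accepting.
    return checksum_ok


if __name__ == "__main__":
    print(accept_gre_udp_ipv4(0, checksum_ok=False))
    print(accept_gre_udp_ipv4(0x1A2B, checksum_ok=False))
```

Selective policy, such as disallowing zero checksums from particular hosts, would amount to choosing `allow_zero` per source address before calling the function.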
>     >
>     >    6.2. UDP Checksum with IPv6
>     >
>     >    For UDP in IPv6, the UDP checksum MUST be processed as specified in
>     >    [RFC768] and [RFC2460] for both transmit and receive.
>     >
>     >    When UDP is used over IPv6, the UDP checksum is relied upon to
>     >    protect both the IPv6 and UDP headers from corruption. As such, a
>     >    default GRE-in-UDP Tunnel MUST perform UDP checksum; a TMCE
>     >    GRE-in-UDP Tunnel MAY be configured with the UDP zero-checksum mode
>     >    if the traffic-managed controlled environment or a set of closely
>     >    cooperating traffic-managed controlled environments (such as by
>     >    network operators who have agreed to work together in order to
>     >    jointly provide specific services) meet at least one of the following
>     >    conditions:
>     >
>     >    a. It is known (perhaps through knowledge of equipment types and
>     >      lower layer checks) that packet corruption is exceptionally
>     >      unlikely and where the operator is willing to take the risk of
>     >      undetected packet corruption.
>     >
>     >    b. It is judged through observational measurements (perhaps of
>     >      historic or current traffic flows that use a non-zero checksum)
>     >      that the level of packet corruption is tolerably low and where
>     >      the operator is willing to take the risk of undetected packet
>     >      corruption.
>     >
>     >    c. Carrying applications that are tolerant of mis-delivered or
>     >      corrupted packets (perhaps through higher layer checksum,
>     >      validation, and retransmission or transmission redundancy) where
>     >      the operator is willing to rely on the applications using the
>     >      tunnel to survive any corrupt packets.
>     >
>     >    The following requirements apply to a TMCE GRE-in-UDP tunnel that
>     >    uses UDP zero-checksum mode:
>     >
>     >      a. Use of the UDP checksum with IPv6 MUST be the default
>     >        configuration of all GRE-in-UDP tunnels.
>     >
>     >      b. The GRE-in-UDP tunnel implementation MUST comply with all
>     >        requirements specified in Section 4 of [RFC6936] and with
>     >        requirement 1 specified in Section 5 of [RFC6936].
>     >
>     >      c. The tunnel decapsulator SHOULD only allow the use of UDP
>     >        zero-checksum mode for IPv6 on a single received UDP Destination
>     >        Port regardless of the encapsulator. The motivation for this
>     >        requirement is possible corruption of the UDP Destination Port,
>     >        which may cause packet delivery to the wrong UDP port. If that
>     >        other UDP port requires the UDP checksum, the mis-delivered
>     >        packet will be discarded.
>     >
>     >      d. It is RECOMMENDED that the UDP zero-checksum mode for IPv6 is
>     >        only enabled for certain selected source addresses. The tunnel
>     >        decapsulator MUST check that the source and destination IPv6
>     >        addresses are valid for the GRE-in-UDP tunnel on which the
>     >        packet was received if that tunnel uses UDP zero-checksum mode
>     >        and discard any packet for which this check fails.
>     >
>     >      e. The tunnel encapsulator SHOULD use different IPv6 addresses for
>     >        each GRE-in-UDP tunnel that uses UDP zero-checksum mode
>     >        regardless of the decapsulator in order to strengthen the
>     >        decapsulator's check of the IPv6 source address (i.e., the same
>     >        IPv6 source address SHOULD NOT be used with more than one IPv6
>     >        destination address, independent of whether that destination
>     >        address is a unicast or multicast address). When this is not
>     >        possible, it is RECOMMENDED to use each source IPv6 address for
>     >        as few UDP zero-checksum mode GRE-in-UDP tunnels as is feasible.
>     >
>     >      f. When any middlebox exists on the path of a GRE-in-UDP tunnel,
>     >        it is RECOMMENDED to use the default mode, i.e. use UDP
>     >        checksum, to reduce the chance that the encapsulated packets
>     >        will be dropped.
>     >
>     >      g. Any middlebox that allows the UDP zero-checksum mode for IPv6
>     >        MUST comply with requirement 1 and 8-10 in Section 5 of
>     >        [RFC6936].
>     >
>     >      h. Measures SHOULD be taken to prevent IPv6 traffic with zero UDP
>     >        checksums from "escaping" to the general Internet; see Section
>     >        8 for examples of such measures.
>     >
>     >      i. IPv6 traffic with zero UDP checksums MUST be actively monitored
>     >        for errors by the network operator. For example, the operator
>     >        may monitor Ethernet layer packet error rates.
>     >
>     >      j. If a packet with a non-zero checksum is received, the checksum
>     >        MUST be verified before accepting the packet. This is
>     >        regardless of whether the tunnel encapsulator and decapsulator
>     >        have been configured with UDP zero-checksum mode.
>     >
>     >    The above requirements do not change either the requirements
>     >    specified in [RFC2460] as modified by [RFC6935] or the requirements
>     >    specified in [RFC6936].
>     >
>     >    The requirement to check the source IPv6 address in addition to the
>     >    destination IPv6 address, plus the strong recommendation against
>     >    reuse of source IPv6 addresses among GRE-in-UDP tunnels collectively
>     >    provide some mitigation for the absence of UDP checksum coverage of
>     >    the IPv6 header. A traffic-managed controlled environment that
>     >    satisfies at least one of three conditions listed above in this
>     >    section provides additional assurance.
>     >
>     >    A GRE-in-UDP tunnel is suitable for transmission over lower layers
>     >    in the traffic-managed controlled environments that are allowed by
>     >    the exceptions stated above and the rate of corruption of the inner
>     >    IP packet on such networks is not expected to increase by comparison
>     >    to GRE traffic that is not encapsulated in UDP.  For these reasons,
>     >    GRE-in-UDP does not provide an additional integrity check except
>     >    when GRE checksum is used when UDP zero-checksum mode is used with
>     >    IPv6, and this design is in accordance with requirements 2, 3 and 5
>     >    specified in Section 5 of [RFC6936].
>     >
>     >    Generic Router Encapsulation (GRE) does not accumulate incorrect
>     >    state as a consequence of GRE header corruption. A corrupt GRE
>     >    packet may result in either packet discard or forwarding of the
>     >    packet without accumulation of GRE state. Active monitoring of
>     >    GRE-in-UDP traffic for errors is REQUIRED as occurrence of errors will
>     >    result in some accumulation of error information outside the
>     >    protocol for operational and management purposes. This design is in
>     >    accordance with requirement 4 specified in Section 5 of [RFC6936].
>     >
>     >    The remaining requirements specified in Section 5 of [RFC6936] are
>     >    not applicable to GRE-in-UDP.  Requirements 6 and 7 do not apply
>     >    because GRE does not include a control feedback mechanism.
>     >    Requirements 8-10 are middlebox requirements that do not apply to
>     >    GRE-in-UDP tunnel endpoints (see Section 7.1 for further middlebox
>     >    discussion).
>     >
>     >    It is worth mentioning that the use of a zero UDP checksum should
>     >    present the equivalent risk of undetected packet corruption when
>     >    sending similar packet using GRE-in-IPv6 without UDP [RFC7676] and
>     >    without GRE checksums.
>     >
>     >    In summary, a TMCE GRE-in-UDP Tunnel is allowed to use
>     >    UDP-zero-checksum mode for IPv6 when the conditions and requirements
>     >    stated above are met. Otherwise the UDP checksum needs to be used for
>     >    IPv6 as specified in [RFC768] and [RFC2460]. Use of GRE checksum is
>     >    RECOMMENDED when the UDP checksum is not used.
>     > "
>     >
>     > In GUE, to support UDP-checksum-zero, it said
>     >
>     > " Therefore, when GUE is used over
>     >    IPv6, either the UDP checksum must be enabled or the GUE header
>     >    checksum must be used.  An encapsulator MAY set a zero UDP checksum
>     >    for performance or implementation reasons, in which case the GUE
>     >    header checksum MUST be used or applicable requirements for using
>     >    zero UDP checksums in [GREUDP] MUST be met. If the UDP checksum is
>     >    enabled, then the GUE header checksum should not be used since it is
>     >    mostly redundant."
>     >
>     > It's easy for me to add similar words to the IP-in-UDP draft, such as "the
>     > applicable requirements for using zero UDP checksum in [GREUDP] MUST be
>     > met when zero UDP checksum is used by the tunnel ingress". However, the
>     > major goal of disabling the UDP checksum is to improve performance.
>     > When the GUE header checksum is used and/or the set of applicable
>     > requirements described in GRE/UDP has to be verified, is the goal of
>     > improving performance still achievable? If not, why not simply enable
>     > the UDP checksum instead?
>     >
>     >     - fragmentation support
>     >
>     > In GRE/UDP, it said
>     >
>     > " 4.1. MTU and Fragmentation
>     >
>     >    Regarding packet fragmentation, an encapsulator/decapsulator SHOULD
>     >    be compliant with [RFC7588] and perform fragmentation before the
>     >    encapsulation. The size of fragments SHOULD be less or equal to the
>     >    PMTU associated with the path between the GRE ingress and the GRE
>     >    egress tunnel endpoints minus the GRE and UDP overhead ..."
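
As a worked example of the size budget in the quoted 4.1 text, the sketch below computes the largest inner fragment that fits once the encapsulation headers are added. The quoted text names only the GRE and UDP overhead; subtracting the outer IP header as well is an assumption made here, as are the function name and the default header sizes (4-byte base GRE header with no optional fields, 8-byte UDP header, 20-byte IPv4 header without options).

```python
UDP_HEADER = 8        # UDP header size in bytes
GRE_BASE_HEADER = 4   # GRE header without checksum, key or sequence fields


def max_inner_fragment_size(pmtu: int,
                            outer_ip_header: int = 20,
                            gre_header: int = GRE_BASE_HEADER) -> int:
    """Largest inner packet (or fragment) that still fits in `pmtu`
    after the outer IP, UDP and GRE headers are prepended."""
    return pmtu - outer_ip_header - UDP_HEADER - gre_header


if __name__ == "__main__":
    print(max_inner_fragment_size(1500))                      # IPv4 outer header
    print(max_inner_fragment_size(1500, outer_ip_header=40))  # IPv6 outer header
```

An ingress that fragments before encapsulation would cut the inner packet so that no piece exceeds this value; optional GRE fields enlarge `gre_header` and shrink the budget accordingly.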
>     >
>     > In GUE, it said
>     >
>     > " 4.9. MTU and fragmentation
>     >
>     >    Standard conventions for handling of MTU (Maximum Transmission Unit)
>     >    and fragmentation in conjunction with networking tunnels
>     >    (encapsulation of layer 2 or layer 3 packets) should be followed.
>     >    Details are described in MTU and Fragmentation Issues with
>     >    In-the-Network Tunneling [RFC4459]... "
>     >
>     > It seems that the only thing missing from the IP-in-UDP draft is support
>     > for outer fragmentation. However, as stated in
>     > (https://tools.ietf.org/html/draft-ietf-intarea-tunnels-02#page-13),
>     > " ...IPsec performs only Outer Fragmentation; this distinguishes it from
>     > IP-in-IP, which performs only Inner Fragmentation. " Note that IP-in-IP
>     > is the dominant encapsulation choice within Softwires networks. In other
>     > words, performing only inner fragmentation works very well in practice.
>     > Furthermore, the outer fragmentation issue (e.g., reassembly cost for the
>     > egress) would become even worse since the fragments of X-in-UDP packets
>     > are more likely to be forwarded across different paths than those of
>     > X-in-IP and X-in-GRE packets. Hence, I'm wondering whether it's
>     > worthwhile to support outer fragmentation of UDP encapsulated packets,
>     > which seems useless in practice.
>     >
>     >     - signalling support (e.g., to test whether a tunnel is up or
>     >     to measure MTUs)
>     >
>     > I haven't found any description of this in either GRE/UDP or GUE. Did you?
>     >
>     >     - support for robust ID fields (related to fragmentation,
>     >     e.g., to overcome the limits of IPv4 ID as per RFC 6864)
>     >
>     > I haven't found any description of this in either GRE/UDP or GUE. Did you?
>     >
>     > Xiaohu
>     >
>     > > It is not our obligation to find a way for your document to proceed -
>     > > that onus is on you.
>     > >
>     > > Joe
>