Re: [Int-area] Proxy function for PTB messages on the tunnel end

Vasilenko Eduard <vasilenko.eduard@huawei.com> Wed, 24 March 2021 09:01 UTC

From: Vasilenko Eduard <vasilenko.eduard@huawei.com>
To: Joseph Touch <touch@strayalpha.com>
CC: "v6ops@ietf.org" <v6ops@ietf.org>, int-area <int-area@ietf.org>
Thread-Topic: Proxy function for PTB messages on the tunnel end
Thread-Index: AdcfDpZejD7P5RAGQ06oVS2C5lk8jAACE+sAAAi5WeD//90TAP//s51QgAByGQD//8LNQIAAUjUA//8J3tAARFNWgP//xB4Q//+RtwD//uUYkP/962+A//ucEtD/91rKgP/txlXQ
Date: Wed, 24 Mar 2021 09:00:59 +0000
Message-ID: <d2dffa85fdbc476f95c008a41e65e696@huawei.com>
References: <0b61deabe8f3420eba1b5794b024e914@huawei.com> <A063E98C-0D6C-49B2-B871-E2B39A097FD5@strayalpha.com> <37059faadd6e441cb98f6ec7e01ecef9@huawei.com> <9D23C833-46C5-4B93-A204-D2D4F54689DF@strayalpha.com> <1e6ecd3b468d4255bda65d519190135d@huawei.com> <3B48413C-A47D-4F3F-B9E4-7ED4D33AA66B@strayalpha.com> <22bb7bf129694ccfbbad441d8d22e05c@huawei.com> <A5F62B47-DBA3-457D-89CD-D570EA2EA886@strayalpha.com> <eb63d427f4d34e44908ccee2c2d14073@huawei.com> <F158C443-6E73-4FC6-ADCA-6D28EE8F0A30@strayalpha.com> <d1c8a80b387847a3b00566e3dc0768ab@huawei.com> <D87C00F7-2902-48C4-9DCA-E1019EF32CAA@strayalpha.com> <46be60a38c0f4bc08f352dc8ed353c6a@huawei.com> <4E4C25CB-561C-4BF1-B99B-14E26D00009B@strayalpha.com> <4415086a1b734313b383307a27eb3fb2@huawei.com> <1A41F380-5176-4856-B0FE-BCA065FEAB15@strayalpha.com>
In-Reply-To: <1A41F380-5176-4856-B0FE-BCA065FEAB15@strayalpha.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/int-area/yyHS-Ar6RV6NpxJUx6h5sMQuR3c>
X-Mailman-Approved-At: Fri, 26 Mar 2021 08:29:04 -0700
Subject: Re: [Int-area] Proxy function for PTB messages on the tunnel end

Hi all,
I cannot stop the discussion at this point, because my last message was incomplete: I failed to mention one important point.
A tunnel and an end node operate differently in two respects:

1.       The tunnel has an unlimited buffer that is not counted as a restriction. All vendors assume it is always “big enough”.

2.       The incoming packet is not checked against the buffer size. It is checked against the MTU of this virtual/tunnel interface.
As a consequence, PMTUD can correct the interface MTU, and the next oversized packet then informs (by PTB) the real traffic source. PMTUD works.
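
A minimal sketch of the behaviour described in the two points above, assuming a hypothetical `TunnelInterface` class (the class, its fields, and the numbers are illustrative, not taken from any vendor implementation). A PTB from inside the tunnel only lowers the interface MTU; the traffic source hears about it when its next oversized packet is checked against that MTU.

```python
from dataclasses import dataclass


@dataclass
class TunnelInterface:
    """Hypothetical tunnel ingress; all names here are illustrative."""
    mtu: int        # MTU of the virtual/tunnel interface (limit for inner packets)
    overhead: int   # bytes of outer headers added by encapsulation

    def on_inner_ptb(self, reported_mtu: int) -> None:
        """A PTB arrived from inside the tunnel: lower the interface MTU."""
        self.mtu = min(self.mtu, reported_mtu - self.overhead)

    def forward(self, packet_len: int) -> str:
        """Check an incoming packet against the interface MTU, not a buffer size."""
        if packet_len > self.mtu:
            return f"PTB to source, MTU={self.mtu}"
        return "encapsulate and send"


tun = TunnelInterface(mtu=1460, overhead=40)
print(tun.forward(1460))   # encapsulate and send
tun.on_inner_ptb(1400)     # a router inside the tunnel reports MTU 1400
print(tun.forward(1460))   # PTB to source, MTU=1360
```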

All of the above is broken by draft-ietf-intarea-tunnels:

1.       The buffer is assumed to be small (1500 minus headers).

2.       The incoming packet is checked against the buffer size, not the MTU of the virtual interface.

3.       No feedback is proposed between the interface MTU and the buffer size.
As a consequence, PMTUD cannot propagate through such an interface: PTB feedback from inside the tunnel never triggers PTB feedback toward the traffic source. PMTUD is, in effect, intentionally terminated there.
Hence the need for additional fragmentation when a packet is below the buffer size but above the tunnel interface MTU.
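
To make the "in between" case concrete, here is a small sketch of the model as this message characterizes it (it is not text from draft-ietf-intarea-tunnels). The function name `classify`, its parameters, and the example numbers are assumptions chosen for illustration.

```python
def classify(packet_len: int, tunnel_mtu: int, buffer_size: int) -> str:
    """Outcome for a packet arriving at the tunnel ingress.

    tunnel_mtu  -- largest inner packet that crosses the tunnel unfragmented
    buffer_size -- egress reassembly limit minus outer headers (assumed name)
    """
    if packet_len <= tunnel_mtu:
        return "send unfragmented"
    if packet_len <= buffer_size:
        # The source is never told about tunnel_mtu in this case.
        return "fragment at ingress, reassemble at egress"
    return "PTB to source"


# Assumed numbers: 1500-byte reassembly limit, 40-byte outer IPv6 header,
# and a path inside the tunnel that carries at most 1400-byte outer packets.
buffer_size = 1500 - 40   # 1460
tunnel_mtu = 1400 - 40    # 1360
for size in (1300, 1400, 1480):
    print(size, classify(size, tunnel_mtu, buffer_size))
# 1300 send unfragmented
# 1400 fragment at ingress, reassemble at egress
# 1480 PTB to source
```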

The market has lost the RFC 2473 functionality, which proposed a PTB proxy that reacts to the first PTB message from inside the tunnel.
But the market at least has a slow reaction: the first oversized packet is used to correct the MTU on the virtual interface, and the second one informs the traffic source.
My message in draft-vasilenko-v6ops-ipv6-oversized-analysis was: let’s return to the original functionality proposed by RFC 2473.
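
A minimal sketch of such a PTB proxy, reacting to the first PTB from inside the tunnel. `InnerPTB` and `proxy_ptb` are hypothetical names, and subtracting the tunnel header length from the reported MTU is an assumption about how the relayed value would be derived; check it against RFC 2473 before relying on it.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class InnerPTB:
    """PTB received by the tunnel ingress from a node inside the tunnel."""
    reported_mtu: int                # MTU field of that PTB (outer packet size)
    original_source: Optional[str]   # source of the encapsulated packet, if recoverable


def proxy_ptb(ptb: InnerPTB, tunnel_header_len: int) -> Optional[Tuple[str, int]]:
    """Return (destination, MTU) of the PTB to relay, or None if it cannot be relayed."""
    if ptb.original_source is None:
        # Not enough of the original packet was quoted to find its source.
        return None
    return ptb.original_source, ptb.reported_mtu - tunnel_header_len


print(proxy_ptb(InnerPTB(1400, "2001:db8::1"), 40))   # ('2001:db8::1', 1360)
```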

draft-ietf-intarea-tunnels moves in the opposite direction: it proposes to scrap PMTUD completely in environments with tunnels.

This whole story is a good example of vendors solving a problem very effectively.
It is probably because the problem was simple and vendors were trying to be transparent – not to introduce additional limitations visible to others.
The fact that the problem was not over-standardized and not regulated was an important enabler.
But the IETF intends to introduce additional limitations:

-          PMTUD would be permanently broken inside the tunnel

-          More traffic should be fragmented

-          It is not specified what to do about the many tunnels that do not support fragmentation and have no buffers at all. Under the new specification, it should be.

Eduard
From: Joseph Touch [mailto:touch@strayalpha.com]
Sent: Wednesday, March 24, 2021 12:10 AM
To: Vasilenko Eduard <vasilenko.eduard@huawei.com>
Cc: v6ops@ietf.org; int-area <int-area@ietf.org>
Subject: Re: Proxy function for PTB messages on the tunnel end

Hi, Eduard,


On Mar 23, 2021, at 1:52 PM, Vasilenko Eduard <vasilenko.eduard@huawei.com> wrote:

Hi Joseph,

Currently, vendors have chosen some undisclosed large value for the reassembly buffer on the tunnel interface,
or no buffer at all for tunnels that do not support reassembly.
That does not create any additional restriction for MTU.

Actually, it does.


Nobody believed (IPv4 or IPv6 – it does not matter) that buffer requirements for end nodes apply to transit nodes.

Your assumptions are not correct; this is widely understood by the vendors I have spoken with, some of whom helped write the specs on this issue I cited.


Taking the liberty of applying the requirements and terminology of one to the other is not a good idea.
draft-ietf-intarea-tunnels proposes to decrease the reassembly buffer to “typical host EMTU_R (1500B) minus tunnel outer headers overhead”, which would cause additional fragmentation.

I’m decreasing the EMTU_R of the tunnel egress by the encapsulation header - that’s just how the effective tunnel EMTU_R is computed. I don’t WANT it to start at 1500 - that’s the value we start with unless we have more information about a specific implementation (it’s the minimum required). If a vendor knows it is larger, then use that, of course.
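
The computation described here is a single subtraction. The sketch below assumes IPv6-in-IPv6 encapsulation with no extension headers (40 bytes of overhead); the function name `tunnel_emtu_r` and the 9000-byte value are illustrative.

```python
IPV6_MIN_EMTU_R = 1500   # reassembled size every IPv6 node must accept (RFC 8200)
IPV6_HEADER = 40         # fixed IPv6 header length in bytes


def tunnel_emtu_r(egress_emtu_r: int = IPV6_MIN_EMTU_R,
                  encap_overhead: int = IPV6_HEADER) -> int:
    """Largest inner packet the tunnel egress can reassemble and hand on."""
    return egress_emtu_r - encap_overhead


print(tunnel_emtu_r())       # 1460: nothing known beyond the required minimum
print(tunnel_emtu_r(9000))   # 8960: egress known to reassemble 9000 bytes
```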


As a compromise:
Could you change the default for “draft-ietf-intarea-tunnels Tunnel MTU” (reassembly buffer) to 9k? (to reflect reality)

Draft-tunnels does not set a reassembly buffer size. It *CITES* values defined by the encapsulation protocols discussed (IPv4, IPv6 in particular).

If you want 9K, you should use ATM or SONET tunnels (that already have that size) or submit a request to update IPv6 through RFC8200.

I would still not be happy, in mail to any alias, about describing the parameters of a transit node buffer in the terminology of an end node buffer.

Tunnels are what they are.


But if you did not create additional fragmentation, I would not have any complaints in my draft with regard to draft-ietf-intarea-tunnels.
It could be the resolution.

Draft-tunnels does not WANT to create additional fragmentation - it’s describing what already happens.

Now, I realize you want PTB inside the tunnel to be relayed outside the tunnel. I have already explained why that is incorrect. Beyond being simply wrong, it also would effectively break current tunnels because ICMPs are often blocked.

So instead of fragmenting, you’d end up with a black hole. You need to address that FIRST.


Well, it is probably not a good compromise, because this logic runs through your whole document. 9k (reality) would stick out of it.

Nothing in my doc prevents endpoints from using their ACTUAL EMTU_R. I am just saying we cannot assume anything larger than 1500 without knowing otherwise. We also currently have no protocol to discover this automatically. But that’s not saying we should not have such a protocol. We can and should. It would be part of the tunnel configuration.


The logic itself is good. It is broken because the most basic assumption is wrong (before you applied any logic).

The Data Plane on transit nodes should not behave like the Control Plane on transit nodes or the Transport Layer on end nodes!
It was the wrong assumption initially. Buffers should be different. Names should be different. Unification here is not possible.

You are simply and completely incorrect.

I don’t see a point in further trying to explain why. The approach in draft-tunnels works. The approach has been shown to support tunneling over 15 levels deep (tunnels in tunnels in tunnels…) - including using traceroute, etc.

Joe


Vendors would reject it anyway because reassembly is expensive; whoever increased it would be at a competitive disadvantage.
It is easy to translate additional reassembly into $$ losses.

Eduard
From: Joseph Touch [mailto:touch@strayalpha.com]
Sent: Tuesday, March 23, 2021 10:44 PM
To: Vasilenko Eduard <vasilenko.eduard@huawei.com>
Cc: v6ops@ietf.org; int-area <int-area@ietf.org>
Subject: Re: Proxy function for PTB messages on the tunnel end





On Mar 23, 2021, at 12:10 PM, Vasilenko Eduard <vasilenko.eduard@huawei.com> wrote:

Hi Joseph,
I am not much interested in discussing IPv4 now (although two MTUs for one interface are absent there too).
Let’s look at your reference to RFC 8200.

Section 4.5: unlike IPv4, fragmentation in IPv6 is performed only by source nodes, not by routers along a packet's delivery path
It means that all these discussions about fragmentation and reassembly are not related to transit nodes. It is for the “source and destination nodes”.

Agreed.



The better terminology is “transit node”, “destination node” – like it is in RFC 8200, not “host” or “router”.

Please see section 2 of RFC8200 (color added by me):

2.  Terminology <https://tools.ietf.org/html/rfc8200#section-2>

   node         a device that implements IPv6.

   router       a node that forwards IPv6 packets not explicitly
                addressed to itself.  (See Note below.)

   host         any node that is not a router.  (See Note below.)

The term “transit node” does not appear in RFC8200.

The terms “source node” and “destination node” are used in RFC8200 but not defined in Sec 2. They are clearly hosts that originate IPv6 packets and hosts that consume IPv6 packets, respectively.

In an IPv6 tunnel, the tunnel ingress emits new packets with IP headers it adds using its IP address. That makes it a source node. Same for how the egress consumes those packets.

From the perspective of the tunnel path, the ingress and egress are hosts and intermediate hop relays are routers.

From the perspective of the overall path, the tunnel is a link, either host/host, host/router, router/host, or router/router. A tunnel is not itself a router, however.


You see, nobody is asking vendors to comply with any reassembly buffer requirements in transit, because it was assumed that there would be no reassembly in transit.

Reassembly happens at tunnel egresses whether you want it to or not.



Hence, vendors had the freedom to choose a much bigger number than 1500 where reassembly did happen in reality (despite the IPv6 architecture decisions).

1500 is the IPv6 minimum EMTU_R; vendors can always implement larger reassembly when they choose to.



Please show any evidence (or just assert it, if you cannot disclose details) that any vendor has 1500B (or less) for reassembly in the data plane (on a transit node).

Here’s how to do it:
            - set interfaces to use 1280B packets
            - set up an IPv6 tunnel
            - send a 1280B packet through that tunnel

If you don’t implement reassembly, it won’t work. But it does. Everywhere.
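
The arithmetic behind this test can be sketched as follows, assuming plain IPv6-in-IPv6 encapsulation (40-byte outer header, 8-byte Fragment header, no other extension headers). The function name `outer_fragments` is illustrative. A 1280-byte inner packet becomes a 1320-byte outer packet, which cannot cross a 1280-byte link in one piece.

```python
from typing import List

IPV6_HEADER = 40   # fixed IPv6 header length in bytes
FRAG_HEADER = 8    # IPv6 Fragment extension header length in bytes


def outer_fragments(inner_len: int, link_mtu: int) -> List[int]:
    """Sizes of the outer packets the tunnel ingress emits for one inner packet."""
    outer_len = IPV6_HEADER + inner_len
    if outer_len <= link_mtu:
        return [outer_len]
    # Every fragment carries the outer header and a Fragment header;
    # all chunks except the last must be a multiple of 8 bytes.
    max_chunk = (link_mtu - IPV6_HEADER - FRAG_HEADER) // 8 * 8
    sizes = []
    remaining = inner_len
    while remaining > 0:
        chunk = min(max_chunk, remaining)
        sizes.append(IPV6_HEADER + FRAG_HEADER + chunk)
        remaining -= chunk
    return sizes


print(outer_fragments(1280, 1280))   # [1280, 96]
print(outer_fragments(1200, 1280))   # [1240]
```

The second fragment in the first case is what the egress has to reassemble before it can deliver the inner packet.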

I neither know nor care. That’s a compliance issue, not a standards issue.
It is not a compliance issue, because there is no regulation/standard to comply with. Vendors had the freedom and solved the problem easily.

RFC8200 is the standard. Tunnel ingresses and egresses create and consume packets, so they act as hosts. I don’t care if they’re implemented on routers; routers implement lots of things as hosts (see e.g., RFC4201, Sec 3.1:


   ...A compliant host
   implementation MUST support (a) and (c) and a compliant security
   gateway must support all three of these forms of connectivity, since
   under certain circumstances a security gateway acts as a host

This is described in detail in:
            RFC1858
            RFC4459
            RFC4944
            RFC6946
            RFC6980
            RFC7588
            RFC8021
            RFC8900
I did not ask for a general discussion. Of course, fragmentation is a big topic with many publications.

You asked for *specific examples* of what vendors do. Those RFCs provide them.



I asked for any evidence that there are two MTUs per virtual interface and a resulting fragmentation problem (when a packet falls between these two MTUs).

I don’t see why you’re stuck on this issue.
Because you are trying to introduce additional fragmentation into an area where it was absent before. The root cause is the introduction of a second MTU per interface (which is in reality the buffer size).

I have not introduced anything; I am describing an existing requirement of any device that consumes IP packets (i.e., acts as an IP destination). When it does so, it is a host. Tunnel egresses do that.



Two MTUs for one interface is an innovation. It does not exist in any standard or any real implementation. It is invented only in draft-ietf-intarea-tunnels.
It is not just new names and a new classification; it is a new thing that does not exist in the real world. It is harmful because of the additional fragmentation it introduces.

Draft-tunnels has been discussed and reviewed by int-area for over a decade. Nobody else has agreed with your assertions.

Joe