Re: [v6ops] Proxy function for PTB messages on the tunnel end

Vasilenko Eduard <vasilenko.eduard@huawei.com> Mon, 22 March 2021 20:33 UTC

From: Vasilenko Eduard <vasilenko.eduard@huawei.com>
To: Joseph Touch <touch@strayalpha.com>
CC: "v6ops@ietf.org" <v6ops@ietf.org>, int-area <int-area@ietf.org>
Thread-Topic: Proxy function for PTB messages on the tunnel end
Thread-Index: AdcfDpZejD7P5RAGQ06oVS2C5lk8jAACE+sAAAi5WeD//90TAP//s51QgAByGQD//8LNQA==
Date: Mon, 22 Mar 2021 20:33:11 +0000
Message-ID: <22bb7bf129694ccfbbad441d8d22e05c@huawei.com>
References: <0b61deabe8f3420eba1b5794b024e914@huawei.com> <A063E98C-0D6C-49B2-B871-E2B39A097FD5@strayalpha.com> <37059faadd6e441cb98f6ec7e01ecef9@huawei.com> <9D23C833-46C5-4B93-A204-D2D4F54689DF@strayalpha.com> <1e6ecd3b468d4255bda65d519190135d@huawei.com> <3B48413C-A47D-4F3F-B9E4-7ED4D33AA66B@strayalpha.com>
In-Reply-To: <3B48413C-A47D-4F3F-B9E4-7ED4D33AA66B@strayalpha.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/v6ops/WHRe8J2kfb0-zQSqqmKzeVZ7Aqs>
Subject: Re: [v6ops] Proxy function for PTB messages on the tunnel end

Hi Joseph,
You insist that a second MTU restriction exists in all data plane implementations, only it is not visible and not discussed, neither in any RFC nor in any vendor documentation.
But then the effect should be the same: fragmentation of packets whose size falls between the two MTUs.
I am sure that this is not the case.
A PTB would decrease the only MTU that the tunnel has.
The next oversized packet would then trigger a PTB message in the direction of the host.

The buffer is not relevant to the discussion because data plane firmware simply does not have a second MTU. This restriction does not exist in real life on planet Earth.
Eduard
From: Joseph Touch [mailto:touch@strayalpha.com]
Sent: Monday, March 22, 2021 10:49 PM
To: Vasilenko Eduard <vasilenko.eduard@huawei.com>
Cc: v6ops@ietf.org; int-area <int-area@ietf.org>
Subject: Re: Proxy function for PTB messages on the tunnel end




On Mar 22, 2021, at 12:32 PM, Vasilenko Eduard <vasilenko.eduard@huawei.com> wrote:

Hi Joseph,
I believe that the second MTU parameter for the tunnel interface was introduced only by draft-ietf-intarea-tunnels. It did not exist before.

Tunnels have always had both the MTU that isn’t defragmented between ingress and egress and an EMTU_R at the egress.

In some protocols (e.g., that have no ingress source fragmentation), the two are the same. But they’ve always been there.


A router has to deal with EMTU_R only in the control plane (where Linux opens TCP connections).

Hosts deal with EMTU_R; routers do not (unless acting as a host, i.e., as an IP source/sink).


I did not search specifically, but I have read tunneling specifications before: the data plane operates with only one tunnel MTU parameter. That has been enough for the last 30 years.

IP tunes only the path MTU; it relies on the transport protocols to tune EMTU_R (and not all transports do).

That doesn’t mean EMTU_R didn’t exist or isn’t relevant. It’s just *ignored*.

Please show me any tunneling specification (GRE? VxLAN? L2TPv3? whatever) that discusses 2 MTUs.

Most don’t - again, that’s an error, not a feature.

Most tunneling systems incorrectly confuse the path MTU and EMTU_R of the tunnel as either equal (which they sometimes are) or ignore the latter.


Please show me the URL of the documentation of any vendor that has 2 MTU parameters for the tunnel interface. It would be very interesting to see how they explain the need for the second one.

The point of draft-tunnels is to point out this flaw. Vendors get this wrong, which is WHY TUNNELS BREAK.

I can imagine the second MTU only as the buffer size in a particular vendor implementation (probably for the fragmentation case).

For IPv6 tunnels, it is *defined in the IP spec* as of RFC2460 as 1500B, as distinct from the path MTU of 1280B.


Are you aware of any situation where a vendor managed its buffers so poorly that we need to intervene?

The vendors incorrectly relay PTBs and things break; that’s the situation we all live with today. That’s why you’re trying to make MTUs bigger and I’m trying to get tunnels implemented correctly.


It is probably not the reason anyway, if others did manage it properly. They have to keep the buffer much bigger in any case, because reordering and packet jitter may require a lot of memory.

Having a bigger buffer will do NO good unless the source knows about it *and uses it*.


What should be done if one cannot push the traffic source to decrease its MTU because it is already 1280 (the minimum)?
RFC 2473 said in 1998 that this is the only valid reason to fragment. Many other RFCs do not care about this corner case at all (there are still not many complaints, because 220B is available for all additional headers).
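
The 220B figure is the gap between a 1500B path and the 1280B IPv6 minimum. A minimal Python sketch of that arithmetic follows; the 1500B default for the underlying path and the function names are assumptions made for illustration, not something stated in the thread.

```python
IPV6_MIN_MTU = 1280   # smallest MTU an IPv6 source can be asked to use
ASSUMED_PATH_MTU = 1500  # assumption: Ethernet-sized underlying path


def encapsulation_headroom(path_mtu: int = ASSUMED_PATH_MTU,
                           inner_size: int = IPV6_MIN_MTU) -> int:
    """Bytes left for added headers when carrying an inner packet of inner_size."""
    return path_mtu - inner_size


def fits_without_fragmentation(overhead: int,
                               path_mtu: int = ASSUMED_PATH_MTU,
                               inner_size: int = IPV6_MIN_MTU) -> bool:
    """True if the added headers fit in the headroom, so no fragmentation is needed."""
    return overhead <= encapsulation_headroom(path_mtu, inner_size)
```

With the defaults, a single 40B IPv6-in-IPv6 header fits; over a 1280B path it does not, which is the case the reply below argues about.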

It’s not a corner case; it’s *exactly* why most implementations don’t use 1500B packets either; they drop down to 1400 or so.


I believe it is better to return to RFC 2473 for the case when the packet is already 1280 and cannot be smaller.

RFC2473 would fail to support IPv6 over IPv6 if it relays PTBs from inside the tunnel, as it is required to do.

Joe




Eduard
From: Joseph Touch [mailto:touch@strayalpha.com]
Sent: Monday, March 22, 2021 8:34 PM
To: Vasilenko Eduard <vasilenko.eduard@huawei.com>
Cc: v6ops@ietf.org; int-area <int-area@ietf.org>
Subject: Re: Proxy function for PTB messages on the tunnel end

Eduard,



On Mar 22, 2021, at 9:54 AM, Vasilenko Eduard <vasilenko.eduard@huawei.com> wrote:

Hi Joseph,

I believe that draft-ietf-intarea-tunnels is academic purism, for a reason that I do not understand.

It is an attempt to try to explain a complex topic that many have confused using different meanings for the same term. It evolved over nearly a decade to the terms inside and was discussed *extensively* in int-area.

The many new terms are a real obstacle: I have spent more time understanding draft-ietf-intarea-tunnels than any other draft. It looks like over-engineering to me.

I encourage you to examine the extensive discussion in int-area in the mail archives.



Moreover, I believe that the paragraph about draft-ietf-intarea-tunnels in my draft is very difficult to understand. I have not found a way to make it simple, because I need to somehow reflect the tremendous complexity of draft-ietf-intarea-tunnels. Maybe I will find a shorter explanation later; I am not happy with this paragraph.


You are not just explaining something to somebody. You are trying to change something. That should have a reason, some motivation.
What benefit would anybody get if every virtual link had 2 MTUs? Especially if the larger one is not managed automatically.

I didn’t create that situation - it already exists.

The issue is that ICMP tells you when a path MTU changes, but there is no ICMP that tells you about EMTU_R. Nor is there an ICMP that says “hey, you CAN send 1500 bytes but if you send smaller it’d be better”. ICMP PTB (in IPv6, and the corresponding one for IPv4) tells you when a packet CANNOT cross a path.

A tunnel that must source-fragment to support required IPv6 MTUs (e.g.,IPv6-in-IPv6 over a 1280B path) CAN send packets up to 1500B across a tunnel. It is an error for that tunnel to relay a PTB message that says “cannot support 1280B” just because it has to be fragmented.

You are confusing PTB with “packet bigger than I’d like, but will still get there”.
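
The distinction drawn above (a packet that must be fragmented versus a packet that cannot get through at all) can be sketched as an ingress decision. This is a hedged illustration only: the function and parameter names are hypothetical, and it assumes the reassembled outer packet, encapsulation header included, must fit within the egress EMTU_R.

```python
from enum import Enum


class Action(Enum):
    FORWARD = "forward unfragmented"
    FRAGMENT = "source-fragment at ingress"
    PTB = "send PTB to the source"


def ingress_action(inner_size, overhead, tunnel_path_mtu, egress_emtu_r):
    """Return (action, mtu_to_report) for one inner packet arriving at the ingress.

    Assumption: outer size = inner size + encapsulation overhead, and the
    egress can reassemble outer packets up to egress_emtu_r bytes.
    """
    outer_size = inner_size + overhead
    if outer_size <= tunnel_path_mtu:
        return Action.FORWARD, None
    if outer_size <= egress_emtu_r:
        # Bigger than the path likes, but it will still get there.
        return Action.FRAGMENT, None
    # Only here is the packet genuinely too big for the tunnel as a link.
    return Action.PTB, egress_emtu_r - overhead
```

For IPv6-in-IPv6 (40B overhead) over a 1280B path with a 1500B EMTU_R, a 1280B inner packet is fragmented and delivered rather than answered with a PTB.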

We have no ICMP message for that. And creating new ICMP messages is a waste of time, given how they’re already filtered extensively.

So what is your solution? I agree that making MTUs bigger would help, but they never get around this problem.

Joe




Eduard
From: Joseph Touch [mailto:touch@strayalpha.com]
Sent: Monday, March 22, 2021 6:29 PM
To: Vasilenko Eduard <vasilenko.eduard@huawei.com>
Cc: v6ops@ietf.org; int-area <int-area@ietf.org>
Subject: Re: Proxy function for PTB messages on the tunnel end

Hi, Eduard,

On Mar 22, 2021, at 4:28 AM, Vasilenko Eduard <vasilenko.eduard@huawei.com> wrote:

Hi Joseph,
I should probably explain why I like the original RFC 2473: it requires PTB “proxy” functionality at the tunnel ends (it is not called “proxy” in RFC 2473; it just describes how to re-create a PTB from a PTB at the tunnel end). This initial architectural decision of IPv6 (to inform the real traffic source) is the foundation that PMTU discovery relies on. It is better to fix it than to invent other patches for MTU discovery.

You should review the discussions on why PMTUD does not work in the Internet. That’s why we have PLPMTUD.

So I’m not clear that fixing PMTUD is worth any effort at all.

However...




Longer explanation:
If a PTB message is generated on the tunnel path, it informs the tunnel end of the oversized packet, and the tunnel MTU can be decreased.

Path MTU ICMP errors indicate when a packet CANNOT TRAVERSE A LINK.

As a *link*, a tunnel’s MTU is its EMTU_R. It cannot be its EMTU_S or path MTU. If it were, then there could never be IPv6-in-IPv6 because no IPv6-in-IPv6 tunnel can relay internal segments of 1280B without ingress source fragmentation.

Is that what you want?




This is already a solution, because the next oversized packet from the source would get a PTB response from the tunnel end itself: the source would get a PTB after the second oversized packet.
But it would be better to inform the real traffic source after the first oversized packet. Hence, a PTB proxy at the tunnel end is better.
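
The difference between the two behaviours can be counted in a small simulation. This is a sketch under simplifying assumptions that are not from the thread: no ICMP filtering, no fragmentation at the ingress, a single constrained link inside the tunnel, and hypothetical names throughout.

```python
def packets_lost_before_delivery(packet_size, overhead, tunnel_mtu,
                                 inner_link_mtu, proxy_ptb):
    """Count oversized packets dropped before the source gets one through.

    tunnel_mtu     -- MTU the ingress currently advertises for the tunnel
    inner_link_mtu -- MTU of the constrained link inside the tunnel
    proxy_ptb      -- True if the ingress re-creates a PTB towards the source
    """
    source_mtu = packet_size
    lost = 0
    while True:
        if source_mtu > tunnel_mtu:
            # Ingress rejects the packet itself and sends a PTB to the source.
            source_mtu = tunnel_mtu
            lost += 1
            continue
        if source_mtu + overhead > inner_link_mtu:
            # Dropped inside the tunnel; the PTB reaches the ingress only.
            tunnel_mtu = inner_link_mtu - overhead
            lost += 1
            if proxy_ptb:
                # Ingress relays the information to the real traffic source.
                source_mtu = tunnel_mtu
            continue
        return lost
```

With a 1500B packet, 40B overhead and a 1400B link inside the tunnel, the source loses two packets without the proxy and one with it.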

I am not ready to discuss all the corrections that draft-ietf-intarea-tunnels section 5.2 makes to RFC 2473; they are not related to PMTUD.
The exception is one that would lead to massive fragmentation, which I will discuss in the next email.
Eduard
From: Joseph Touch [mailto:touch@strayalpha.com]
Sent: Sunday, March 21, 2021 7:23 PM
To: Vasilenko Eduard <vasilenko.eduard@huawei.com>
Cc: v6ops@ietf.org; int-area <int-area@ietf.org>
Subject: Re: [v6ops] draft-vasilenko-v6ops-ipv6-oversized-analysis-00

Hi, all,

Spoiler alert if you don’t want to read the whole post:
            - draft-vasilenko makes erroneous claims as to the content in draft-tunnels
            - draft-tunnels and draft-vasilenko are consistent (once the latter is corrected) in their mutual conclusions
                        - draft-tunnels on the need for fragmentation over finite MTU paths
                        - draft-vasilenko in encouraging increases in those finite MTUs

Joe

---

First, draft-ietf-intarea-tunnels is discussed on the int-area list; after review of the information below, if you still believe there are issues to be addressed in that doc, you should post them there.

The technical errors in RFC2473 have been indicated in that document since draft-ietf-intarea-tunnels-01, posted in July 2015. They remain accurate, IMO.

Note that I ceased performing in-place updates of that document because of *lack of active discussion* and because in-place updates are a waste of my time.

I am glad to see someone in IPv6 interested now, and would be glad to update my draft as needed.

FWIW, having read your doc, here are its errors in misstating the content of my draft:

- your doc mistakenly assumes that mine requires IPv6 hosts to send 1500B packets if they can, even if tunnels are on the path
            as with any IPv6 path, the source should send fragments no larger than the entire path can transit, whose reassembled size is no larger than the receiver can reassemble
            those original fragments are what enter the on-path tunnels, so they should be no larger than the tunnel egress can fragment
            and those original fragments would be encapsulated and then source fragmented by the tunnel according to the same (recursive) policy

- nothing in draft-tunnels assumes ICMP PTB cannot adjust these sizes or that the tunnel cannot use PLPMTUD
            see sec 4.3.1 of v10

- draft-tunnels does not “introduce” a new variable called tunnel MTU; I introduced the terminology, but the concept is as old as tunnels
            I coined that term to refer to the MTU across the tunnel with reassembly at egress (which already exists), as different from the MTU between ingress and egress (which I call tunnel MAP)
            sec 4.2.3 of v10 doesn’t claim this value cannot be set; it explains that PMTUD has no role in discovering its value:

          Note, however, that PMTUD never discovers
          EMTU_R that is larger than the required minimum; that information is
          available to some upper layer protocols, such as TCP [RFC1122], but
          cannot be determined at the IP layer.

            I never said it cannot be discovered
                        it should be (e.g., by a tunnel configuration protocol)
                        note that there are no current protocols that do this, even without tunnels (i.e., discover larger EMTU_R)
                        I can add that point as clarification

- draft-tunnels does not increase IPv6 fragmentation
            please indicate why you believe it would (notably here "a considerable increase in fragmentation is proposed for the reasons of academic purity”)

- draft-tunnels does not claim fragmentation is the only solution to oversize packets
            it addresses how and where to handle tunnels in the presence of packet limits, of which path MTU is only one

- ICMP PTB is not a solution out to the origin source
            that would potentially drop the IPv6 path MTU below 1280, given enough tunnel overhead (or layers thereof), a violation of IPv6
            so yes, in that case, the ONLY solution that preserves IPv6 in the presence of tunnels with that much overhead would be ingress source fragmentation

- sec 3.3 of my doc DOES allow ICMPs to be relayed back to the source
            it merely states that they should be generated when a packet too large to ingress arrives,
            NOT when an internal tunnel ICMP is received by the ingress

            the point is that the origin source sees the ingress as a router on the path,
            so it should get ICMPs from that router only when packets arrive at that router, not when its tunnel fails downstream

            this makes ICMP relay *easier* and more reliable to implement; the ingress gets tunnel ICMPs to learn the tunnel’s effective link MTU,
            then uses that link MTU to send ICMPs back

            yes, this is to allow the tunnel to act as the link *that it is*, but it does not prohibit ICMP info from flowing back to the source

And finally:

- nobody is claiming we shouldn’t increase link MTU
            draft-tunnels would still be relevant, no matter how large the MTU is, for the reasons I state in that doc

One other observation:

- your statistics for fragment drops apply only when the fragment is visible to the IP layer
            there are intermediate layers that hide fragmentation for exactly this reason, e.g., UDP tunnels, GRE, etc.
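
The point above that relaying PTB out to the origin source could push the IPv6 path MTU below 1280 can be checked with simple arithmetic. The following is a minimal sketch; the function names and the per-layer overhead values in the usage note are assumptions chosen for illustration.

```python
IPV6_MIN_MTU = 1280  # an IPv6 source cannot be told to go below this


def inner_mtu_without_fragmentation(path_mtu, overheads):
    """Largest inner packet that crosses all tunnel layers unfragmented."""
    return path_mtu - sum(overheads)


def ptb_can_resolve(path_mtu, overheads):
    """True if a PTB to the origin source is enough, i.e. the resulting MTU
    stays at or above the IPv6 minimum; otherwise the ingress has to
    source-fragment to keep IPv6 working across the tunnel."""
    return inner_mtu_without_fragmentation(path_mtu, overheads) >= IPV6_MIN_MTU
```

One 40B layer over a 1500B path leaves 1460B, so a PTB suffices; six such layers leave 1260B, and any overhead at all over a 1280B path leaves less than 1280B.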

---





On Mar 21, 2021, at 1:59 AM, Vasilenko Eduard <vasilenko.eduard@huawei.com> wrote:

Dear Experts,
I have seen a lot of recent activity in the IETF related to MTU problems. Maybe not as much as in some other areas, but activity nonetheless. Many of the other active drafts are evaluated in this draft.
I had an idea of the right way to solve problems in this area, but the research showed that the foundations were already discussed in RFC 2473 (Dec 1998). People have just forgotten about it.
We discussed it with the co-authors and decided that it makes sense to publish the research, because it looks at the problem systematically.

One thing in this research is alarming: draft-ietf-intarea-tunnels is pushing for much more fragmentation for purely academic reasons. This draft is already referenced by many other documents.
I believe that not many people have spent enough time understanding its complexity to see the consequence: the majority of IPv6 traffic would be fragmented if it followed draft-ietf-intarea-tunnels.

Thanks to everybody who takes the time to comment.
Eduard
-----Original Message-----
From: internet-drafts@ietf.org [mailto:internet-drafts@ietf.org]
Sent: Friday, March 19, 2021 11:07 PM
To: Dmitriy Khaustov <dmitriy.khaustov@rt.ru>; Vasilenko Eduard <vasilenko.eduard@huawei.com>; Vasilenko Eduard <vasilenko.eduard@huawei.com>; Xipengxiao <xipengxiao@huawei.com>; Xipengxiao <xipengxiao@huawei.com>
Subject: New Version Notification for draft-vasilenko-v6ops-ipv6-oversized-analysis-00.txt


A new version of I-D, draft-vasilenko-v6ops-ipv6-oversized-analysis-00.txt
has been successfully submitted by Eduard Vasilenko and posted to the IETF repository.

Name:             draft-vasilenko-v6ops-ipv6-oversized-analysis
Revision:         00
Title:               IPv6 Oversized Packets Analysis
Document date:          2021-03-19
Group:             Individual Submission
Pages:              19
URL:            https://www.ietf.org/archive/id/draft-vasilenko-v6ops-ipv6-oversized-analysis-00.txt
Status:         https://datatracker.ietf.org/doc/draft-vasilenko-v6ops-ipv6-oversized-analysis/
Htmlized:       https://datatracker.ietf.org/doc/html/draft-vasilenko-v6ops-ipv6-oversized-analysis
Htmlized:       https://tools.ietf.org/html/draft-vasilenko-v6ops-ipv6-oversized-analysis-00


Abstract:
  The IETF has many new initiatives relying on IPv6 Enhanced Headers
  added in transit: SRv6, SFC, BIERv6, iOAM. Additionally, some recent
  developments are overlays (SRv6, VxLAN) over IPv6. It could create
  oversized packets that need to be dealt with. This document analyzes
  available standards for the resolution of oversized packet drops.




Please note that it may take a couple of minutes from the time of submission until the htmlized version and diff are available at tools.ietf.org.

The IETF Secretariat


_______________________________________________
v6ops mailing list
v6ops@ietf.org
https://www.ietf.org/mailman/listinfo/v6ops