Re: [mpls] [Pals] Mail regarding draft-zzhang-tsvwg-generic-transport-functions

Stewart Bryant <stewart.bryant@gmail.com> Wed, 11 November 2020 12:59 UTC

From: Stewart Bryant <stewart.bryant@gmail.com>
Date: Wed, 11 Nov 2020 12:59:18 +0000
In-Reply-To: <MN2PR05MB5981B357959D01F00A001F84D4E90@MN2PR05MB5981.namprd05.prod.outlook.com>
Cc: Stewart Bryant <stewart.bryant@gmail.com>, "Andrew G. Malis" <agmalis@gmail.com>, "draft-zzhang-tsvwg-generic-transport-functions@ietf.org" <draft-zzhang-tsvwg-generic-transport-functions@ietf.org>, mpls <mpls@ietf.org>, "pals@ietf.org" <pals@ietf.org>
To: "Jeffrey (Zhaohui) Zhang" <zzhang@juniper.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/mpls/tdxk1g0SzB8xcFi6NMpn8fYblj0>


First some comments on the draft itself.

SB> The draft needs to discuss RFC4623 and explain when this draft would be used in place of RFC4623.

   Some functionalities (e.g. fragmentation/reassembly and Encapsulating
   Security Payload) provided by IPv6 can be viewed as independent of
   IPv6 or even IP entirely.  
SB> Specifically, IPv6 does not support fragmentation in the network, because it provides PMTUD and devolves the problem to the host.
This document proposes to provide those
   functionalities at different layers (e.g., MPLS, BIER or even
   Ethernet) independent of IP.

SB> I think we need to understand why after all these years this is now needed.
SB> Although we have this in PW, I am not sure it is widely used.

1.  Introduction

   Consider an operator providing Ethernet services such as pseudowires,
SB> It is already provided for PWs. Now you can build a case for why you need something better, but you must build the case.

   VPLS or EVPN.  The Ethernet frames that a Provider Edge (PE) device
   receives from a Customer Edge (CE) device may have a larger size than
   the PE-PE path MTU (pMTU) in the provider network.  
SB> I think you need to build a more complete case than the following, because a correctly implemented IPv6 will not send a frame that is too large: PMTUD and host fragmentation or transport fragmentation will prevent that. IPv4 supports fragmentation.
SB> So that raises the question: what are these packets?

   This could be
   because

   1.  the provider network is built upon virtual connections (e.g.
       pseudowires) provided by another infrastructure provider, or

   2.  the customer network uses jumbo frames while the provider network
       does not, or

   3.  the provider-side overhead for transporting customers packets
       across the network pushes past the pMTU.

   In any case, the provider simply cannot require its customers to
   change their MTU.
SB> Don’t they change it automatically for most protocols?

   To get those large frames across the provider network, currently the
   only workaround is to encapsulate the frames in IP (with or without
   GRE) and then fragment the IP packets.  
SB> Again I think we need a bigger discussion of what those packets are.

   Even if MPLS is used for
   service delimiting, IP is used for transportation (MPLS over IP/GRE).
   This may not be desirable in certain deployment scenarios, where MPLS
   is the preferred transport or IP encapsulation overhead is deemed
   excessive.

SB> Again, I think we need more detail to justify this.


Zhang, et al.              Expires May 5, 2021                  [Page 2]

Internet-Draft         Generic Transport Functions         November 2020


   IPv6 fragmentation and reassembly are based on the IPv6 Fragmentation
   header below [RFC8200]:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Next Header  |   Reserved    |      Fragment Offset    |Res|M|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Identification                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                    Figure 1: IPv6 Fragmentation Header
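
For readers who want the bit layout of Figure 1 spelled out, here is a minimal Python sketch that packs and unpacks the 8-octet header. The function names are illustrative only and come from neither RFC 8200 nor the draft. It assumes the Fragment Offset is expressed in 8-octet units, as in RFC 8200.

```python
import struct


def pack_ipv6_frag_header(next_header, frag_offset_units, more, ident):
    """Pack the 8-octet Fragment header of Figure 1.

    frag_offset_units is the 13-bit offset in 8-octet units;
    ident is the 32-bit Identification value.
    """
    if not 0 <= frag_offset_units < (1 << 13):
        raise ValueError("Fragment Offset must fit in 13 bits")
    # 13 bits of offset, 2 reserved bits (sent as zero), 1 M flag bit.
    offset_flags = (frag_offset_units << 3) | (1 if more else 0)
    # Second octet is the Reserved field, sent as zero.
    return struct.pack("!BBHI", next_header, 0, offset_flags, ident)


def unpack_ipv6_frag_header(data):
    """Return (next_header, frag_offset_units, more, ident)."""
    next_header, _reserved, offset_flags, ident = struct.unpack("!BBHI", data[:8])
    return next_header, offset_flags >> 3, bool(offset_flags & 1), ident
```

Nothing in the packing itself refers to IPv6 addresses; the (source, destination) context is only needed when the Identification value is interpreted at reassembly.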

   This document proposes reusing this header in non-IP contexts, since
   the fragmentation/reassembly function is actually independent of IPv6
   except the following aspects:

   o  The fragment header is identified as such by the "previous"
      header.

   o  The "Next Header" value is from the "Internet Protocol Numbers"
      registry.

   o  The "Identification" value is unique in the (source, destination)
      context provided by the IPv6 header
SB> Which of course MPLS does not provide.

   The "Identification" field, in conjunction with the IPv6 source and
   destination identifies fragments of the original packet, for the
   purpose of reassembly.

   Therefore, the fragmentation/reassembly function can be applied at
   other layers as long as a) the fragment header is identified as such;
   and b) the context for packet identification is provided.  Examples
   of such layers include MPLS, BIER, and Ethernet (if IEEE determines
   it is so desired).
SB> Presumably we will liaise this to the IEEE?

   For the layers where the IETF is concerned, the "Next Header" value
   will still be from the "Internet Protocol Numbers" registry when the
   function is applied at non-IP layers.
SB> IPv6 has next headers because Frag is an option and that is the way IPv6 works. It is unclear why a transport network fragmentation method as described here would need it.

   For the same consideration, the IP Encapsulating Security Payload
   (ESP) [RFC4303] could also be applied at other layers if ESP is
   desired there.  For example, if for whatever reason the Ethernet
   service provider wants to provide ESP between its PEs, it could do so
   without requiring IP encapsulation if ESP is applied at non-IP
   layers.
SB> I think we should get to the mechanics before discussing the options.

   The possibility of applying some other IP functions (e.g.
   Authentication Header [RFC4302]) is for further study.






2.  Specifications

2.1.  Generic Fragmentation Header

   For generic fragmentation/reassembly functionality independent of IP,
   the following Generic Fragmentation Header (GFH) is defined:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Next Header  | Header Length |      Fragment Offset    |R|S|M|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Identification                        |
   |                           (variable)                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                  Figure 2: Generic Fragmentation Header

   The "Next Header", "Fragment Offset" and "M" flag bit fields are as
   in the IPv6 Fragmentation Header.
SB> I am really not sure why a transport network frag needs to worry about NH. It should not care about what it is carrying. Its job is simply to glue the packet back together and let something that understands the glued-together packet interpret it.
SB> In any case NH could, at least in theory, alias IP and cause ECMP problems unless this sits behind a CW, which I don’t think we have discussed yet.

   Header Length:  the number of octets of the entire header.

   R: The "R" flag bit is reserved.  It MUST be 0 on transmitting and
      ignored on receiving.

   Identification:  at least 4-octet long.
SB> Four seems dangerously short given that IPv4 uses Eight. Also we need some discussion about how the ID is generated such that a misdelivered packet cannot get fragmented.
SB> BTW since later in your discussion you talk about multi-path delivery, you need to discuss reassembly timeout. That is not a problem with a PW since it is single path.

   S: If the "S" flag bit is clear, the context for the Identification
      field is provided by the outer header, and only the source-
      identifying information in the outer header is used.  If the "S"
      flag bit is set, the variable Identification field encodes both
      source-identifying information (e.g. the IP address of the node
      adding the GFH) and an identification number unique within that
      source.
SB> We need to be clearer that the ID must be unique per flow for the lifetime of any packet from that flow in the network, including “stuck” packets.
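
To make the field definitions above concrete, here is a minimal Python sketch of a GFH encoder and decoder, read directly off Figure 2. It is not taken from the draft: the function names are hypothetical, and it assumes that the Fragment Offset is in 8-octet units (as in IPv6) and that Header Length counts the 4 fixed octets plus the variable Identification field.

```python
import struct


def pack_gfh(next_header, frag_offset_units, more, ident, s_flag=False):
    """Pack a GFH: 4 fixed octets followed by the Identification bytes."""
    if len(ident) < 4:
        raise ValueError("Identification must be at least 4 octets")
    header_len = 4 + len(ident)  # octets of the entire header
    if header_len > 255:
        raise ValueError("Header Length must fit in one octet")
    if not 0 <= frag_offset_units < (1 << 13):
        raise ValueError("Fragment Offset must fit in 13 bits")
    # Flag bits in wire order R|S|M; R is always sent as 0.
    flags = (int(s_flag) << 1) | int(more)
    word = (frag_offset_units << 3) | flags
    return struct.pack("!BBH", next_header, header_len, word) + bytes(ident)


def unpack_gfh(data):
    """Parse a GFH and return its fields as a dict; the R bit is ignored."""
    next_header, header_len, word = struct.unpack("!BBH", data[:4])
    if header_len < 8 or len(data) < header_len:
        raise ValueError("truncated or malformed GFH")
    return {
        "next_header": next_header,
        "header_len": header_len,
        "frag_offset_units": word >> 3,
        "s": bool(word & 0b010),
        "m": bool(word & 0b001),
        "ident": bytes(data[4:header_len]),
    }
```

With the "S" flag clear, the decoder's caller still has to supply the source context from the outer header; with it set, the source is whatever the sender encoded inside the Identification bytes.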

   The outer header MUST identify that a Generic Fragmentation Header
   follows and MAY carry source-identifying information.

   If the outer header is BIER, a TBD value for the "proto" field in the
   BIER header identifies that a GFH follows.  If the "S" flag bit is
   clear, the "BFIR-id" field in the BIER header provides the context
   for the "Identification" field.
SB> It is not yet clear why we need the protocol type at the MPLS layer.

   If the outer header is MPLS, the "S" flag bit MAY be clear if the
   label preceding the GFH identifies the sending BFR in addition to
   indicating that a GFH follows (see Section 2.2).


> On 10 Nov 2020, at 23:13, Jeffrey (Zhaohui) Zhang <zzhang@juniper.net> wrote:
> 
> Hi Andy, Stewart,
>  
> If I understand it correctly, RFC4623 is specifically for PWs (p2p) and cannot be used for EVPN/VPLS.

It was certainly intended for use in P2P PWs; I am not sure anyone has thought about its application to EVPN/VPLS.


>  
> The reason is that the sequence number in the control word is specific to the PW and the fragmentation/reassembly is performed in the context of the PW.

That is correct.

> In case of EVPN/VPLS, an egress PE could receive fragments from different ingress PEs and reassembly must be done in the right context. 

We need to be quite clear here: you mean that it could be concurrently reassembling fragments from different ingress PEs to the same egress PE on the same EVPN label. Yes, that is a problem that needs to be addressed, and there are a number of ways that it could be addressed. We need to think about whether the FO (fragment offset) method is better than the BE (begin/end bits) method.
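
To illustrate the context problem, here is a minimal Python sketch of an egress reassembler that keeps one partial packet per (ingress source, Identification) pair, so that fragments from two ingress PEs using the same Identification value do not mix. The class and its parameters are hypothetical, it assumes offsets in 8-octet units, and it deliberately ignores timeouts and overlapping fragments.

```python
class Reassembler:
    """Reassemble fragments, keyed by (source, identification)."""

    def __init__(self):
        # (source, ident) -> {"frags": {byte_offset: payload}, "total": int or None}
        self._partial = {}

    def add(self, source, ident, offset_units, more, payload):
        """Add one fragment; return the whole packet once it is complete."""
        key = (source, ident)
        state = self._partial.setdefault(key, {"frags": {}, "total": None})
        start = offset_units * 8
        state["frags"][start] = payload
        if not more:
            # The last fragment fixes the total length of the packet.
            state["total"] = start + len(payload)
        if state["total"] is None:
            return None
        # Walk the fragments in offset order and check that they are contiguous.
        pos = 0
        parts = []
        for off in sorted(state["frags"]):
            if off != pos:
                return None  # a gap (or an overlap) remains
            parts.append(state["frags"][off])
            pos += len(state["frags"][off])
        if pos != state["total"]:
            return None
        del self._partial[key]
        return b"".join(parts)
```

In a P2P PW the key collapses to the PW itself, which is why RFC4623 needs no explicit source; with EVPN/VPLS fan-in the source has to come from somewhere, whether the outer header, an SFL, or the header itself.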

Now the question is whether we start from scratch or provide that identity to RFC4623, and how we provide that identity. One method is SFL; another is to provide identity of the type that you describe as an extended CW. Whatever happens, you are going to need a CW to ensure that you defeat ECMP inspection of the payload in legacy routers.

>  
> In addition, RFC 7432 (EVPN) specifically calls out that control word is either not used or only with all-0:
>  
>    - If a network uses deep packet inspection for its ECMP, then the
>      "Preferred PW MPLS Control Word" [RFC4385] SHOULD be used with the
>      value 0 (e.g., a 4-octet field with a value of zero) when sending
>      EVPN-encapsulated packets over an MP2P LSP.

So it has a CW.

What method does it use to do OAM, presumably GAL at BOS?

>  
>    - If a network uses entropy labels [RFC6790], then the control word
>      SHOULD NOT be used when sending EVPN-encapsulated packets over an
>      MP2P LSP.

That will only work if you have no legacy nodes on the path. A legacy node is entitled to ignore the EL and do DPI ECMP.
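
The aliasing concern can be shown with a small Python sketch. It assumes a legacy node that guesses the payload type from the first nibble after the label stack (4 for IPv4, 6 for IPv6), and a GFH that starts immediately after the label stack with its Next Header octet, as drawn in Figure 2. The function names are illustrative only.

```python
def looks_like_ip(payload: bytes) -> bool:
    """True if a payload-inspecting node could take these bytes for IPv4 or IPv6."""
    if not payload:
        return False
    # The IP version field is the high nibble of the first octet.
    return payload[0] >> 4 in (4, 6)


def aliasing_next_header_values():
    """Next Header values whose high nibble would be read as an IP version."""
    return [nh for nh in range(256) if looks_like_ip(bytes([nh]))]


# An all-zero 4-octet control word starts with nibble 0, so it never aliases.
ZERO_CONTROL_WORD = bytes(4)
```

Under these assumptions any Next Header value in the ranges 64 to 79 or 96 to 111 placed directly after the label stack would look like an IP packet to such a node, whereas the same header behind a control word would not.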

>  
>    - When sending EVPN-encapsulated packets over a P2MP LSP or P2P LSP,
>      then the control word SHOULD NOT be used.

Given the feedback we got in PALS from the operator community about ECMP on PWs without CW, I wonder what is happening on these networks?

>  
> This draft-zzhang allows the context to be determined from the extended “identification” field or from the outer header.

So you have added Identification, and I agree that with unsolicited packets as opposed to the P2P PW case it is needed.

I am not sure about the rest of the design, and in particular I am not sure how you deal with reassembly lockup. That was a problem that could occur in the case of PWs because the reassembly buffer could be considered part of the PW which was something we provisioned. If we had a corruption in the network, then it would all work itself out. With an arbitrary fan in and no preprovisioning, I am not sure what the error behaviour will be. This is something that you certainly need to discuss in the text.
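
As one possible shape for that discussion, here is a minimal Python sketch of a reassembly table that is bounded in size and abandons a partial packet a fixed time after its first fragment arrived, in the way IPv6 abandons reassembly 60 seconds after the first-arriving fragment. The class name, the default limits, and the drop-when-full policy are all assumptions for illustration and are not taken from the draft.

```python
import time


class BoundedReassemblyTable:
    """Track partial packets with a hard cap and a per-packet deadline."""

    def __init__(self, max_contexts=1024, timeout=2.0, clock=time.monotonic):
        self._max = max_contexts
        self._timeout = timeout
        self._clock = clock
        self._entries = {}  # key -> (deadline, list of fragments)

    def __len__(self):
        return len(self._entries)

    def expire(self, now=None):
        """Drop every partial packet whose deadline has passed; return the count."""
        now = self._clock() if now is None else now
        stale = [k for k, (deadline, _) in self._entries.items() if deadline <= now]
        for k in stale:
            del self._entries[k]
        return len(stale)

    def touch(self, key, fragment):
        """Store a fragment; return False if the table is full and it was dropped."""
        now = self._clock()
        self.expire(now)
        if key not in self._entries:
            if len(self._entries) >= self._max:
                return False  # refuse new contexts rather than evict live ones
            # The deadline is set by the first fragment and is never extended.
            self._entries[key] = (now + self._timeout, [])
        self._entries[key][1].append(fragment)
        return True
```

With a provisioned PW the equivalent of max_contexts is implicit in the provisioning; with arbitrary fan-in it becomes a resource limit that the text would need to specify, along with what happens to fragments that arrive once it is reached.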

> In addition, it is “generic” such that it can be used for any situations where fragmentation is needed at any layer for any solution.

I am not convinced that the situations are real. I think there needs to be some more context on the deployments. Should be just fine.

BTW I note that this is a TSVWG draft. I am not sure it belongs there. I would have thought that it belonged in the PALS/MPLS WGs or at least somewhere in RTG, which is where EVPN was developed.

Hopefully the chairs have allowed enough time to discuss next Friday.

Best regards

Stewart

>  
> Thanks.
>  
> Jeffrey
>  
> From: Andrew G. Malis <agmalis@gmail.com>
> Sent: Tuesday, November 10, 2020 10:33 AM
> To: Stewart Bryant <stewart.bryant@gmail.com>
> Cc: draft-zzhang-tsvwg-generic-transport-functions@ietf.org; mpls <mpls@ietf.org>; pals@ietf.org
> Subject: Re: [Pals] Mail regarding draft-zzhang-tsvwg-generic-transport-functions
>  
> Indeed, this is an already-solved problem. 
>  
> Cheers,
> Andy
>  
>  
> On Tue, Nov 10, 2020 at 9:57 AM Stewart Bryant <stewart.bryant@gmail.com> wrote:
> Please can I draw the attention of the authors to https://tools.ietf.org/html/rfc4623
>  
> This standards track RFC specifies how you can send a fragmented Ethernet frame over a PW in an MPLS network and would seem applicable to the problem that you address in your draft.
>  
> BR 
>  
> Stewart