Re: [Pce] WG Last Call for draft-ietf-pce-segment-routing-ipv6-15

Cheng Li <c.l@huawei.com> Fri, 19 May 2023 10:17 UTC

Return-Path: <c.l@huawei.com>
X-Original-To: pce@ietfa.amsl.com
Delivered-To: pce@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 75690C14CE39 for <pce@ietfa.amsl.com>; Fri, 19 May 2023 03:17:43 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -4.196
X-Spam-Level:
X-Spam-Status: No, score=-4.196 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, RCVD_IN_DNSWL_MED=-2.3, RCVD_IN_ZEN_BLOCKED_OPENDNS=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001, URIBL_DBL_BLOCKED_OPENDNS=0.001, URIBL_ZEN_BLOCKED_OPENDNS=0.001] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([50.223.129.194]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id kAW-DpXxTyji for <pce@ietfa.amsl.com>; Fri, 19 May 2023 03:17:38 -0700 (PDT)
Received: from frasgout.his.huawei.com (frasgout.his.huawei.com [185.176.79.56]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id F2677C14CEF9 for <pce@ietf.org>; Fri, 19 May 2023 03:17:37 -0700 (PDT)
Received: from lhrpeml100002.china.huawei.com (unknown [172.18.147.200]) by frasgout.his.huawei.com (SkyGuard) with ESMTP id 4QN2mG5y11z6D93l for <pce@ietf.org>; Fri, 19 May 2023 18:15:42 +0800 (CST)
Received: from dggpemm500005.china.huawei.com (7.185.36.74) by lhrpeml100002.china.huawei.com (7.191.160.241) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.23; Fri, 19 May 2023 11:17:33 +0100
Received: from dggpemm500003.china.huawei.com (7.185.36.56) by dggpemm500005.china.huawei.com (7.185.36.74) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.23; Fri, 19 May 2023 18:17:31 +0800
Received: from dggpemm500003.china.huawei.com ([7.185.36.56]) by dggpemm500003.china.huawei.com ([7.185.36.56]) with mapi id 15.01.2507.023; Fri, 19 May 2023 18:17:31 +0800
From: Cheng Li <c.l@huawei.com>
To: "adrian@olddog.co.uk" <adrian@olddog.co.uk>, "julien.meuric@orange.com" <julien.meuric@orange.com>, "pce@ietf.org" <pce@ietf.org>
Thread-Topic: [Pce] WG Last Call for draft-ietf-pce-segment-routing-ipv6-15
Thread-Index: AQHZP9JEuFtmdfudoESX7FgP8diOxq7X2sQAgAw8WkCACampAIBzw4hw
Date: Fri, 19 May 2023 10:17:31 +0000
Message-ID: <cfd362485caa4921a552c41fb4bed71f@huawei.com>
References: <89861225-cfa6-19e0-c40f-27f3305650ea@orange.com> <0ad301d94570$43fdcae0$cbf960a0$@olddog.co.uk> <a0fdf039fd624d69b86723bbe25f7a53@huawei.com> <043301d95063$460027e0$d20077a0$@olddog.co.uk>
In-Reply-To: <043301d95063$460027e0$d20077a0$@olddog.co.uk>
Accept-Language: zh-CN, en-US
Content-Language: zh-CN
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
x-originating-ip: [10.50.76.121]
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-CFilter-Loop: Reflected
Archived-At: <https://mailarchive.ietf.org/arch/msg/pce/iDDH0f-Vrzz4fiPqcQceogs22LQ>
Subject: Re: [Pce] WG Last Call for draft-ietf-pce-segment-routing-ipv6-15
X-BeenThere: pce@ietf.org
X-Mailman-Version: 2.1.39
Precedence: list
List-Id: Path Computation Element <pce.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/pce>, <mailto:pce-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/pce/>
List-Post: <mailto:pce@ietf.org>
List-Help: <mailto:pce-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/pce>, <mailto:pce-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 19 May 2023 10:17:43 -0000

Hi Adrian,

Sorry for my delay. Please see my reply inline.

You can find the modifications and diff here: https://github.com/muzixing/IETF-PCEP-SRV6/commit/56834cd1b09eacee0ef48e874b48956438ae1333?diff=split

Thanks,
Cheng



-----Original Message-----
From: Adrian Farrel <adrian@olddog.co.uk> 
Sent: Tuesday, March 7, 2023 3:39 AM
To: Cheng Li <c.l@huawei.com>; julien.meuric@orange.com; pce@ietf.org
Cc: Lizhenbin <lizhenbin@huawei.com>
Subject: RE: [Pce] WG Last Call for draft-ietf-pce-segment-routing-ipv6-15

> Many thanks for your comments, I have accepted most of the comments 
> from you, and would like to discuss with you about the rest. Please 
> see my reply inline.

Great. Thanks, Cheng. Continuing the discussion in line.
Snipped all of the resolved stuff.

> Because we have a lots of comments. It may be good to address them by 
> several updates. So we can close completed comments and work on the 
> remain ones.

Of course.
I'm sure you can be relied on to keep track of what you still have to address.

A

===

4.1.1

   If a PCEP speaker includes PST=3 (early allocated by
   IANA) in the PST List of the PATH-SETUP-TYPE-CAPABILITY TLV then it
   MUST also include the SRv6-PCE-CAPABILITY sub-TLV inside the PATH-
   SETUP-TYPE-CAPABILITY TLV.

Two questions:

   a. What happens if PST=3 is present but SRv6-PCE-CAPABILITY is
      missing?
[Cheng2] Please see https://datatracker.ietf.org/doc/html/draft-ietf-pce-segment-routing-ipv6-16#section-5.1 it will be treated as an error.
"
If a PCEP speaker receives a PATH-SETUP-TYPE-CAPABILITY TLV with a PST list containing PST=3, but the SRv6-PCE-CAPABILITY sub-TLV is absent, then the PCEP speaker MUST send a PCErr message with Error-Type = 10 (Reception of an invalid object) and Error-Value = 34 (Missing PCE-SRv6-CAPABILITY sub-TLV) and MUST then close the PCEP session. If a PCEP speaker receives a PATH-SETUP-TYPE-CAPABILITY TLV with an SRv6-PCE-CAPABILITY sub-TLV, but the PST list does not contain PST=3, then the PCEP speaker MUST ignore the SRv6-PCE-CAPABILITY sub-TLV.
"


   b. What happens if SRv6-PCE-CAPABILITY is present, but PST=3 was not
      included? (You don't say that that is not allowed.)
[Cheng2] See above, the TLV will be ignored.


It's possible that a forward pointer to Section 5 is enough (I didn't cross-check this).

[CL] It seems like we need to specify the details of error handling.

[AF] Right! 3% of the function, 30% of the text ;-)

---

4.3.1  Flags

A pointer to Section 9.2 would be useful.
[CL] May not need to do so. Too many pointers I think.

[AF] Well, OK. I won't insist.

---

4.3.1

What is the interaction between the L flag in the subobject header and the S bit in the Flags field?
[CL] L flag is for loose path. S flag is saying that the SID value is absent. The PCC may allocate a value according to the NAI.

[AF] I see...

   The 'L' Flag: Indicates whether the subobject represents a loose-hop
   (see [RFC3209]).  If this flag is set to zero, a PCC MUST NOT
   overwrite the SID value present in the SRv6-ERO subobject.
   Otherwise, a PCC MAY expand or replace one or more SID values in the
   received SRv6-ERO based on its local policy.

...and...

   *  S: When this bit is set to 1, the SRv6-SID value in the subobject
      body is absent.  In this case, the PCC is responsible for choosing
      the SRv6-SID value, e.g., by looking up in the SR-DB using the NAI
      which, in this case, MUST be present in the subobject.  If the S
      bit is set to 1 then F bit MUST be set to zero.

How can you overwrite the SID value in the SRv6-ERO subobject if it is absent?
[Cheng2]If the ERO subobject is absent, we can not overwrite it. We can overwrite it(Actually, it is not overwrite, because the value is not provided) only when the ERO subobject presents. The PCC will allocate a SID value for this ERO subobject according to the NAI or other info.

---

4.3.1 / 4.3.1.1

Figure 2 shows the "SID Structure" field as 32 bits, but Figure 3 is pretty clear that it is 64 bits. I think Figure 2 is broken.

[AF] You didn't comment, but I think it is an easy fix.
[Cheng2]Fixed, 64 bits.
---

4.4.1

Is the SRv6-SID field optional in the RRO subobject? The figure implies it must be present, and that would seem pretty reasonable for an RRO.
But the text says all the fields are exactly as in the ERO, and the S flag seems to still have meaning.

There is the same problem with the length of the SID structure field.
[Cheng] I think you are right, it must be present. So we will not need S-flag any more. Let's see how to handle this.

[AF] OK. Awaiting dsicussions.
[Cheng2] It is optional. Though when the SID value is absent, it looks weird. But we can keep it as the same as ERO, which means we can unset S-bit if SRv6 SID value is required all the time.
We can add "optional" in the figure to sync up with ERO.   
---


5.1

   If a PCE
   receives an SRv6-PCE-CAPABILITY sub-TLV with the X flag set to 1 then
   it MUST ignore any MSD-Type, MSD-Value fields and MUST assume that
   the sender can impose any length of SRH.

Is that really "MUST assume" or just "may assume"?  That would seem to be consistent with...

   However, if a PCE learns these via different means, e.g
   routing protocols, as specified in:
   [I-D.ietf-lsr-ospfv3-srv6-extensions];
   [I-D.ietf-lsr-isis-srv6-extensions]; [I-D.ietf-idr-bgpls-srv6-ext],
   then it ignores the values in the SRv6-PCE-CAPABILITY sub-TLV.

But there's a hole :-( If we follow the previous paragraph, the PCE may send an ERO with more SIDs than allowed by the MSD in the Open message (if the value learned from the IGP is a larger number. But...

   If a PCEP
   session is established with a non-zero SRv6 MSD value, and the PCC
   receives an SRv6 path containing more SIDs than specified in the SRv6
   MSD value (based on the SRv6 MSD type), the PCC MUST send a PCErr
   message with Error-Type = 10 (Reception of an invalid object) and
   Error-Value = TBD1 (Unsupported number of SRv6-ERO subobjects).

...says that regardless of the MSD in the IGP, the PCC must reject an ERO that contains more SIDs that specified in the Open message.

Furthermore, you have...

   If a PCC receives a list of SRv6 segments, and the number of SRv6
   segments exceeds the SRv6 MSD that the PCC can impose on the packet
   (SRH), it MUST send a PCErr message with Error-Type = 10 ("Reception
   of an invalid object") and Error-Value = TBD5 ("Unsupported number of
   SRv6-ERO subobjects") as per [RFC8664].

This last paragraph seems to cover all bases, so why not just use it?
[Cheng] em, what do you mean just use it? 

[AF] I mean, rely only on that last paragraph and delete all of the other text that seems to be confused or contradictory.
[Cheng2]Well, the previous paragraphs introduce the ways that the MSD value can be learned. The last paragraph state that how the PCC reacts when the number of SIDs exceeds the MSD. I think it is OK.
---

5.1

There seems to be a contradiction about the MSD 0 case.

You have...

   If a PCE receives an SRv6-PCE-CAPABILITY sub-TLV
   with the X flag as set, and SRv6 MSD-Type or MSD-Value fields are set
   to zero then it is considered as an error and the PCE MUST respond
   with a PCErr message (Error-Type = 1 "PCEP session establishment
   failure" and Error-Value = 1 "reception of an invalid Open message or
   a non Open message.").

[Cheng2] let's delete "to zero", because when X is set, no matter MSD-type fields MUST NOT be included, so it will be treated as an error.

...and...

   If a
   PCEP session is established with an SRv6 MSD value of zero, then the
   PCC MAY specify an SRv6 MSD for each path computation request that it
   sends to the PCE, by including a "maximum SID depth" metric object on
   the request similar to [RFC8664].

The first says you can't have a session with MSD 0, the second tells you how to act if you do.
[CL] we need more discussion here.
[AF] OK. Waiting.
[Cheng2] See above, the case is about X =1, then no MSD TLV fields are included. This part talks about when X=0 and MSD=0, then PCE need to get the MSD value, so request the PCC to send the MSD to PCE in path computation request msg.
---

5.1

   The N flag, X flag and (MSD-Type,MSD-Value) pair inside the SRv6-PCE-
   CAPABILITY sub-TLV are meaningful only in the Open message sent from
   a PCC to a PCE.  As such, a PCE MUST set the flags to zero and not
   include any (MSD-Type,MSD-Value) pair in the SRv6-PCE-CAPABILITY sub-
   TLV in an outbound message to a PCC.  Similarly, a PCC MUST ignore
   N,X flag and any (MSD-Type,MSD-Value) pair received from a PCE.  If a
   PCE receives multiple SRv6-PCE-CAPABILITY sub-TLVs in an Open
   message, it processes only the first sub-TLV received.

How does this work when the session is between two PCEs?

[CL]Will we use it for two PCEs?

[AF] Well, PCEs are allowed to talk to each other using PCEP. I guess, in that case, you would model one PCE being client and the other server, but it is possible that the roles get reversed for different path setups.

[AF] I think that, you can cover this by using...

   The N flag, X flag and (MSD-Type,MSD-Value) pair inside the SRv6-PCE-
   CAPABILITY sub-TLV are meaningful only in the Open message sent to a
   PCE.  As such, the flags MUST be set to zero and a (MSD-Type,MSD-Value)
   pair MUST NOT be present in the SRv6-PCE-CAPABILITY sub-TLV in an
   Open message sent to a PCC.  Similarly, a PCC MUST ignore
   N,X flag and any (MSD-Type,MSD-Value) pair in a received Open message.
   If a PCE receives multiple SRv6-PCE-CAPABILITY sub-TLVs in an Open
   message, it processes only the first sub-TLV received.
[Cheng2] Agree.
---

5.2.1

I find the following...

   *  If NT=0, the F bit MUST be 1, the S bit MUST be zero and the
      Length MUST be 24.

So, NT=0, F=1, S=1 is an error.

Then...

   If a PCC finds that the NT field, Length field, S bit, F bit, and T
   bit are not consistent, it MUST consider the entire ERO invalid and
   MUST send a PCErr message with Error-Type = 10 ("Reception of an
   invalid object") and Error-Value = 11 ("Malformed object").

But...

   If a PCC receives an SRv6-ERO subobject in which the S and F bits are
   both set to 1 (that is, both the SID and NAI are absent), it MUST
   consider the entire ERO invalid and send a PCErr message with Error-
   Type = 10 ("Reception of an invalid object") and Error-value = TBD3
   ("Both SID and NAI are absent in the SRv6-ERO subobject").

Two error messages for the same error?
[CL] further discussion.

[AF] OK. Waiting.
[Cheng2]The first paragraph is mainly for all the cases of Malformed objects. The second paragraph is specifically for Both SID and NAI are absent.

---

5.2.1

   If a PCC receives an SRv6-ERO subobject in which the S bit is set to
   1 and the F bit is set to zero (that is, the SID is absent and the
   NAI is present), but the PCC does not support NAI resolution, it MUST
   consider the entire ERO invalid and send a PCErr message with Error-
   Type = 4 ("Not supported object") and Error-value = 4 ("Unsupported
   parameter").

This is good. But it seems harsh for the poor PCE. How was the PCE supposed to know that the PCC didn't support NAI resolution?
[CL] further discussion.

[AF] OK. Waiting.
[Cheng2] Well, this is only for the case that the PCC does not support NAI resolution. It just tells the PCE that it does not support. Then the PCE may send some ERO with SID value then. It is not required for a PCE to know this before sending the ERO.
---

5.2.2

There is nothing wrong with what is written here, but it may be very helpful to PCC implementations if you make it clear that the SID list in an SRH is arranged in the reverse order to the ERO subobject list.
[CL] further discussion.

[AF] OK. Waiting
[Cheng2]yes reverse. But it may be OK here, because the PCC will put the SIDs in the right order into the SRH, it knows which one is the first SID.

---

5.3

Here, too, it would be really helpful to be absolutely clear about the order of subobjects, and to compare this to the order of the SIDs in the SRH.
[CL] further discussion.

[AF] OK. Waiting
[Cheng2]I think the problem comes from SRH not PCEP. Because PCEP tells the PCC the SID list in the right order by ERO, and it is recorded in the RRO in the same way. But it may be inserted in the reverse way in the SRH only because of the encoding rules of SRH.

---
5.3

RFC 8664 makes some wise observations about how a PCC constructs an RRO (how can it know?) and suggests that the mechanisms are out of scope of the RFC.
I suggest you look at that text and write something similar here.
[CL] further discussion.

[AF] OK. Waiting
[Cheng2]

RFC8664 says
"A PCC MUST order the SR-RRO subobjects such that the first subobject relative to the beginning of the RRO identifies the first segment visited by the SR-TE LSP, and the last subobject identifies the final segment of the SR-TE LSP, that is, its endpoint." 

I think this  may be good enough. Any suggestions?

---

6.

No mention of RFC 8253?
[CL] further discussion.

[AF] OK. Waiting
[Cheng2] Since we think no further security risks are introduced, it may be ok to not exclusive list all the related RFCs? It is OK to me to mention this RFC. But it is ok to me to keep the text.
---

7.4

I thought Endpoint Behavior and SID Structure were new to this document and provided the potential for (future) verification?
[CL] further discussion.

[AF] OK. Waiting
[Cheng2]Any suggestions here?

---

9.1

Isn't the RRO in PCEP the "Reported Route Object"?
[CL] I have the same remember with yours, but it is Record Route Object.
Weird.

[AF] I think the Object Class (8) is "Reported Route Object" and the Object Type (1) is "Recorded Route"

https://www.rfc-editor.org/rfc/rfc5440.html#section-7.10
[Cheng2]Agree, fixed
---

9.1

I'm trying to find the document that makes the variation from RFC 5440 for the definition of "PCEP only" ERO and RRO subobjects. I understand why you want to do this, but what you are doing is contrary to RFC 5440.

I see that RFC 8664 dodged this by "forgetting" to mark the SR-ERO and SR-RRO subobjects as "PCEP only". 

I have started a separate thread on this on the mailing list and have copied TEAS because this seems to need some small administrative work to fix it up.
[CL] further discussion. See the discussion in ML, and update accordingly.

[AF] OK. Waiting
[Cheng2]Any update here?
---

9.2

You need to tell IANA how many bits can be allocated. Otherwise they may try to allocate bit 12 to the next request.
[CL] well, I added  '12-bit' in text to tell that explicitly.

[AF] Perfect