[Pce] A further review of draft-koldychev-pce-operational

Adrian Farrel <adrian@olddog.co.uk> Thu, 05 January 2023 21:43 UTC

Reply-To: adrian@olddog.co.uk
From: Adrian Farrel <adrian@olddog.co.uk>
To: pce@ietf.org
Cc: draft-koldychev-pce-operational@ietf.org
Date: Thu, 05 Jan 2023 21:43:11 -0000
Organization: Old Dog Consulting
Message-ID: <02bc01d9214e$b64a5230$22def690$@olddog.co.uk>
Archived-At: <https://mailarchive.ietf.org/arch/msg/pce/GZTrS5y-6nD1U7lE4h5SWKZrwgE>
Subject: [Pce] A further review of draft-koldychev-pce-operational

Hi,

This is another fly-by review as I just saw the new revision of the 
draft pop up. I think it is important and helpful that implementers of
IETF protocol work get together to document their experiences with the
technology, so thanks to the authors for their work.

However, I am concerned when implementers try to require other 
implementers to not make the same internal design mistakes that their
friends may have made. We do not need to tell people how to write code,
only how to ensure interoperability.

This document seems to be a mixture of changes to the specifications
(not all of which I agree are the right way of doing things), 
observations on how you might implement internal data structures or
stores, and observations on how protocol exchanges work. I wonder how
much of that needs to be in an IETF Standards Track document.

I'm going to respond separately on
https://mailarchive.ietf.org/arch/msg/pce/skstm9VsYiHOpjlQkQVBuBxE1xg/
because there seems to be some overlap with my previous concern and the
discussion there.

More detailed comments below.

Thanks for your work implementing and testing PCE.

Best,
Adrian

== Significant ==

3.1

   LSP-DB contains two types of objects: LSPs and Tunnels.  An LSP is
   identified by the LSP-IDENTIFIERS TLV.  A Tunnel is identified by the
   PLSP-ID in the LSP object and/or the SYMBOLIC-NAME.  See [RFC8231].

While I can appreciate that there is a need to correlate between LSPs
in the LSP-DB to know whether they are associated and to determine the
many different types of association (where sharing PLSP-ID is one such
association), I baulk at the idea that the LSP-DB contains Tunnels. Why
would it contain tunnels?

But more to the point, what is this section actually telling a PCE 
implementation? Is there some requirement that PCEs hold specific format
information?

---

3.2

   Both PCE and PCC maintain their separate copies of the LSP-DB.  The
   PCE LSP-DB is only modified by PCRpt messages, no other PCEP message
   may modify the PCE LSP-DB.  The PCC LSP-DB is built from actual
   forwarding state that PCC has installed.  PCC uses PCRpt messages to
   synchronize its local LSP-DB to the PCE.

Why must the PCC maintain a copy of the LSP-DB? Especially after the 
PCRpt has been sent? I mean, it may do (especially if the PCC is the 
head end of a soft state protocol, or if the PCC is responsible for
imposing the path information on every packet), but why is this a PCE
requirement?

---

3.2

   The PCE MUST always act on the latest state of the PCE LSP DB.

Why "MUST"?
- Is this a quote from somewhere or are you updating a specification?
- Why is the PCE not at liberty to perform "crooked" path computations?
- In what way is this an interop issue?

---

3.2

   The LSP-DB on both the PCC and the PCE only stores the actual state
   in the network, it does not store the desired state.

I agree that the PCE's LSP DB needs to store the actual state in the
network. Why, however, do you mandate that it cannot also store the
desired state? 

In the case of multiple simultaneous transitions of related LSPs it
could be very useful to know both the current and intended states.

And anyway, why is this an interop issue?
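
As a minimal sketch of what "knowing both the current and intended
states" could look like, the Python below keeps a desired path alongside
the reported one in a single LSP-DB entry. The class name, field names,
and the representation of a path as a tuple of hop names are all
assumptions made for illustration; nothing here comes from the draft or
from RFC 8231.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Path = Tuple[str, ...]  # hypothetical: a path as an ordered tuple of hop names


@dataclass
class LspEntry:
    """One hypothetical LSP-DB entry holding actual and desired state."""
    plsp_id: int
    lsp_id: int
    actual_path: Path                    # as last reported by the PCC in a PCRpt
    desired_path: Optional[Path] = None  # what the PCE intends, if anything

    def in_transition(self) -> bool:
        """True while the PCE wants a path that differs from the reported one."""
        return self.desired_path is not None and self.desired_path != self.actual_path


entry = LspEntry(plsp_id=100, lsp_id=1, actual_path=("A", "B", "D"))
entry.desired_path = ("A", "C", "D")   # PCE plans to move the LSP
print(entry.in_transition())
```

Keeping the two fields separate means a PCRpt only ever overwrites
`actual_path`, so the "actual state" property the draft cares about is
preserved even though the desired state is stored next to it.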

---

3.3.1

While I can accept that the change you describe reflects what has been
implemented (and it is very important to document when we have multiple
interoperable implementations), it seems to me that you could equally 
have used "PCReq with delegation" whereby a PCC would ask for a
computation and delegate in one go. This would make the PCRpt always
say what is in the network, and the PCReq always request an explicit
action from the PCE. (I suppose I'm a little grumpy that you have a "Hey,
this is what we built, so standardise it" approach to consensus.)

I would note that under previous definitions, a PCE receiving a PCRpt
with delegation is not obliged to actually do anything! It can, of
course, issue an update in a new PCUpd whenever it wants, but it is not
obliged to unless it thinks it is the right time. This, I think, is what
lies behind the text you quote from 8231. You are changing this by 
saying that a PCRpt with no ERO or an empty ERO has an explicit meaning
equivalent to explicitly requesting PCE action.

Before closing on this (and acknowledging that I prefer my proposal :-)
I'd like you to think about whether a PCRpt with an empty ERO might 
sometimes be totally valid in the <intended-path> of a PCRpt. For 
example, if a PCC (for whatever reason) launched a path into the network
without an ERO (say it's using RSVP-TE) and learns the installed path
from the RRO, then wouldn't the subsequent delegation look like "empty
ERO with full RRO" (i.e., empty <intended-path> with full
<actual-path>)? In that case, are you actually asking the PCE to 
immediately compute a path and issue PCUpd? Maybe you are. Maybe you 
want the PCE to pin the path that has previously been soft-computed in the
network. But will it pick that one (looking at the RRO) or will it look
at the empty ERO and make its own computation?
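
The ambiguity can be made concrete with a small Python sketch. It
assumes a hypothetical PCE that must choose, on receiving a delegated
PCRpt with an empty ERO, between pinning the path seen in the RRO and
computing afresh. The policy names, the function, and the idea that this
is a configurable policy at all are illustrative assumptions, not
behaviour taken from the draft.

```python
from enum import Enum
from typing import Callable, List, Optional

Hop = str  # hypothetical: a hop named by a plain string


class EmptyEroPolicy(Enum):
    PIN_REPORTED = "pin"      # adopt the path recorded in the RRO
    RECOMPUTE = "recompute"   # ignore the RRO and compute afresh


def path_for_pcupd(ero: List[Hop], rro: List[Hop], policy: EmptyEroPolicy,
                   compute: Callable[[], List[Hop]]) -> Optional[List[Hop]]:
    """Return the path to send in a PCUpd, or None if no update is implied."""
    if ero:
        return None            # non-empty ERO: no implied request in this sketch
    if policy is EmptyEroPolicy.PIN_REPORTED and rro:
        return list(rro)       # pin what the network already chose
    return compute()           # otherwise run a fresh computation


rro = ["A", "B", "D"]
fresh = lambda: ["A", "C", "D"]   # stand-in for a real path computation
print(path_for_pcupd([], rro, EmptyEroPolicy.PIN_REPORTED, fresh))
print(path_for_pcupd([], rro, EmptyEroPolicy.RECOMPUTE, fresh))
```

The two calls return different paths for the same PCRpt contents, which
is exactly the choice the text needs to settle.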

Maybe, reading further through the section, you intend the Oper-Down 
flag to have more significance.

I note here that your example shows an LSP getting into the PCE's LSP-DB
in a way that shows its presence without its path. This does not quite
contradict your previous statement about not showing intended paths, but
it is hard to reconcile with your statement that the LSP-DB only stores
the actual state.

---

What is 3.4 adding to the discussion?

---

I'd also ask what 3.5 adds.
But I note that 3.5 has a stage in it (old LSP up, new LSP down) that
doesn't appear in 3.4. That makes the two cases seem very different
when, in fact, they only differ in the last steps:
- new goes up
  old is removed
- new is removed



== Minor ==

Title

The title claims "Operational Clarification" and I see where you are
coming from. But the Abstract reports this as clarifications to the
protocol based on interop, and that suggests "Implementation 
Clarifications".

It would be good to get this clear and aligned.

---

Abstract

   This document updates, simplifies and clarifies certain aspects of
   the PCEP protocol.

The use of "updates" is a trigger! The implication is that you are
proposing updates to existing RFCs. You are totally allowed to do that,
but you need to set them out both in the metadata, and in the Abstract
and Introduction.

It looks like you mean to at least update 8231.

---

Section 3

   Alternatively, we could rename LSP to "Instance".

Well, you don't appear to have followed this route. Except for in 3.1 
where you say an LSP is an instance of a tunnel.

It's true. So much of the PCE work (even 7399) pre-dates segment 
routing. But you could take the approach of 8664, or more precisely,
you could borrow from draft-ietf-pce-segment-routing-ipv6 which has:

   Further, note that the term LSP used in the PCEP specifications,
   would be equivalent to a SRv6 Path (represented as a list of SRv6
   segments) in the context of supporting SRv6 in PCEP.

Then you can move on, and not get tied in knots.

---

What is 3.4 adding to the discussion?

Does it in any way affect interoperability?

---

Section 4 starts with:

   PCEP Association is a group of zero or more LSPs.

What exactly is "an association of zero LSPs"? I can understand why
you might have an association group with just one LSP while you are in
the process of adding more LSPs. But I can't see what it means to have
an association of zero LSPs.

Obviously, an implementation may decide to allocate association IDs for
particular purposes, but not have populated them with LSPs. But that is
an implementation thing and nothing to do with the protocol. Indeed,
there is no way to communicate an association ID in the absence of a 
PCEP message that communicates about an LSP.

---

As idnits points out, Section 4.1 has "NOT REQUIRED" which is not a
BCP 14 phrase. You could...

OLD
   PCC updates the first LSP, the PCC is NOT REQUIRED to send the
   ASSOCIATION object in this PCRpt, since the LSP is already in the
   Association.
NEW
   PCC updates the first LSP, it is OPTIONAL for the PCC to send the
   ASSOCIATION object in this PCRpt, since the LSP is already in the
   Association.
END

However, be careful! Are you defining new behavior here or quoting an 
existing behavior? If it's new then you need to be clear what you're
updating. If it is neither (i.e. a clarification of what is already
written, or even just a restatement), then maybe you don't need BCP 14
language.

Personally, I find 6.3.1 of 8697 pretty clear:
   When an LSP is first reported to the PCE, the PCRpt message MUST
   include all the association groups that it belongs to.  Any
   subsequent PCRpt message SHOULD include only the associations that
   are being modified or removed.
Why was there any need to comment on this in your document?

---

At the end of 4.1 you have...

   PCC decides to remove the first LSP from the Association, but not
   delete the LSP itself.  PCC sends PCRpt(R-FLAG=0, PLSP-ID=100, LSP-
   ID=1, ASSO_PARAM=A, ASSO_R_FLAG=1).  The PCE ASSO DB is now empty.

     +---------------------------------------------------------------+
     | ASSO            | LSP                                         |
     +-----------------+---------------------------------------------+
     | ASSO_PARAM=A    |                                             |
     +---------------------------------------------------------------+

                     Figure 13: Content of PCE ASSO DB

But what you've shown in Figure 13 is not an empty ASSO DB. It is an
ASSO DB with an Association in it, where that Association contains no
LSPs.
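
The distinction is easy to see in a minimal Python sketch that models
the ASSO DB as a mapping from association to member LSPs. The dictionary
layout, the `report` helper, and the `prune_empty` switch are
assumptions for illustration only; the PLSP-ID and LSP-ID values are the
ones used in the draft's example.

```python
from typing import Dict, Set, Tuple

LspKey = Tuple[int, int]  # (PLSP-ID, LSP-ID)


def report(db: Dict[str, Set[LspKey]], asso: str, lsp: LspKey,
           remove: bool = False, prune_empty: bool = False) -> None:
    """Apply one PCRpt's association information to a hypothetical ASSO DB."""
    if not remove:
        db.setdefault(asso, set()).add(lsp)
        return
    members = db.get(asso)
    if members is None:
        return
    members.discard(lsp)
    if prune_empty and not members:
        del db[asso]  # only now is the DB itself empty


asso_db: Dict[str, Set[LspKey]] = {}
report(asso_db, "A", (100, 1))
report(asso_db, "A", (100, 1), remove=True)
print(asso_db)  # {'A': set()}  -> what Figure 13 draws

pruned_db: Dict[str, Set[LspKey]] = {}
report(pruned_db, "A", (100, 1))
report(pruned_db, "A", (100, 1), remove=True, prune_empty=True)
print(pruned_db)  # {}  -> what "The PCE ASSO DB is now empty" says
```

Either outcome may be a reasonable implementation choice, but the text
and the figure currently describe different ones.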

---

Section 4.2 begins with

   Below, we give an example to illustrate how a Tunnel goes through MBB
   and switches from Association A to Association B.

But where does it say that tunnels are associated? 8697 is quite clear
that associations are of LSPs (indeed, it gives the example of 
associating multiple LSPs from the same tunnel). And your section 4
begins with "PCEP Association is a group of zero or more LSPs."
(Although, refer back to my comment on that point!)

Having read the whole of 4.2, I wonder what it adds to interop or
clarification of the specs.

---

5.

   For any PCEP object that does not have an explicit removal flag, the
   absence of that object indicates removal of the constraint specified
   by that object.

For clarity, you mean...

   For any PCEP object that specifies a path computation constraint and 
   that does not have a defined explicit removal flag, the absence of 
   that entire object on a repeat or follow-up message indicates removal 
   of the constraint previously specified by that object.

In the text that follows, you might replace "state-report" with the 
actual PCEP message you are referring to.

However, isn't this a bit confusing wrt delegation? You appear to be 
saying that a PCC that reports a set of computation constraints on a
PCRpt could change them on a later PCRpt. In the case where there is no
delegation, the computation constraints are not of any great value to 
the PCE and, while it might record them in an LSP-DB, I don't see what 
it would use them for. In the case where the first PCRpt delegates the
LSP, the second appears to be trying to re-assume control and change
the parameters without revoking delegation.

So, what I think you are really saying is that "As well as reporting
any state change in the network on a PCRpt message, a PCC may also 
change the parameters of a delegated LSP. For example, it may remove or
modify the computation constraints that it wishes the PCE to apply as it
computes any updated paths in the future. For any PCEP object that
specifies a path computation constraint and that does not have a defined
explicit removal flag, the absence of that entire object on a repeat or
follow-up message indicates removal of the constraint previously
specified by that object."

I'd agree with you that 8231 is not crystal clear on that point,
although it seemed a bit obvious to me. Making a clarifying statement
could be helpful in an explicit update to that RFC.
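
A minimal Python sketch of the rule as reworded above, assuming the PCE
keeps the constraints of a delegated LSP as a mapping from object name
to value. The object names are real PCEP objects, but the string values,
the `HAS_REMOVAL_FLAG` set, and the `apply_report` helper are
hypothetical and chosen only to illustrate the absence-means-removal
behaviour.

```python
from typing import Dict

# Illustrative assumption: objects listed here carry their own removal flag,
# so their absence from a later PCRpt does not imply removal.
HAS_REMOVAL_FLAG = {"ASSOCIATION"}


def apply_report(stored: Dict[str, str], reported: Dict[str, str]) -> Dict[str, str]:
    """Return the constraint set in force after a follow-up PCRpt."""
    # Keep only absent objects that have an explicit removal mechanism.
    kept = {name: value for name, value in stored.items()
            if name in HAS_REMOVAL_FLAG and name not in reported}
    # Everything present in the new report replaces the old value.
    kept.update(reported)
    return kept


first = {"METRIC": "igp<=30", "LSPA": "exclude-any=0x1", "ASSOCIATION": "A"}
second = {"METRIC": "igp<=50"}  # LSPA omitted, so that constraint is removed
print(apply_report(first, second))  # {'ASSOCIATION': 'A', 'METRIC': 'igp<=50'}
```

The sketch deliberately does not model how a removal flag itself is
processed; it only shows that omission is treated differently for the
two classes of object.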

---

Section 6 has:

   the values of an
   SR-ERO/SRv6-ERO and SR-RRO/SRv6-RRO (respectively) are in practice
   the same

I know why you say that, but you should explain in the text with
"because there is no feedback mechanism in the forwarding or data plane
that can collect and report the recorded route of a Segment Routing path
when there were loose hops in the explicit route specified."

That certainly seems to be where you are going with the final paragraph 
of that section.

   The following applies to SR-TE only.  If both ERO and RRO are present
   for the same LSP, it SHOULD be interpreted as the RRO being the
   actual path the LSP is taking but MAY interpret only the ERO as the
   actual path.  In the absence of RRO a PCE MUST interpret the ERO as
   the actual path for the LSP.  Until SR-TE introduces some form of
   signaling similar to RSVP-TE, the use of RRO is discouraged for SR-TE
   LSPs.

Of course, you only need to make this clarification for SR-MPLS-TE
because [I-D.ietf-pce-segment-routing-ipv6] is a work in progress and 
you can make the text of that document precise.

That leaves us with:
- What about management tools and OAM that can report a more precise 
  path than in the original ERO?
- What does "the use of RRO is discouraged" mean?
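
For reference, the interpretation rules in the quoted paragraph can be
sketched in a few lines of Python. The function name, the `prefer_rro`
switch (standing in for the SHOULD/MAY choice), and the use of plain
lists of strings for the ERO and RRO are assumptions for illustration.

```python
from typing import List


def actual_path(ero: List[str], rro: List[str], prefer_rro: bool = True) -> List[str]:
    """Pick what a PCE treats as the actual path of an SR-TE LSP."""
    if not rro:
        return list(ero)  # no RRO: the ERO MUST be read as the actual path
    # Both present: SHOULD use the RRO, MAY use only the ERO.
    return list(rro) if prefer_rro else list(ero)


ero = ["A", "D"]        # loose hops as requested
rro = ["A", "B", "D"]   # a more precise path, e.g. learned via management or OAM
print(actual_path(ero, []))
print(actual_path(ero, rro))
print(actual_path(ero, rro, prefer_rro=False))
```

The second and third calls show that two conforming PCEs can disagree on
the actual path of the same LSP, which is why the meaning of
"discouraged" matters.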


== Nits == 

You might move the Requirements Language away from the document header
and place it as Section 2.1. The RFC Editor would likely do this anyway, 
but you can tidy the document yourselves.

---

Throughout you have "PCEP protocol" which perhaps could be abbreviated
as PCEPP so that you could later write "PCEPP protocol" :-)

---

Section 2 is fine, but you are not defining these terms here, I think.
Thus you should include citations for each term.

For example, PCC and PCE are clearly lifted from 4655.

---

Section 2 is missing "LSP-DB"

---

Not sure whether the term "ERO object" etc. is an over-statement, given
that the "O" in ERO already stands for "Object".
I recall running into this all the time as we wrote 5440, and we solved
it by either expanding the abbreviation when we wanted to use the word
object, or sticking with the simple abbreviation.