[sfc] Rtgdir last call review of draft-ietf-sfc-multi-layer-oam-23

Darren Dukes via Datatracker <noreply@ietf.org> Mon, 08 May 2023 15:39 UTC

From: Darren Dukes via Datatracker <noreply@ietf.org>
To: rtg-dir@ietf.org
Cc: draft-ietf-sfc-multi-layer-oam.all@ietf.org, last-call@ietf.org, sfc@ietf.org
Reply-To: Darren Dukes <ddukes@cisco.com>
Date: Mon, 08 May 2023 08:39:16 -0700
Archived-At: <https://mailarchive.ietf.org/arch/msg/sfc/xhZ_TQnLp5ebgvnD7R-C0M73p-M>

Reviewer: Darren Dukes
Review result: Has Issues

Below is my review of draft-ietf-sfc-multi-layer-oam as part of the routing
area directorate on behalf of the ADs. I have some high level concerns and a
detailed review that follows.

MAJOR: Fate sharing is listed as a SHOULD; should it perhaps be a MUST? Does OAM
work for SFC if it doesn't follow the same chain as traffic?

MAJOR: The document does not list Performance Measurement as a requirement,
though it is mentioned in the introduction and in RFC8924.

MAJOR: The document claims the SFC echo request/reply resolves all 8
requirements without stating precisely how. I don't doubt that it does, but I
would benefit from a clear description of how.

MAJOR: The Security Considerations look relatively complete, but the Security
considerations of RFC7665 should be re-evaluated in this specification to
identify areas of additional attack or exposure with echo req/rep and the tools
that may be used within the limited NSH domain, or outside it, to diagnose
failures.

MINOR: The Operational Considerations section seems light.  For an OAM
specification I was expecting more detail on the "Operation" side of this
specification.

NIT: The use of forward references requires a reader/implementer to place a lot
on their stack to follow the references.

Below is a more detailed review with the following format:
```
  quoted text
```
** followed by my comments/questions.

```
       This document defines how active Operation, Administration and
       Maintenance (OAM), per [RFC7799] definition of active OAM, is
       identified when Network Service Header (NSH) [RFC8300] is used as the
       SFC encapsulation.
```

** Does this document define how OAM is identified? Perhaps the right word is
"implemented"?

```
       Active OAM
       tools, conformant to the requirements listed in Section 3, improve,
       for example, troubleshooting efficiency and defect localization in
       SFP because they specifically address the architectural principles of
       NSH.
```
** Should "conformant to the requirements in Section 3" be "conformant to this
specification"? I don't see how tools can conform to the requirements.

** Do the tools address "the architectural principles of NSH"? If so, what are
the principles that need addressing, and how are they addressed?

```
       Active OAM
       tools, conformant to the requirements listed in Section 3, improve,
       for example, troubleshooting efficiency and defect localization in
       SFP because they specifically address the architectural principles of
       NSH.  For that purpose, SFC Echo Request and Echo Reply are specified
       in Section 6. This mechanism enables on-demand Continuity Check and
       Connectivity Verification among other operations over SFC in networks
       addresses functionalities discussed in Sections 4.1, 4.2, and 4.3 of
       [RFC8924].  SFC Echo Request and Echo Reply, defined in this
       document,
```

** s/This mechanism/These mechanisms/

** s/defined in this document// - It's just been said where they are defined

```
      Following are the requirements for an FM SFC OAM, whether
       with the E2E or segment scope:

          REQ#1: Packets of active SFC OAM SHOULD be fate sharing with the
          monitored SFC data in the forward direction from ingress toward
          egress endpoint(s) of the OAM test.
```

** Since this is a SHOULD, what is the consequence of not doing this?

```
    1.  Active SFC OAM Header

       As demonstrated in Section 4 [RFC8924] and Section 3 of this
       document, SFC OAM is required to perform multiple tasks.  Several
```

** Requirements are stated a few lines up, no need for a reminder.

** s/As demonstrated in Section 4 [RFC8924] and Section 3 of this document, //

```
       active OAM protocols could be used to address all the requirements.
       When IP/UDP encapsulation of an SFC OAM control message is used,
       protocols can be demultiplexed using the destination UDP port number.
       But extra IP/UDP headers, especially in an IPv6 network, add
       noticeable overhead.  This document defines Active OAM Header
       (Figure 2) to demultiplex active OAM protocols on an SFC.

      0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       | V | Msg Type  |     Flags     |          Length               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       ~              SFC Active OAM Control Packet                    ~
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                          Figure 2: SFC Active OAM Header

          V - two-bit-long field indicates the current version of the SFC
          active OAM header.  The current value is 0.  The version number is
          to be incremented whenever a change is made that affects the
          ability of an implementation to parse or process the SFC Active
          OAM header correctly.  For example, if syntactic or semantic
          changes are made to any of the fixed fields.

          Msg Type - six bits long field identifies OAM protocol, e.g., Echo
          Request/Reply.

          Flags - eight bits long field carries bit flags that define
          optional capability and thus processing of the SFC active OAM
          control packet, e.g., optional timestamping.  No flags are defined
          in this document, and therefore, the bit flags MUST be zeroed on
          transmission and ignored on receipt.

          Length - two octets long field that is the length of the SFC
          active OAM control packet in octets.
```

** Consistency in header field descriptions is lacking throughout the doc.

** s/V - two-bit-long field/V - two-bit field/

** s/Msg Type - six bits long field/Msg Type - six-bit field/

** etc. for all other definitions.

```
    6.  Echo Request/Echo Reply for SFC

       Echo Request/Reply is a well-known active OAM mechanism extensively
       used to verify a path's continuity, detect inconsistencies between a
       state in control and the data planes, and localize defects in the
       data plane.  ICMP ([RFC0792] for IPv4 and [RFC4443] for IPv6
       networks) and [RFC8029] are examples of broadly used active OAM
       protocols based on the Echo Request/Reply principle.  The SFC Echo
       Request/Reply defined in this document conforms to REQ#1 (Section 3)
       by using the NSH encapsulation of the monitored service.  Further,
       the mechanism addresses requirements REQ#2 through REQ#7, listed in
       Section 3.  Specifically, it can be used to check the continuity of
       an SFP, trace an SFP, or localize the failure within an SFP.  Also,
       note that REQ#8 can be addressed by an extension of the SFC Echo
       Request/Reply described in this document adding proxy capability.
       The SFC Echo Request/Reply control message format is presented in
       Figure 3.
```

** How does this echo request/reply "conform to REQ#1"? Does this mean
"satisfies REQ#1"?

** There is no justification of how REQ#2-7 are satisfied. Some expansion is
needed.

** Are back references to section 3 really needed? One presumably just read
section 3 and the requirements.

```
     The interpretation of the fields is as follows:

          Version (V) is a two-bit field that indicates the current version
          of the SFC Echo Request/Reply.  The current value is 0.  The
          version number is to be incremented whenever a change is made that
          affects the ability of an implementation to parse or process the
          control packet correctly.  If a packet presumed to carry an SFC
          Echo Request/Reply is received at an SFF, and the SFF does not
          understand the Version field value, the packet MUST be discarded,
          and the event SHOULD be logged.

          Reserved - fourteen-bit field.  It MUST be zeroed on transmission
          and ignored on receipt.

          The Echo Request Flags is a two-octet bit vector field.  A flag
          defined in the Flags field of the SFC Active OAM header in
          Figure 2 has no implication for those defined in the Echo Request
          Flags field of an Echo Request/Reply message.

          The Message Type is a one-octet field that reflects the packet
          type.  Value 1 identifies Echo Request and 2 - Echo Reply.

          The Reply Mode is a one-octet field.  It defines the type of the
          return path requested by the sender of the Echo Request.

          Return Codes and Subcodes are one-octet fields each.  These can be
          used to inform the sender about the result of processing its
          request.  Return Code values are provided in Table 1.  For all
          Return Code values defined in this document, the value of the
          Return Subcode field MUST be set to zero.
```

** Consistent type descriptions like "(field) - x bit field (description)" are
needed throughout the document.

```
    6.3.1.  Source TLV

       The responder to the SFC Echo Request encapsulates the SFC Echo Reply
       message in IP/UDP packet if the Reply mode is "Reply via an IPv4/IPv6
       UDP Packet".  Because the NSH does not identify the ingress node that
       generated the Echo Request, the source ID MUST be included in the
       message and used as the IP destination address and destination UDP
       port number of the SFC Echo Reply.  The sender of the SFC Echo
       Request MUST include an SFC Source TLV (Figure 5).

        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |  Source ID  |   Reserved1   |           Length              |
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |         Port Number         |           Reserved2           |
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |                        IP Address                           |
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                              Figure 5: SFC Source TLV

       where

          Source ID Type is a one-octet field and has the value of 1
          Section 10.4.

          Reserved1 - one-octet field.  The field MUST be zeroed on
          transmission and ignored on receipt.

          Length is a two-octet field, and the value equals the length of
          the data following the Length field counted in octets.  The value
          of the Length field can be 8 or 20.  If the value of the field is
          neither, the Source TLV is considered to be malformed.

          Port Number is a two-octet field.  It contains the UDP port number
          of the sender of the SFC OAM control message.  The value of the
          field MUST be used as the destination UDP port number in the IP/
          UDP encapsulation of the SFC Echo Reply message.

          Reserved2 is a two-octet field.  The field MUST be zeroed on
          transmit and ignored on receipt.

          IP Address field contains the IP address of the sender of the SFC
          OAM control message, IPv4 or IPv6.  The value of the field MUST be
          used as the destination IP address in the IP/UDP encapsulation of
          the SFC Echo Reply message.
```

** The TLV format of "Type" "Reserved" "Length" should not be redefined for
each TLV type. The authors should provide values for Type and Length, like...

- Type: Source ID (1)
- Length: The length of the variable length value in octets.
- Port Number: ...
- Reserved2: MBZ
- IP Address: ...

** Why is the IP address type assumed to be IPv4 or IPv6 based on length? Why
not use 4 bits for the IP version? There is space in Reserved2.

** Try to avoid processing descriptions within the field definitions of TLVs.
For example, port number "MUST be used as the destination UDP port..." This is
better placed in the echo request/reply processing section.
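
** For reference, this is roughly what the length-based inference looks like
to an implementer. It is a minimal Python sketch of parsing the Source TLV as
laid out in Figure 5, where a Length of 8 implies IPv4 and 20 implies IPv6. The
names `SourceTlv` and `parse_source_tlv` are hypothetical, not from the draft.

```python
import ipaddress
import struct
from typing import NamedTuple, Union

SOURCE_ID_TYPE = 1  # Source ID Type value quoted from the draft


class SourceTlv(NamedTuple):
    port: int
    address: Union[ipaddress.IPv4Address, ipaddress.IPv6Address]


def parse_source_tlv(data: bytes) -> SourceTlv:
    """Parse Type, Reserved1, Length, Port Number, Reserved2, IP Address."""
    if len(data) < 4:
        raise ValueError("truncated TLV header")
    tlv_type, _reserved1, length = struct.unpack("!BBH", data[:4])
    if tlv_type != SOURCE_ID_TYPE:
        raise ValueError("not a Source TLV")
    # Length counts the octets after the Length field: 2 + 2 + 4 or 2 + 2 + 16.
    if length not in (8, 20):
        raise ValueError("malformed Source TLV: Length must be 8 or 20")
    value = data[4:4 + length]
    if len(value) != length:
        raise ValueError("truncated TLV value")
    port, _reserved2 = struct.unpack("!HH", value[:4])
    # The address family is inferred only from the remaining size (4 or 16).
    return SourceTlv(port, ipaddress.ip_address(value[4:]))
```

An explicit IP version field would replace the size-based inference in the last
step with a direct check.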

```
       *  Reply via an IPv4/IPv6 UDP Packet (2).  This likely will be the
          most often used value.
```

** Why is the assumption of how often the reply via UDP is used relevant?
Again, in section 6.5 it's referred to as a 'default value'. Is this default
value relevant? Are there requirements that implementations have specific
configurable and default parameters?

```
    6.4.  SFC Echo Request Reception

       Punting a received SFC Echo Request to the control plane is triggered
       by one of the following packet processing exceptions: NSH TTL
       expiration, NSH Service Index (SI) expiration, or the receiver is the
       terminal SFF for an SFP.

       An SFF that received the SFC Echo Request MUST validate the packet as
       follows:

       1.   If the SFC Echo Request is integrity-protected, the receiving
            SFF first MUST verify the authentication.

       2.   Validate the Source TLV, as defined in Section 6.3.1.

       3.   Suppose the authentication validation has failed and the Source
            TLV is considered properly formatted.  In that case, the SFF
            MUST send to the system identified in the Source TLV (see
            Section 6.5), according to a rate-limit control mechanism, an
            SFC Echo Reply with the Return Code set to "Authentication
            failed" and the Subcode set to zero.

       4.   If the Source TLV is determined malformed, the received SFC Echo
            Request processing is stopped, the message is dropped, and the
            event SHOULD be logged, according to a rate-limiting control for
            logging.

       5.   If the authentication is validated successfully, the SFF that
            has received an SFC Echo Request verifies the rest of the
            packet's general sanity.

       6.   If the packet is not well-formed, the receiver SFF SHOULD send
            an SFC Echo Reply with the Return Code set to "Malformed Echo
            Request received" and the Subcode set to zero under the control
            of the rate-limiting mechanism to the system identified in the
            Source TLV (see Section 6.5).

       7.   If there are any TLVs that the SFF does not understand, the SFF
            MUST send an SFC Echo Reply with the Return Code set to 2 ("One
            or more TLVs was not understood") and set the Subcode to zero.
            Also, the SFF MAY include an Errored TLVs TLV (Section 6.4.1)
            that, as sub-TLVs, contains only the misunderstood TLVs.

       8.   Sender's Handle and Sequence Number fields are not examined but
            are copied in the SFC Echo Reply message.

       9.   If the sanity check of the received Echo Request succeeded, then
            the SFF at the end of the SFP MUST set the Return Code value to
            5 ("End of the SFP") and the Subcode set to zero.

       10.  If the SFF is not at the end of the SFP and the TTL value is 1,
            the value of the Return Code MUST be set to 4 ("TTL Exceeded")
            and the Subcode set to zero.

       11.  In all other cases, SFF MUST set the Return Code value to 0 ("No
            Return Code") and the Subcode set to zero.

```

** This section was confusing. Is it describing reception, processing, or
validation in the "Validation" steps?

- 3 and 5 appear to be sub-bullets of 1.
- 4 appears to be a sub-bullet of 2.
- 6 doesn't provide a definition of "well-formed", so it's not clear how that's
  checked.
- 8 does not appear to be relevant to validation and can be removed.
- 9 does not specify what 'sanity check' may have failed.
- 10 does not specify which TTL value is meant (NSH TTL?).
- 11 appears to instruct SFFs to set the return code to 0 "in all other cases".
  Is this an echo reply step?
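
** One way to read the eleven steps is as a single decision function that
yields the Return Code for the reply. The Python sketch below is my
interpretation, not the draft's algorithm. It assumes the Source TLV is checked
before authentication (a reply needs a destination), and that "TTL" in step 10
means the NSH TTL. Only the numeric codes quoted above (0, 2, 4, 5) are used;
the other two codes are kept as their quoted names because no values appear in
the quoted text. All identifiers are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Union

RC_NO_RETURN_CODE = 0
RC_TLV_NOT_UNDERSTOOD = 2
RC_TTL_EXCEEDED = 4
RC_END_OF_SFP = 5
# Numeric values for these two codes are not given in the quoted text.
RC_AUTH_FAILED = "Authentication failed"
RC_MALFORMED = "Malformed Echo Request received"


@dataclass
class EchoRequestView:
    source_tlv_ok: bool   # step 2: Source TLV is properly formatted
    auth_ok: bool         # step 1: unprotected, or authentication verified
    well_formed: bool     # steps 5-6: general sanity of the rest of the packet
    unknown_tlvs: bool    # step 7: at least one TLV not understood
    at_end_of_sfp: bool   # step 9: receiver is the terminal SFF
    nsh_ttl: int          # step 10: assumed to be the NSH TTL


def classify_echo_request(req: EchoRequestView) -> Optional[Union[int, str]]:
    """Return the Return Code for the Echo Reply, or None to drop and log."""
    if not req.source_tlv_ok:
        return None                    # step 4: no usable reply destination
    if not req.auth_ok:
        return RC_AUTH_FAILED          # step 3 (rate-limited reply)
    if not req.well_formed:
        return RC_MALFORMED            # step 6 (rate-limited reply)
    if req.unknown_tlvs:
        return RC_TLV_NOT_UNDERSTOOD   # step 7
    if req.at_end_of_sfp:
        return RC_END_OF_SFP           # step 9
    if req.nsh_ttl == 1:
        return RC_TTL_EXCEEDED         # step 10
    return RC_NO_RETURN_CODE           # step 11
```

Written this way, steps 9 through 11 look like reply construction rather than
validation.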

```
    6.5.  SFC Echo Reply Transmission

       The "Reply Mode" field directs whether and how the Echo Reply message
       should be sent.  The Echo Request sender MAY use TLVs to request that
       the corresponding Echo Reply be transmitted over the specified path.
       Section 6.5.1 provides an example of a TLV that specifies the return
       path of the Echo Reply.  Value 1 is the "Do not reply" mode and
       suppresses the Echo Reply packet transmission.  The default value (2)
       for the Reply mode field requests sending the Echo Reply packet out-
       of-band as an IPv4 or IPv6 UDP packet.
```

** Should this section not be an exhaustive description of when and how to send
echo reply messages to an echo request? For example, "Theory of Operation"
doesn't mention the Source TLV, but it had a requirement on replies: "The value
of the field MUST be used as the destination IP address in the IP/UDP
encapsulation of the SFC Echo Reply message."

** The "6.5.1 Reply Service Function Path TLV" seems out of place here. Is it a
TLV sent in a request or reply?

```
       The destination SFF of the SFP being tested or the SFF at which SFC
       TTL expired (as per [RFC8300]) may be sending the Echo Reply is
       referred to as responding SFF.  The processing described below
       equally applies to both cases.
```

** "may be sending the Echo Reply" does not parse. Is it a typo? Should it be
removed?

```
    6.5.4.  Tracing an SFP
       SFC Echo Request/Reply can be used to isolate a defect detected in
       the SFP and trace an RSP.  As with ICMP echo request/reply [RFC0792]
       and MPLS echo request/reply [RFC8029], this mode is referred to as
       "traceroute".  In the traceroute mode, the sender transmits a
       sequence of SFC Echo Request messages starting with the NSH TTL value
       set to 1 and is incremented by 1 in each next Echo Request packet.
       The sender stops transmitting SFC Echo Request packets when the
       Return Code in the received Echo Reply equals 5 ("End of the SFP").
```

** What does an implementation do when TTL wraps?
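
** To illustrate the concern, here is a minimal Python sketch of the traceroute
loop with an explicit upper bound. It assumes the sender stops at 63, the
largest value of the 6-bit NSH TTL field in RFC 8300, instead of wrapping. The
draft does not state this behavior; it is one possible answer. The `send_echo`
callback is hypothetical and stands in for sending one Echo Request and
returning the Return Code of the reply (or None on timeout).

```python
from typing import Callable, List, Optional, Tuple

NSH_TTL_MAX = 63     # largest value of the 6-bit NSH TTL field (RFC 8300)
RC_END_OF_SFP = 5    # Return Code quoted from the draft

Hop = Tuple[int, Optional[int]]


def trace_sfp(send_echo: Callable[[int], Optional[int]]) -> Tuple[List[Hop], bool]:
    """Probe with NSH TTL 1, 2, ... and report whether the SFP end was reached."""
    hops: List[Hop] = []
    for ttl in range(1, NSH_TTL_MAX + 1):
        return_code = send_echo(ttl)
        hops.append((ttl, return_code))
        if return_code == RC_END_OF_SFP:
            return hops, True
    # TTL space exhausted without seeing "End of the SFP": stop, do not wrap.
    return hops, False
```

Without a stated bound, an implementation that keeps incrementing would either
wrap to 0 or loop indefinitely when no reply ever carries Return Code 5.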

```
    6.6.  Verification of the SFP Consistency

       The consistency of an SFP can be verified by comparing the view of
       the SFP from the control or management plane with information
       collected from traversing by an SFC NSH Echo Request message.  Every
       SFF that receives a Consistency Verification Request (CVReq)
       (specified in Section 6.6.1) MUST perform the following actions:

       *  Collect information about the SFs traversed by the CVReq packet
          and send it to the ingress SFF as CVRep packet over IP network;

       *  Forward the CVReq to the next downstream SFF if the one exists.

       As a result, the ingress SFF collects information about all traversed
       SFFs and SFs, information on the actual path the CVReq packet has
       traveled.  That information can be used to verify the SFC's path
       consistency.  The mechanism for the SFP consistency verification is
       outside the scope of this document.

    6.6.1.  SFP Consistency Verification packet

       For the verification of an SFP consistency, two types of SFC Active
       OAM messages are defined in addition to the SFC Echo Request/Reply
       messages.  Their SFC Echo Request/Echo Response Message Types are as
       follows:

       *  3 - SFP Consistency Verification Request

       *  4 - SFP Consistency Verification Reply

       Upon receiving the CVReq, the SFF MUST respond with the Consistency
       Verification Reply (CVRep).  The SFF MUST include the SFs
       information, as described in Section 6.6.3 and Section 6.6.2.
```

** 6.6 specified "Every SFF that receives a Consistency Verification Request
(CVReq)" and 6.6.1 just describes those as type 3 (and 4 for reply). The
information appears duplicated; am I reading this correctly? Why is 6.6.1 a
separate section and types 3 and 4 defined in 6.6?

```
    6.6.2.  SFF Information Record TLV

       For the received CVReq, an SFF is expected to include in the CVRep
       message the information about SFs that are available from that SFF
       instance for the specified SFP.  The SFF MUST include SFF Information
       Record TLV (Figure 9) in CVRep message.
```

** "expected to include the information about SFs" or "MUST include SFF
Information Record TLV"?  Is it a MUST or just expected?