Re: [sfc] Regarding last call for draft-ietf-sfc-multi-layer-oam

Greg Mirsky <gregimirsky@gmail.com> Sat, 20 November 2021 00:11 UTC

References: <4bb5abb4-a8dc-c8f0-9b99-549f683e7729@joelhalpern.com> <05FDF1D8-6CBD-403B-8F51-88E51346A36F@cisco.com>
In-Reply-To: <05FDF1D8-6CBD-403B-8F51-88E51346A36F@cisco.com>
From: Greg Mirsky <gregimirsky@gmail.com>
Date: Fri, 19 Nov 2021 16:11:05 -0800
Message-ID: <CA+RyBmXHhjyqTtc0pVtwmTRku-SV+0cFf7tFL_xOHnQ56xBvfQ@mail.gmail.com>
To: "Carlos Pignataro (cpignata)" <cpignata@cisco.com>
Cc: Joel Halpern Direct <jmh.direct@joelhalpern.com>, "sfc@ietf.org" <sfc@ietf.org>, James N Guichard <james.n.guichard@futurewei.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/sfc/UC79Xr-piq4V9ji3hbkO_wKL55A>

Dear Carlos,
thank you for your thorough review and detailed comments. Please find
responses in-lined below under the GIM>> tag.

Regards,
Greg (on behalf of the authors)

On Sat, Nov 13, 2021 at 11:50 PM Carlos Pignataro (cpignata) <
cpignata@cisco.com> wrote:

> Hello, WG,
>
> In reviewing draft-ietf-sfc-multi-layer-oam-16, I find that the issues
> listed below are such that I cannot support publication.
>
> Observing what appears to be a single non-author response to the original
> WGLC email, and one more after this extension, I also perceive the energy
> level to work on this to be low.
>
> Please find some review comments and observations, I hope these are useful:
>
>
>                 Active OAM for Service Function Chaining
>                    draft-ietf-sfc-multi-layer-oam-16
> Abstract
>
>    A set of requirements for active Operation, Administration, and
>    Maintenance (OAM) of Service Function Chains (SFCs) in a network is
>    presented in this document.  Based on these requirements, an
>    encapsulation of active OAM messages in SFC and a mechanism to detect
>    and localize defects are described.
>
>
> First, a generic comment on the whole document: Even though the WG
> produced an SFC OAM framework in rfc8924, I cannot find exactly how
> draft-ietf-sfc-multi-layer-oam follows or maps to that framework.
>
>    - rfc8924 lists requirements in S4, but this document mentions them in
>    passing. Instead, as per the Abstract above, this document creates new
>    requirements and based on them creates a new OAM protocol.
>
GIM>> We've followed the requirements listed in RFC 8924 and used them
when designing SFC Echo Request/Reply. SFC Echo Request/Reply addresses the
essential requirements in Section 4 of RFC 8924.

>
>    - rfc8924 lists candidate SFC OAM tools, but this document does not
>    consider them or compare requirements to options. Perhaps I could be
>    pointed to the discussion on the list?
>
GIM>> RFC 8924 already provides the analysis and points out gaps in the
listed protocols. RFC 8924 concluded that none of the available tools
complies with the requirements.

>
> Additionally, I wonder: Why the file name “sfc-multi-layer-oam”?
>
GIM>> It is historical.

>
>
>    Active OAM tools,
>    conformant to the requirements listed in Section 3, improve, for
>    example, troubleshooting efficiency and defect localization in SFP
>    because they specifically address the architectural principles of
>    NSH.  For that purpose, SFC Echo Request and Echo Reply are specified
>    in Section 6.
>
>
> I do not fully follow this cause-consequence pair of sentences. They seem
> to be foundational to the rationale of the document; is this why a new OAM
> protocol is used?
>
GIM>> Indeed. Based on the analysis in RFC 8924, we've learned that none of
the available OAM tools can address the requirements for active SFP OAM.
The SFC Echo Request/Reply is specifically designed to address these
requirements.

>
> Specifically, I feel this document over-reaches in that it presumes that
> the only “Active OAM” protocol for NSH SFCs is this new protocol, whereas
> some of the existing protocols listed in rfc8924 are also “Active OAM”.
>
GIM>> I think that the document is positioned not as a general active OAM
protocol but as one of the active SFC NSH OAM protocols.

>
>    This mechanism enables on-demand Continuity Check,
>    Connectivity Verification, among other operations over SFC in
>    networks, addresses functionalities discussed in Sections 4.1, 4.2,
>    and 4.3 of [RFC8924].
>
>
> This could well be the case; however, many other mechanisms (including
> existing ones) also enable, in these broad terms, all the
> connectivity+continuity+trace functions.
>
GIM>> We are not questioning that there are other solutions. But these
mechanisms are not supported by specifications that ensure independent
interoperable implementations.

> At the same time, this mechanism is very complex.
> I would like to see a study of comparative benefits of this added
> complexity vis-a-vis existing approaches that can be extended.
>
GIM>> In the absence of sufficient, up-to-date documentation describing
proprietary solutions, I don't see how any comparison could be
comprehensive.

>
>
>    The ingress may be
>    capable of recovering from the failure, e.g., using redundant SFC
>    elements.  Thus, it is beneficial for the egress to signal the new
>    defect state to the ingress, which in this example is the Classifier.
>    Hence the following requirement:
>
>       REQ#3: SFC OAM MUST support Remote Defect Indication notification
>       by the egress to the ingress.
>
>
> I see a gap between “it is beneficial” and “MUST”. What is "Remote Defect
> Indication” in the context of SFC OAM since it is not in the OAM framework?
> Is this "Remote Defect Indication” the only way to achieve the rerouting or
> redundancy triggering?
>
GIM>> That is one of the possible solutions. Other mechanisms may conform
to the requirement using a different approach.

>
>
> 4.  Active OAM Identification in the NSH
>
>    The O bit in the NSH is defined in [RFC8300] as follows:
>
>       O bit: Setting this bit indicates an OAM packet.
>
>    This document updates that definition as follows:
>
>       O bit: Setting this bit indicates an OAM command and/or data in
>       the NSH Context Header or packet payload.
>
>    Active SFC OAM is defined as a combination of OAM commands and/or
>    data included in a message that immediately follows the NSH.  To
>    identify the active OAM message, the "Next Protocol" field MUST be
>    set to Active SFC OAM (TBA1) (Section 9.1).
>
>
> This is an example of over-reach. A “Next Protocol” pointing to IPv4, in
> turn pointing to ICMP, in turn pointing to Echo is already one example of
> “Active SFC OAM”. I wonder if this new protocol might be best served by
> choosing a name that is not so generic? It could be called “One of many
> active SFC OAM protocols” :-)
>
GIM>> We will clarify that, throughout the document, "active OAM" and
"active SFC OAM" refer to specially constructed packets that immediately
follow the SFC Active OAM Header (Figure 2).

>
> Otherwise, the “MUST” in the last sentence seems to not follow.
>
>    The rules for
>    interpreting the values of the O bit and the "Next Protocol" field
>    are as follows:
>
>
> I am extremely concerned about this attempted re-definition (of the O-bit
> and Protocol fields). On several fronts as explained below. During RFC8300
> the WG evaluated these and provided a solution already.
>
>    *  O bit set and the "Next Protocol" value does not match one of
>       identifying active or hybrid OAM protocols (per classification
>       defined in [RFC7799]), e.g., defined in Section 9.1 Active SFC OAM
>       (TBA1).
>
> This potentially breaks the concept of nodes not understanding OAM (i.e.,
> partial deployment of a new protocol).
>
GIM>> Can you clarify what you mean by "nodes not understanding OAM"?
Partial deployment is, in my opinion, an operational issue. An operator
plans deployments of new releases according to new features and their
intended use.

>
>          - a Fixed-Length Context Header or Variable-Length Context
>          Header(s) contain an OAM command or data.
>
>          - the "Next Protocol" field determines the type of payload.
>
> The semantic of Context Headers is outside this definition. For example
> the types in MD Type 2 define the variable headers.
>
> This potentially breaks also OAM, since things like ECMP can be encoded in
> context headers that the OAM needs. (e.g., "Flow ID”
> from draft-ietf-sfc-nsh-tlv).
>
GIM>> As I understand it, the MD Type 2 Flow ID TLV is recommended to
identify a flow in SFC NSH. The document makes use of this method.

>
> Further, is this describing a Hybrid OAM use?
>
GIM>> No, the document does not describe the use of hybrid OAM (per RFC
7799).

>
>    *  O bit set and the "Next Protocol" value matches one of identifying
>       active or hybrid OAM protocols:
>
>          - the payload that immediately follows the NSH MUST contain an
>          OAM command or data.
>
> This is also unclear — what is an OAM command or data? If the O-bit is
> set, it is an OAM packet.
>
GIM>> What is an OAM packet? Is an SFC NSH packet with IOAM an OAM packet
or not? If an SFC NSH packet is part of a flow monitored with the Alternate
Marking method, is it an OAM packet because Alternate Marking is an example
of hybrid OAM?

>
>    *  O bit is clear:
>
>          - no OAM in a Fixed-Length Context Header or Variable-Length
>          Context Header(s).
>
>          - the payload determined by the "Next Protocol" field MUST be
>          present.
>
> The rationale for this is unclear.
>
GIM>> Can you please clarify your interpretation, so we can look for ways
to improve the text?

>
>    *  O bit is clear, and the "Next Protocol" field identifies active or
>       hybrid OAM protocol MUST be identified and reported as an
>       erroneous combination.  An implementation MAY have control to
>       enable processing of the OAM payload.
>
> This seems to break the existing usage in draft-ietf-sfc-ioam-nsh. Section
> 4.2 of draft-ietf-sfc-ioam-nsh says clearly:
>
GIM>> I don't see any problem. In fact, both definitions are in sync.
According to draft-ietf-sfc-ioam-nsh, if the Next Protocol field identifies
a user data payload, e.g., IPv6, then the O bit MUST NOT be set. If the Next
Protocol is set to IOAM, then the O bit MUST be set. We agree on how the O
bit works both when IOAM accompanies user data and when it does not.
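
For illustration only, the four O bit / Next Protocol combinations quoted
in this part of the thread could be sketched as below. This is not text
from the draft: the numeric value standing in for TBA1 is hypothetical
(IANA has not assigned it), and the enum and function names are made up for
the example.

```python
from enum import Enum

# Hypothetical placeholder: TBA1 (Active SFC OAM) is not yet assigned.
NEXT_PROTO_ACTIVE_OAM = 0xFF
# Next Protocol values that identify active or hybrid OAM protocols.
OAM_NEXT_PROTOCOLS = frozenset({NEXT_PROTO_ACTIVE_OAM})


class Interpretation(Enum):
    OAM_IN_CONTEXT_HEADERS = "OAM in Context Header(s); payload per Next Protocol"
    OAM_PAYLOAD = "payload following the NSH carries an OAM command or data"
    USER_PAYLOAD = "no OAM in Context Headers; payload per Next Protocol"
    ERRONEOUS = "erroneous combination; identify and report"


def interpret(o_bit: bool, next_protocol: int) -> Interpretation:
    """Map the O bit and Next Protocol field to one of the four quoted cases."""
    is_oam_protocol = next_protocol in OAM_NEXT_PROTOCOLS
    if o_bit and is_oam_protocol:
        return Interpretation.OAM_PAYLOAD
    if o_bit:
        return Interpretation.OAM_IN_CONTEXT_HEADERS
    if is_oam_protocol:
        # O bit clear but Next Protocol says OAM: report as an error.
        return Interpretation.ERRONEOUS
    return Interpretation.USER_PAYLOAD
```

With 0x02 being the NSH Next Protocol value for IPv6 in RFC 8300,
interpret(False, 0x02) yields USER_PAYLOAD and interpret(True, 0x02) yields
OAM_IN_CONTEXT_HEADERS. The optional local control that enables processing
of the erroneous combination is left out of the sketch.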

>
> 4.2.  IOAM and the use of the NSH O-bit
>
>    [RFC8300] defines an "O bit" for OAM packets.  Per [RFC8300] the O
>    bit must be set for OAM packets and must not be set for non-OAM
>    packets.  Packets with IOAM data included MUST follow this
>    definition, i.e. the O bit MUST NOT be set for regular customer
>    traffic which also carries IOAM data and the O bit MUST be set for
>    OAM packets which carry only IOAM data without any regular data
>    payload.
>
>
>
> 5.  Active SFC OAM Header
>
>    As demonstrated in Section 4 [RFC8924] and Section 3 of this
>    document, SFC OAM is required to perform multiple tasks.  Several
>    active OAM protocols could be used to address all the requirements.
>    When IP/UDP encapsulation of an SFC OAM control message is used,
>    protocols can be demultiplexed using the destination UDP port number.
>    But extra IP/UDP headers, especially in an IPv6 network, add
>    noticeable overhead.  This document defines Active OAM Header
>    (Figure 2) to demultiplex active OAM protocols on an SFC.
>
>
> Does this paragraph imply that the main reason for this protocol is this
> perceived overhead? If so, experience seems to show that in practice
> IP-encapsulated OAM works fine (e.g., LSP Ping).
>
GIM>> Isn't IP/UDP encapsulation, with IPv6 in particular, a larger
overhead?
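
For reference, the fixed header sizes behind that question can be tallied
as below. The figures are the minimum IPv4 header (no options), the fixed
IPv6 header (no extension headers), and the UDP header. The function name
is made up for the example, and the size of the Active OAM Header is
deliberately left out because it is not stated in the text quoted here.

```python
IPV4_MIN_HEADER = 20    # octets, IPv4 header without options
IPV6_FIXED_HEADER = 40  # octets, IPv6 header without extension headers
UDP_HEADER = 8          # octets


def ip_udp_overhead(ip_version: int) -> int:
    """Return the minimum IP/UDP encapsulation overhead in octets."""
    if ip_version == 4:
        return IPV4_MIN_HEADER + UDP_HEADER
    if ip_version == 6:
        return IPV6_FIXED_HEADER + UDP_HEADER
    raise ValueError(f"unsupported IP version: {ip_version}")
```

So every IP/UDP-encapsulated OAM message carries at least 28 octets of
extra headers over IPv4 and at least 48 octets over IPv6.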

>
> Alternatively, “Next Protocols” could be defined for “raw” existing
> protocols.
>
>       Msg Type - six bits long field identifies OAM protocol, e.g., Echo
>       Request/Reply or Bidirectional Forwarding Detection.
>
>
> Why does BFD get encapsulated in this new protocol, as opposed to using a
> “Next Protocol” for it? That looks like unnecessary overhead and
> indirection.
>
GIM>> Are you proposing assigning different Next Protocol values for every
possible active OAM protocol?

>
>       Flags - eight bits long field carries bit flags that define
>       optional capability and thus processing of the SFC active OAM
>       control packet, e.g., optional timestamping.
>
> Does this timestamp conflict with context header timestamps? E.g., rfc8592
> or draft-mymb-sfc-nsh-allocation-timestamp.
>
GIM>> What do you see as a potential conflict?

>
> 6.  Echo Request/Echo Reply for SFC
>
>    Echo Request/Reply is a well-known active OAM mechanism extensively
>    used to verify a path's continuity, detect inconsistencies between a
>    state in control and the data planes, and localize defects in the
>    data plane.  ICMP ([RFC0792] for IPv4 and [RFC4443] for IPv6
>    networks, respectively) and [RFC8029] are examples of broadly used
>    active OAM protocols based on the Echo Request/Reply principle.  The
>    SFC Echo Request/Reply defined in this document addresses several
>    requirements listed in Section 3.  Specifically, it can be used to
>    check the continuity of an SFP, trace an SFP, or localize the failure
>    within an SFP.  The SFC Echo Request/Reply control message format is
>    presented in Figure 3.
>
>
> This seems to be an important paragraph — would be useful to also
> understand how other existing and broadly used protocols cannot fulfill
> requirements.
>
GIM>> RFC 8924 already provided a comprehensive analysis and concluded that
none of the available tools can fully conform to the requirements listed in
Section 4.

>
>       Length - two-octet-long field equal to the Value field's length in
>       octets.
>
>
> There are several nested lengths defined in this document — would be
> useful to analyze that they do not result in issues such as piggybacking
> unaccounted data.
>
GIM>> Do you see any scenario in which that might be the case?
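
For what it is worth, a receiver can rule out unaccounted data by checking
every nested length against the enclosing one. The sketch below assumes a
4-octet TLV header (one-octet Type, one reserved octet, two-octet Length);
only the two-octet Length, equal to the Value field's length in octets, is
taken from the text quoted above, and the rest of the layout and the
function name are assumptions made for the example.

```python
import struct
from typing import List, Tuple

# Assumed TLV header layout: Type (1 octet), Reserved (1), Length (2).
TLV_HEADER = struct.Struct("!BBH")


def parse_tlvs(buf: bytes) -> List[Tuple[int, bytes]]:
    """Parse a run of TLVs, rejecting any octet not covered by a Length."""
    tlvs = []
    offset = 0
    while offset < len(buf):
        if len(buf) - offset < TLV_HEADER.size:
            # Leftover octets too short to be a TLV: unaccounted data.
            raise ValueError("trailing octets after the last TLV")
        tlv_type, _reserved, length = TLV_HEADER.unpack_from(buf, offset)
        offset += TLV_HEADER.size
        if length > len(buf) - offset:
            raise ValueError("TLV Length exceeds the enclosing length")
        tlvs.append((tlv_type, buf[offset:offset + length]))
        offset += length
    return tlvs
```

The same function can be applied recursively to the Value of a TLV that
carries sub-TLVs, with the outer Length as the bound, so that no octet can
ride along outside the declared lengths.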

>
> 6.3.1.  Source TLV
>
>    Responder to the SFC Echo Request encapsulates the SFC Echo Reply
>    message in IP/UDP packet if the Reply mode is "Reply via an IPv4/IPv6
>    UDP Packet".  Because the NSH does not identify the ingress node that
>    generated the Echo Request, the source ID MUST be included in the
>    message and used as the IP destination address and destination UDP
>    port number of the SFC Echo Reply.  The sender of the SFC Echo
>    Request MUST include an SFC Source TLV (Figure 5).
>
>
> This seems to negate the benefit of less overhead, if the IP/UDP fields
> are embedded as OAM TLVs.
>
GIM>> Only the Source ID is required, not the whole set of IP and UDP
headers.

>
> This also seems to be a bit of an invitation for an attack.
>
>
> 6.4.1.  Errored TLVs TLV
>
>
> I wonder at this point if it is easier to use LSP Ping directly instead of
> re-define it.
>
GIM>> If someone wants to explore that option, of course.

>
> 6.5.1.  SFC Reply Path TLV
>
> …
>
>    *  Service Index: the value for the Service Index field in the NSH of
>       the SFC Echo Reply message.
>
> How is the service index in a reply constructed?
>
GIM>> It is provided by the sender of the SFC Echo Request.

>
>
> 6.5.3.  SFC Echo Reply Reception
>
>    An SFF SHOULD NOT accept SFC Echo Reply unless the received message
>    passes the following checks:
>
>    *  the received SFC Echo Reply is well-formed;
>
>    *  it has an outstanding SFC Echo Request sent from the UDP port that
>       matches destination UDP port number of the received packet;
>
>
> Is the demultiplexing based on UDP, OAM handle, or combination?
>
GIM>> The values of the Sender's Handle and Sequence Number fields can be
used.
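
A minimal sketch of such matching, assuming the sender keeps a table of
outstanding requests keyed by the (Sender's Handle, Sequence Number) pair
and also applies the UDP port check quoted above. The class and method
names are hypothetical, and the well-formedness check is assumed to happen
before this step.

```python
from typing import Dict, Tuple


class OutstandingRequests:
    """Track sent Echo Requests and match received Echo Replies to them."""

    def __init__(self) -> None:
        # (Sender's Handle, Sequence Number) -> source UDP port of the request
        self._pending: Dict[Tuple[int, int], int] = {}

    def sent(self, handle: int, seq: int, src_port: int) -> None:
        self._pending[(handle, seq)] = src_port

    def accept_reply(self, handle: int, seq: int, dst_port: int) -> bool:
        """Accept a reply only if it matches an outstanding request."""
        key = (handle, seq)
        # A missing key yields None, which never equals a port number.
        if self._pending.get(key) != dst_port:
            return False
        del self._pending[key]  # each request is answered at most once
        return True
```

A reply with an unknown handle or sequence number, or one arriving on a
different UDP port than the request was sent from, is not accepted, and a
duplicate of an already accepted reply is rejected as well.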

>
> 6.6.  Verification of the SFP Consistency
>
>    *  Collect information of the traversed by the CVReq packet SFs and
>       send it to the ingress SFF as CVRep packet over IP network;
>
>
> What if NSH is not over IP?
>
GIM>> Then the operator will specify another method using the Reply mode.

>
>    SF Type: Two octets long field.  It is defined in [RFC9015] and
>    indicates the type of SF, e.g., Firewall, Deep Packet Inspection, WAN
>    optimization controller, etc.
>
>
> Is RFC 9015 a hard dependency to implement this OAM?
>
GIM>> RFC 9015 established the IANA registry of SF Types, and any new SF
type must be registered there.

>
>    IANA is requested to assign a new type from the SFC Active OAM
>    Message Type sub-registry as follows:
>
>           +=======+=============================+===============+
>           | Value |         Description         | Reference     |
>           +=======+=============================+===============+
>           | TBA2  | SFC Echo Request/Echo Reply | This document |
>           +-------+-----------------------------+---------------+
>
>
> Is there a single value for both Request and Reply?
>
GIM>> Yes, it is a single value. Echo Request and Echo Reply are identified
in the Message Type field (Figure 3).

>
> 9.2.1.  Version in the Active SFC OAM Header
>
> 9.3.1.  SFC Echo Request/Reply Version
>
>
> There seems to be a version for the OAM and a version for the msg type. Is
> this correct? Are they hierarchical versions? Or independent?
> This seems to overly complicate parsing and compliance.
>
GIM>> All versions are independent.

>
> 9.3.3.  SFC Echo Request/Echo Reply Message Types
>
> Does this mean that there’s a protocol number for “Active OAM” with a
> protocol number for “Request/Reply” with a protocol number for either
> request or reply?
>
GIM>> Not all of these are protocol numbers. Only Active OAM is a new
protocol number; the others are message types.

>
>    Values defined for the Return Codes sub-registry are listed in
>    Table 14.
>
>
> Various values in this table are not defined in the document. The
> procedures seem lacking.
>
GIM>> Other specifications may define additional code points in the
registry.

>
> 9.7.  SF Identifier Types
>
> This document seems to be creating a space for identifying SFs — which I
> thought was mostly outside the scope of OAM to test SFs.
>
GIM>> The registry is of SF Identifiers, not of SF Types (that already
exists). I hope that clarifies the issue.

>
> Does this further imply that there’s a new requirement to have unique
> identifiers within the domain for all SFs?
>
> I hope these comments and review questions and concerns are useful for the
> WG discussion and consideration.
>
> Thanks,
>
> Carlos.
>
>
> On Nov 1, 2021, at 2:50 PM, Joel Halpern Direct <jmh.direct@joelhalpern.com> wrote:
>
> I have received a polite request with explanation for delay asking for
> more time to read and review the subject document.  Given the state of the
> working group, I want to encourage any and all review.  So I am extending
> the last call by two additional weeks.
>
> Please read and review the document.
> Also, if you are willing to serve as shepherd for this, please let the
> chairs know.  (Don't worry if you have not shepherded a document before.
> The chairs are more than happy to help you with the process.)
>
> Thank you,
> Joel
>