Re: [Tsv-art] Tsvart telechat review of draft-ietf-sfc-oam-framework-13

Joel Halpern Direct <jmh.direct@joelhalpern.com> Fri, 22 May 2020 15:17 UTC

To: "Frank Brockners (fbrockne)" <fbrockne@cisco.com>, "Nagendra Kumar Nainar (naikumar)" <naikumar@cisco.com>, "tsv-art@ietf.org" <tsv-art@ietf.org>
Cc: "sfc@ietf.org" <sfc@ietf.org>, "last-call@ietf.org" <last-call@ietf.org>, "draft-ietf-sfc-oam-framework.all@ietf.org" <draft-ietf-sfc-oam-framework.all@ietf.org>
References: <158861910132.5213.12389985411421411727@ietfa.amsl.com> <B12ACAA0-BFBC-40D6-85D2-A7E056027C68@cisco.com> <BYAPR11MB2584D5A59E020682810099EFDAB60@BYAPR11MB2584.namprd11.prod.outlook.com> <e83de1dc-1f39-6281-7687-b6dd52567685@joelhalpern.com> <BYAPR11MB258403E216B7396323CD836EDAB60@BYAPR11MB2584.namprd11.prod.outlook.com> <e2b4f43f-4732-7276-f449-f62c1b97c259@joelhalpern.com> <BYAPR11MB2584A602FA170F2D85D4DC67DAB40@BYAPR11MB2584.namprd11.prod.outlook.com> <17ce4af7-054e-226c-a97f-d37a2c9480f3@joelhalpern.com> <BYAPR11MB258416CFBC777A163B892BBCDAB40@BYAPR11MB2584.namprd11.prod.outlook.com>
From: Joel Halpern Direct <jmh.direct@joelhalpern.com>
Message-ID: <6d01e977-096c-b569-bc16-fca0f10ced76@joelhalpern.com>
Date: Fri, 22 May 2020 11:17:48 -0400
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101 Thunderbird/68.8.0
MIME-Version: 1.0
In-Reply-To: <BYAPR11MB258416CFBC777A163B892BBCDAB40@BYAPR11MB2584.namprd11.prod.outlook.com>
Content-Type: text/plain; charset="utf-8"; format="flowed"
Content-Language: en-US
Content-Transfer-Encoding: 8bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsv-art/_LvmePDgZS1PNnIhGXGyIlNPxa4>
Subject: Re: [Tsv-art] Tsvart telechat review of draft-ietf-sfc-oam-framework-13

Thanks, Frank, both for the review and for the discussion.
Yours,
Joel

On 5/22/2020 11:03 AM, Frank Brockners (fbrockne) wrote:
> Per what I mentioned below - the below is my personal take, taking the position of a reader, who might not be aware of WG structures, WG scopes etc. Personally, I don't think that a reader of the SFC OAM framework document would expect the SFC WG to address all of the dimensions of SFC OAM, even if the document lays them out.
> 
> That said, I do understand your concern about WG structures and scope. Looks like we (sadly) need to follow Conway's law ...
> So in general terms, I'm ok with Nagendra's approach.
> 
> Cheers, Frank
> 
> 
>> -----Original Message-----
>> From: Joel M. Halpern <jmh@joelhalpern.com>
>> Sent: Freitag, 22. Mai 2020 16:49
>> To: Frank Brockners (fbrockne) <fbrockne@cisco.com>; Nagendra Kumar Nainar
>> (naikumar) <naikumar@cisco.com>; tsv-art@ietf.org
>> Cc: sfc@ietf.org; last-call@ietf.org; draft-ietf-sfc-oam-framework.all@ietf.org
>> Subject: Re: Tsvart telechat review of draft-ietf-sfc-oam-framework-13
>>
>> My basic concern is that if we go down the path you outline, it would suggest to
>> the reader that the SFC working group expected to address the SF dimensions of
>> the problem.  I grant that they are real problems.  But the SFC working group has
>> neither the charter nor the expertise to address those problems.  As far as I can
>> tell, ART may have the expertise, and may not.  And has shown no interest in the
>> problem.  So I am very reluctant to put a lot of verbiage into the RFC on SF
>> monitoring.
>>
>> Given the above, can you live with Nagendra's text that you quote?
>>
>> Yours,
>> Joel
>>
>> On 5/22/2020 5:33 AM, Frank Brockners (fbrockne) wrote:
>>> Joel,
>>>
>>> Nagendra suggested "The task of evaluating the true availability of a
>>> Service Function is a complex activity, currently having no simple,
>>> unified solution.  There is currently no standard means of doing so.
>>> Any such mechanism would be far from a typical OAM function, so it is
>>> not explored as part of the analysis in Sections 4 and 5." in the
>>> related thread for discussion on availability.
>>> https://mailarchive.ietf.org/arch/msg/sfc/1ZsLw6m1OeJRJfHW2TygxOyKPJk/
>>>
>>> It could well be that my expectations for a framework document differ from
>> that of others. So please take this as a personal perspective.
>>> My expectation for a framework document is that it identifies and
>>> describes the components for a solution. It can optionally also hint
>>> at potential approaches to a solution (independently of whether these
>>> solutions are already in scope of the WG or not), but does not have to
>>> provide these solutions. The document does a good job at most of these
>>> - but misses out on those things, where we currently don't really have
>>> a solution available, which are questions like
>>> * How do we define availability for a SFC at service level? How do we define
>> availability for a SF?
>>> * How do we define performance for a SFC at service level? How do we define
>> performance for a SF?
>>> Right now the document focuses on those pieces of SF/SFC availability and
>> performance that are connectivity related ("Packets traverse it").
>>>
>>> A potential approach could be to decompose the problem, using the structure
>> the document lays out - and put this into the context of SLA definitions.
>>>
>>> Why can't we just say, e.g. for SFC performance, that SFC performance is a
>>> composite of
>>> * link layer performance,
>>> * underlay performance,
>>> * overlay performance,
>>> * service layer performance.
>>> A SFC OAM solution needs to consider the performance measures across all
>> those layers.
>>> Consequently, for the performance of a SF, one needs to consider the aspects
>> at each layer:
>>> * link layer - Example "how well is the SF connected at e.g. Ethernet
>>> layer?" -> example: Leverage ITU-T Y.1731
>>> * underlay - Example "how well is the SF connected at e.g. IP layer?"
>>> -> example: Leverage OWAMP/TWAMP
>>> * overlay - Example "how well is the SF connected at e.g. NSH layer?"
>>> * service layer - Example "how well does the SF meet the criteria of an SLA?"
>>> So e.g. 3.1.2 and 3.2.2 could explicitly expand on these aspects and state that,
>>> apart from connectivity-level measures (loss, throughput), service-level criteria,
>>> typically defined as part of an SLA, are included in SFC performance
>>> characterization. IMHO this is a better approach than either saying "SF availability
>>> is a hard problem" (which is what Nagendra says in the above statement - and
>>> which is of course true) or just defining the service aspects as out of scope for SFC
>>> OAM ("how clean are the clothes when returned from the laundry?").
>>>
>>> My 2cents..
>>>
>>> Cheers, Frank
>>>
>>>
>>>> -----Original Message-----
>>>> From: Joel M. Halpern <jmh@joelhalpern.com>
>>>> Sent: Mittwoch, 20. Mai 2020 20:17
>>>> To: Frank Brockners (fbrockne) <fbrockne@cisco.com>; Nagendra Kumar
>>>> Nainar
>>>> (naikumar) <naikumar@cisco.com>; tsv-art@ietf.org
>>>> Cc: sfc@ietf.org; last-call@ietf.org;
>>>> draft-ietf-sfc-oam-framework.all@ietf.org
>>>> Subject: Re: Tsvart telechat review of
>>>> draft-ietf-sfc-oam-framework-13
>>>>
>>>> So the question now is whether the text Murray suggested suffices for
>>>> you?  (We are still waiting to hear from Alvaro.)
>>>>
>>>> Yours,
>>>> Joel
>>>>
>>>> On 5/20/2020 1:41 PM, Frank Brockners (fbrockne) wrote:
>>>>> Thanks Joel. Per what I mentioned below, let's be clear that SF
>>>>> performance is
>>>> out of scope for the doc.
>>>>> And I think this was Alvaro's point as well.
>>>>>
>>>>> Cheers, Frank
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Joel M. Halpern <jmh@joelhalpern.com>
>>>>>> Sent: Mittwoch, 20. Mai 2020 19:21
>>>>>> To: Frank Brockners (fbrockne) <fbrockne@cisco.com>; Nagendra Kumar
>>>>>> Nainar
>>>>>> (naikumar) <naikumar@cisco.com>; tsv-art@ietf.org
>>>>>> Cc: sfc@ietf.org; last-call@ietf.org;
>>>>>> draft-ietf-sfc-oam-framework.all@ietf.org
>>>>>> Subject: Re: Tsvart telechat review of
>>>>>> draft-ietf-sfc-oam-framework-13
>>>>>>
>>>>>> Frank, regarding your comment about SF performance, I thought the
>>>>>> document was pretty clear that we consider that out of scope (c.f.
>>>>>> the discussions with the various ADs.)
>>>>>>
>>>>>> If you can see a place to add text, please propose text.
>>>>>>
>>>>>> Thank you,
>>>>>> Joel
>>>>>>
>>>>>> On 5/20/2020 1:10 PM, Frank Brockners (fbrockne) wrote:
>>>>>>> Hi Nagendra,
>>>>>>>
>>>>>>> Thanks for the detailed reply. Please see inline (..FB).
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Nagendra Kumar Nainar (naikumar) <naikumar@cisco.com>
>>>>>>>> Sent: Samstag, 16. Mai 2020 16:16
>>>>>>>> To: Frank Brockners (fbrockne) <fbrockne@cisco.com>;
>>>>>>>> tsv-art@ietf.org
>>>>>>>> Cc: sfc@ietf.org; last-call@ietf.org;
>>>>>>>> draft-ietf-sfc-oam-framework.all@ietf.org
>>>>>>>> Subject: Re: Tsvart telechat review of
>>>>>>>> draft-ietf-sfc-oam-framework-13
>>>>>>>>
>>>>>>>> Hi Frank,
>>>>>>>>
>>>>>>>> Thank you for the review. Please see inline for the response..
>>>>>>>>
>>>>>>>>
>>>>>>>>         Reviewer: Frank Brockners
>>>>>>>>         Review result: Ready with Nits
>>>>>>>>
>>>>>>>>         This document has been reviewed as part of the transport area
>>>>>>>>         review team's ongoing effort to review key IETF documents. These
>>>>>>>>         comments were written primarily for the transport area directors,
>>>>>>>>         but are copied to the document's authors and WG to allow them to
>>>>>>>>         address any issues raised and also to the IETF discussion list
>>>>>>>>         for information.
>>>>>>>>
>>>>>>>>         When done at the time of IETF Last Call, the authors should
>>>>>>>>         consider this review as part of the last-call comments they
>>>>>>>>         receive. Please always CC tsv-art@ietf.org if you reply to or
>>>>>>>>         forward this review.
>>>>>>>>
>>>>>>>>         This document provides a reference framework for OAM for SFC.
>>>>>>>>
>>>>>>>>         Comments:
>>>>>>>>
>>>>>>>>         Section 3.1.1 SF availability: The text makes explicit
>>>>>>>> reference to
>>>> multiple
>>>>>>>>         instances of a SF. Consequently, it should be defined how
>>>>>>>> availability of a
>>>>>> SF
>>>>>>>>         is computed/determined in case multiple instances are deployed.
>>>>>>>>
>>>>>>>> <Nagendra> This is already clarified in the section as below:
>>>>>>>>
>>>>>>>> "For cases where
>>>>>>>>        multiple instances of an SF are used to realize a given SF for the
>>>>>>>>        purpose of load sharing, SF availability can be performed by checking
>>>>>>>>        the availability of any one of those instances, or the availability
>>>>>>>>        check may be targeted at a specific instance."
>>>>>>>>
>>>>>>>> This further
>>>>>>>>         leads to the question, whether availability is always a "binary" state
>>>>>>>>         (available / not-available), or could a SF be e.g. 99% available?
>>>>>>>>
>>>>>>>> <Nagendra> The availability is measured as a binary state. I am not
>>>>>>>> sure what 99% available means. If it means getting 99 responses for
>>>>>>>> 100 probes sent, I think it falls under the packet loss category,
>>>>>>>> which in turn is performance measurement.
>>>>>>>
>>>>>>> ...FB: Thanks. Though I'm still not entirely following. If
>>>>>>> availability is binary and I put the statements above together,
>>>>>>> what would be the availability of the following setup: There is
>>>>>>> an SF that is made up of 100 instances. 99 of these instances are
>>>>>>> powered down entirely. And the 1 instance that is "up" is
>>>>>>> alternating between servicing requests for 10min followed by not
>>>>>>> servicing requests for 10min. Would the SF be considered "available"?
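
[Editorial illustration, not part of the original message.] The 100-instance scenario above can be made concrete with a minimal sketch. It assumes a hypothetical probe log per instance (one boolean per probe interval) and contrasts a binary "any one instance answers" check, as described in Section 3.1.1 of the draft, with a simple ratio of answered probes. The names `Instance`, `binary_available_at`, and `answered_ratio` are invented for this example and do not come from the draft or any SFC OAM tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Instance:
    # Hypothetical probe log: True if the instance answered the probe
    # sent in that interval, False otherwise.
    probe_results: List[bool]


def binary_available_at(instances: List[Instance], interval: int) -> bool:
    """Binary check: the SF counts as available in an interval
    if any one instance answered its probe in that interval."""
    return any(inst.probe_results[interval] for inst in instances)


def answered_ratio(instances: List[Instance]) -> float:
    """Fraction of all probes, across all instances, that were answered."""
    total = sum(len(inst.probe_results) for inst in instances)
    answered = sum(sum(inst.probe_results) for inst in instances)
    return answered / total if total else 0.0


# 99 instances powered down, 1 instance alternating between up and down.
powered_down = [Instance([False] * 6) for _ in range(99)]
flapping = Instance([True, False, True, False, True, False])
sf = powered_down + [flapping]

print(binary_available_at(sf, 0))  # interval where the one instance is up
print(binary_available_at(sf, 1))  # interval where it is down
print(answered_ratio(sf))
```

Under these assumptions the binary result flips from interval to interval, while the ratio stays very low throughout. That gap is what the question is pointing at.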
>>>>>>>
>>>>>>>>
>>>>>>>> Section 3.1.2
>>>>>>>>         SF performance: What is the impact of a "multiple instance
>>>>>>>> SF deployment" on SF
>>>>>>>>         performance measurement?
>>>>>>>>
>>>>>>>> <Nagendra>I think we covered this in SF availability but not here.
>>>>>>>> Does the below updated text look better?
>>>>>>>>
>>>>>>>> OLD:
>>>>>>>> On the one hand, the performance of any specific SF can be quantified
>>>>>>>>        by measuring the loss and delay metrics of the traffic from SFF to
>>>>>>>>        the respective SF, while on the other hand, the performance can be
>>>>>>>>        measured by leveraging the loss and delay metrics from the
>> respective
>>>>>>>>        SFs.  The latter requires SF involvement to perform the measurement
>>>>>>>>        while the former does not.
>>>>>>>>
>>>>>>>> NEW:
>>>>>>>> On the one hand, the performance of any specific SF can be quantified
>>>>>>>>        by measuring the loss and delay metrics of the traffic from SFF to
>>>>>>>>        the respective SF, while on the other hand, the performance can be
>>>>>>>>        measured by leveraging the loss and delay metrics from the
>> respective
>>>>>>>>        SFs.  The latter requires SF involvement to perform the measurement
>>>>>>>>        while the former does not. For cases where
>>>>>>>>        multiple instances of an SF are used to realize a given SF for the
>>>>>>>>        purpose of load sharing, SF performance can be quantified by
>> measuring
>>>>>>>>        the metrics for any one instance of SF or by measuring the metrics
>> for
>>>>>>>>        a specific instance.
>>>>>>>>
>>>>>>>> The section only talks about loss and delay as
>>>>>>>>         performance criteria. It would be good to state that other
>>>>>>>> performance criteria
>>>>>>>>         (e.g. specific to the SF, throughput, etc.) exist.
>>>>>>>>
>>>>>>>> <Nagendra> We can add the below to Section 3.1.2:
>>>>>>>>
>>>>>>>> NEW:
>>>>>>>> "The metrics measured to quantify the performance of the SF
>>>>>>>> component are not limited to loss and delay. Other metrics,
>>>>>>>> such as throughput, also exist, and the choice of metrics for
>>>>>>>> performance measurement is outside the scope of this document."
>>>>>>>>
>>>>>>>> Section 3.2.1 SFC
>>>>>>>>         availability: The current definition is very focused on connectivity
>>>>>>>>         verification, i.e. it tries to answer the question: "Does my SFC
>> transport
>>>>>>>>         packets?". IMHO we should also ask the question "Does my
>>>>>>>> SFC process
>>>>>> the
>>>>>>>>         packets correctly?" - because if packets are not processed per the
>> SFC
>>>>>>>>         definition, we might not call the SFC available.
>>>>>>>>
>>>>>>>> <Nagendra> I think this is already handled by SF availability.
>>>>>>>> The end-to-end SFC availability is verified by steering the OAM
>>>>>>>> packet over the ordered set of SFs within the SFC. This is more
>>>>>>>> like daisy chaining the availability of SFs within the SFC to
>>>>>>>> determine end-to-end SFC availability. If the derived solution
>>>>>>>> verifies the SF availability not just based on the uptime but
>>>>>>>> based on the service treatment, it also answers the question
>>>>>>>> "Does my SFC process the packets
>>>>>> correctly". Let us know if there is any further clarity required.
>>>>>>>>
>>>>>>>> While 3.2.2 states that "any
>>>>>>>>         SFC-aware network device should have the ability to make
>> performance
>>>>>>>>         measurements" a similar statement isn't found in 3.2.1.
>>>>>>>> IMHO the ability
>>>>>> for
>>>>>>>>         availability checks is probably a prerequisite for
>>>>>>>> performance
>>>>>> measurement.
>>>>>>>>
>>>>>>>> <Nagendra> The ability to perform end-to-end or partial SFC
>>>>>>>> availability verification is already mentioned in section 3.2.1 as below:
>>>>>>>>
>>>>>>>> " In order to perform service connectivity verification of an SFC/SFP,
>>>>>>>>        the OAM functions could be initiated from any SFC-aware network
>>>>>>>>        devices of an SFC-enabled domain for end-to-end paths, or partial
>>>>>>>>        paths terminating on a specific SF, within the SFC/SFP"
>>>>>>>>
>>>>>>>> Please let us know if you have any suggestion to improve if there
>>>>>>>> is a lack of clarity.
>>>>>>>>
>>>>>>>>         Section 3.2.2 SFC performance measurement: The section
>>>>>>>> only mentions the need
>>>>>>>>         for performance measurement. It misses the definition of
>>>>>>>> what SFC performance
>>>>>>>>         measurement is.
>>>>>>>>
>>>>>>>> <Nagendra>
>>>>>>>
>>>>>>> ...FB: Thanks for the suggested updates, which would definitely
>>>>>>> improve the text. One problem about SFC performance remains though IMHO.
>>>>>>> All the text so far is focused on the connectivity within a SFC -
>>>>>>> not the service
>>>>>> itself. I.e. If you'd consider a "laundry service" - we focus a lot
>>>>>> on how long it takes to get the clothes shipped to and from the
>>>>>> washing machine, but we don't focus on how well the washing machine
>>>> washes the clothes.
>>>>>>> IMHO we should either expand on the performance of the SFC and SF
>>>>>>> wrt/ the
>>>>>> service (especially given that you define a service layer in
>>>>>> section
>>>>>> 2) - or clearly state that the framework would just focus on
>>>>>> connectivity
>>>> between SFs.
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Section 3.3. Classifier component: The section mentions the
>>>>>>>>         need for the ability to perform performance measurement of
>>>>>>>> the
>>>> classifier
>>>>>>>>         component. What is performance measurement of the classifier?
>>>>>>>> What
>>>>>> does
>>>>>>>>         performance measurement of the classifier component comprise?
>>>>>>>>
>>>>>>>> <Nagendra>We can add the below text:
>>>>>>>>
>>>>>>>> OLD:
>>>>>>>> Any SFC-aware network device should have the ability to perform
>>>>>>>>        performance measurement of the classifier component for each SFC.
>>>>>>>>
>>>>>>>> NEW:
>>>>>>>> Any SFC-aware network device should have the ability to perform
>>>>>>>>        performance measurement of the classifier component for each SFC.
>>>>>>>>         The performance can be quantified by measuring the
>>>>>>>> performance metrics of the
>>>>>>>>          traffic from the classifier for each SFC/SFP.
>>>>>>>>
>>>>>>>> Section 3.4. /
>>>>>>>>         3.5. Availability/PM of the underlay and overlay network:
>>>>>>>> It would be good
>>>>>> to
>>>>>>>>         add a sentence that states that the mechanisms for
>>>>>>>> availability/PM which
>>>>>> are
>>>>>>>>         offered by the technologies used by the overlay/underlay
>>>>>>>> are used, rather than
>>>>>>>>         new methods specifically for SFC would be defined.
>>>>>>>>
>>>>>>>> <Nagendra>Yes, that makes sense. Please check the below text:
>>>>>>>>
>>>>>>>> OLD:
>>>>>>>> Any SFC-aware network device may have the ability to perform
>>>>>>>>        availability check or performance measurement of the overlay
>> network.
>>>>>>>>
>>>>>>>> NEW:
>>>>>>>> Any SFC-aware network device may have the ability to perform
>>>>>>>>        availability check or performance measurement of the overlay
>> network.
>>>>>> Any
>>>>>>>>        existing OAM tools and techniques can be leveraged for this purpose.
>>>>>>>>
>>>>>>>> Section 4. SFC OAM
>>>>>>>>         Functions: It would be good, if examples in section 4
>>>>>>>> could also include
>>>>>> more
>>>>>>>>         "recent" methods such as OWAMP/TWAMP (RFC4656, RFC 5357).
>>>>>>>>
>>>>>>>> <Nagendra>
>>>>>>>>
>>>>>>>> OLD:
>>>>>>>> Delay within an SFC could be measured based on the time it takes for
>>>>>>>>        a packet to traverse the SFC from the ingress SFC node to the egress
>>>>>>>>        SFF.  As SFCs are unidirectional in nature, measurement of one-way
>>>>>>>>        delay [RFC7679] is important.  In order to measure one-way delay,
>>>>>>>>        time synchronization MUST be supported by means such as NTP, PTP,
>>>>>>>>        GPS, etc.
>>>>>>>>
>>>>>>>> NEW:
>>>>>>>> Delay within an SFC could be measured based on the time it takes for
>>>>>>>>        a packet to traverse the SFC from the ingress SFC node to the egress
>>>>>>>>        SFF.  Measurement protocols such as One-way Active Measurement
>>>>>>>>         Protocol (OWAMP) [RFC4656], Two-way Active Measurement
>> Protocol
>>>>>>>>        (TWAMP) [RFC5357] can be used to measure the characteristics. As
>>>>>>>>        SFCs are unidirectional in nature, measurement of one-way
>>>>>>>>        delay [RFC7679] is important.  In order to measure one-way delay,
>>>>>>>>        time synchronization MUST be supported by means such as
>>>>>>>> NTP, Precision Time Protocol (PTP),
>>>>>>>>        GPS, etc.
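
[Editorial illustration, not part of the original message.] The text above requires time synchronization for one-way delay. A minimal sketch, using made-up numbers, shows why: the one-way delay is the receive timestamp (read on the receiver's clock) minus the send timestamp (read on the sender's clock), so any offset between the two clocks appears directly as measurement error. The function name and constants are hypothetical and are not taken from OWAMP, TWAMP, or the draft.

```python
def one_way_delay(tx_time_sender_clock: float, rx_time_receiver_clock: float) -> float:
    """One-way delay as seen by the measurement: receive timestamp
    minus send timestamp, each read from a different clock."""
    return rx_time_receiver_clock - tx_time_sender_clock


TRUE_DELAY = 0.004    # seconds the packet really spent in transit (assumed)
CLOCK_OFFSET = 0.010  # receiver clock runs this far ahead of the sender (assumed)

tx = 100.0                            # send time on the sender's clock
rx = tx + TRUE_DELAY + CLOCK_OFFSET   # arrival time on the receiver's clock

measured = one_way_delay(tx, rx)
error = measured - TRUE_DELAY

print(f"measured one-way delay: {measured:.6f} s")
print(f"error caused by clock offset: {error:.6f} s")
```

In this sketch the error equals the clock offset, which is why the one-way measurement is only as good as the synchronization between the two endpoints.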
>>>>>>>>
>>>>>>>> Section 4.4.
>>>>>>>>         Performance Measurement: Focus is entirely on the PM of
>>>>>>>> the
>>>>>> connectivity,
>>>>>>>>         rather than on the SF. How about covering PM for the SF as well?
>>>>>>>>
>>>>>>>> <Nagendra> I am not sure I understand what is missing. Do you
>>>>>>>> have any suggestion for the text improvement?
>>>>>>>
>>>>>>> ...FB: See above. This would be about adding a capability to
>>>>>>> assess how well
>>>>>> the washing machine washes my laundry.
>>>>>>>
>>>>>>>>
>>>>>>>> Section 5.1
>>>>>>>>         OAM Tool Gap Analysis:
>>>>>>>>          - Not sure what "NVo3 OAM" refers to. Could that be
>>>>>>>> explained below the table
>>>>>>>>          and in section 1.2.1?
>>>>>>>>
>>>>>>>> <Nagendra> Combining this with the other queries below, as they
>>>>>>>> appear to be related.
>>>>>>>>
>>>>>>>> - E-OAM needs to be detailed. It seems that CFM
>>>>>>>>          (802.1ag) and not 802.3ah is referred to here.
>>>>>>>>
>>>>>>>> <Nagendra> Per my understanding, 802.3ah is 1-hop while 802.1ag
>>>>>>>> can be more than 1 hop, and both use Ethernet frames. So I think
>>>>>>>> both are applicable here.
>>>>>>>> My response regarding E-OAM details in this section is combined below.
>>>>>>>
>>>>>>> ...FB: Maybe I missed it - but I don't see text that refers to CFM
>>>>>>> or EFM
>>>> OAM.
>>>>>> Where is this covered? IMHO we would need references to IEEE
>>>>>> standards to avoid confusion.
>>>>>>>
>>>>>>>>
>>>>>>>> - "Trace" in the "Trace" column
>>>>>>>>          need to be extended on. Is this traceroute? Paris-Traceroute?
>>>>>>>> IOAM- Loopback?
>>>>>>>>
>>>>>>>>          IPPM needs to be detailed, because IPPM is not a tool as
>>>>>>>> such but an IETF WG.
>>>>>>>>          Does this refer to OWAMP/TWAMP/etc. as defined by IPPM?
>>>>>>>>
>>>>>>>> <Nagendra> Combining the above queries.
>>>>>>>>
>>>>>>>> OLD:
>>>>>>>> There are various OAM tool sets available to perform OAM functions
>>>>>>>>        within various layers.  These OAM functions may be used to validate
>>>>>>>>        some of the underlay and overlay networks.  Tools like ping and trace
>>>>>>>>        are in existence to perform connectivity check and tracing of
>>>>>>>>        intermediate hops in a network.  These tools support different
>>>>>>>>        network types like IP, MPLS, TRILL, etc.  There is also an effort to
>>>>>>>>        extend the tool set to provide connectivity and continuity checks
>>>>>>>>        within overlay networks.  BFD is another tool which helps in
>>>>>>>>        detecting data forwarding failures.  Table 3 below is not
>>>>>>>> exhaustive
>>>>>>>>
>>>>>>>> NEW:
>>>>>>>> There are various OAM tool sets available to perform OAM functions
>>>>>>>>        within various layers.  These OAM functions may be used to validate
>>>>>>>>        some of the underlay and overlay networks.  Tools like ping and trace
>>>>>>>>        are used to perform connectivity check and tracing of
>>>>>>>>        intermediate hops in a network.  These tools are already available for
>>>>>>>>        different types of networks such as IP, MPLS, TRILL, etc.
>>>>>>>>
>>>>>>>> E-OAM offers OAM mechanisms such as an Ethernet continuity check
>>>>>>>> for Ethernet links. There is an effort around NVO3 OAM to provide
>>>>>>>> connectivity and continuity checks for networks that use NVO3.
>>>>>>>> BFD is used for the detection of data plane forwarding failures.
>>>>>>>
>>>>>>> ...FB: Check whether the NVO3 WG will indeed deliver a solution and
>>>>>>> "NVO3 OAM" indeed exists. If in doubt, it might be better to avoid
>>>>>>> forward-looking references. Per my note above, it would be good to
>>>>>>> explicitly refer to IEEE standards as opposed to introducing a new
>>>>>>> term like "E-OAM".
>>>>>>>
>>>>>>>>
>>>>>>>> The IPPM framework [RFC2330] offers tools such as OWAMP
>>>>>>>> [RFC4656] and TWAMP [RFC5357] (collectively referred to as IPPM in
>>>>>>>> this section) to measure various performance metrics. MPLS Packet
>>>>>>>> Loss Measurement (LM) and Packet Delay Measurement (DM)
>>>>>>>> (collectively referred to as MPLS_PM in this section) [RFC6374]
>>>>>>>> offer the ability to measure performance metrics in MPLS networks.
>>>>>>>>
>>>>>>>> Table 3 below is not exhaustive.
>>>>>>>>
>>>>>>>> Section 6.4.3 IOAM:
>>>>>>>>         - The section states that IOAM "may be used to perform
>>>>>>>> various SFC
>>>> OAM
>>>>>>>>         functions as well". It would be good to expand on this statement:
>> E.g.
>>>>>> IOAM
>>>>>>>>         Trace-Option Type could be leveraged for SFC tracing. IOAM
>>>>>>>> Direct-Export Option
>>>>>>>>         Type could be leveraged. - How would we deal with the IOAM
>>>>>>>> Active
>>>> Flag
>>>>>>>>         (draft-ietf-ippm-ioam-flags-01) when used with SFC OAM?
>>>>>>>>
>>>>>>>> <Nagendra> The intention of the section is to highlight the
>>>>>>>> applicability of different OAM toolsets for OAM functions at
>>>>>>>> service layer. I am not sure if we really should try explaining
>>>>>>>> all the possible options within each tool. But I agree that it is
>>>>>>>> worth clarifying the availability of IOAM options for tracing.
>>>>>>>> I think we can clarify that different IOAM Option-Types are
>>>>>>>> available for OAM functions such as SFC tracing. Can you check
>>>>>>>> if the below looks ok?
>>>>>>>>
>>>>>>>> OLD:
>>>>>>>> [I-D.ietf-sfc-ioam-nsh] defines how In-Situ OAM data fields are
>>>>>>>>        transported using NSH header.  [I-D.ietf-sfc-proof-of-transit]
>>>>>>>>        defines a mechanism to perform proof of transit to securely verify if
>>>>>>>>        a packet traversed the relevant SFP or SFC.  While the mechanism is
>>>>>>>>        defined inband (i.e., it will be included in data packets), it may be
>>>>>>>>        used to perform various SFC OAM functions as well.
>>>>>>>>
>>>>>>>> NEW:
>>>>>>>> [I-D.ietf-sfc-ioam-nsh] defines how In-Situ OAM data fields are
>>>>>>>>        transported using NSH header.  [I-D.ietf-sfc-proof-of-transit]
>>>>>>>>        defines a mechanism to perform proof of transit to securely verify if
>>>>>>>>        a packet traversed the relevant SFP or SFC.  While the mechanism is
>>>>>>>>        defined inband (i.e., it will be included in data packets),
>>>>>>>>        IOAM Option-Types such as IOAM Trace Option-Types can also be
>>>>>>>>        used to perform other SFC OAM functions such as SFC tracing.
>>>>>>>>
>>>>>>>> - The text states
>>>>>>>>         "In-Situ OAM could be used with O bit set": Why would IOAM
>>>>>>>> be used with
>>>>>> the
>>>>>>>>         overflow bit set for SFC OAM? For details on IOAM's O-bit,
>>>>>>>> see section
>>>>>> 4.4.1 in
>>>>>>>>         https://tools.ietf.org/html/draft-ietf-ippm-ioam-data-09.
>>>>>>>>
>>>>>>>> <Nagendra> The O bit referred here is not the O bit in IOAM but
>>>>>>>> the one in NSH/Overlay header. To avoid any confusion, this can
>>>>>>>> be updated as
>>>>>> below:
>>>>>>>>
>>>>>>>> OLD:
>>>>>>>> In-Situ OAM could be used with O bit set to perform SF availability
>>>>>>>>        and SFC availability or performance measurement.
>>>>>>>>
>>>>>>>> NEW:
>>>>>>>> In-Situ OAM could be used with O bit in the overlay header set,
>>>>>>>> to perform SF availability
>>>>>>>>        and SFC availability or performance measurement.
>>>>>>>
>>>>>>> ... FB: Ah, ok. Given that this section is about IOAM and not NSH,
>>>>>>> I'd rather explicitly refer to NSH here. E.g. if SFC is realized
>>>>>>> using NSH, then the O-bit in the NSH header could be used to
>>>>>>> indicate OAM traffic. You could refer to
>>>>>>> https://tools.ietf.org/html/draft-ietf-sfc-ioam-nsh-03#section-4.2
>>>>>>> explicitly.
>>>>>>>
>>>>>>>>
>>>>>>>> Section 6.4.4 SFC
>>>>>>>>         Traceroute: - This section refers to an expired draft
>>>>>>>>         (even calling out the fact that the draft has expired), but
>>>>>>>>         also mentions that functionality is available and implemented
>>>>>>>>         in OpenDaylight. Consider removing the references to the
>>>>>>>>         expired draft and rather add references to OpenDaylight
>>>>>>>>         documents. - IOAM Loopback (see draft-ietf-ippm-ioam-flags-01)
>>>>>>>>         could apply to SFC Traceroute as well.
>>>>>>>>
>>>>>>>> <Nagendra>Ok. Let me check if I can find some reference for ODL.
>>>>>>>>
>>>>>>>>         Detailed set of nits that I encountered while reading
>>>>>>>> through the document ([x]
>>>>>>>>         references line number x) – hope that they are helpful in
>>>>>>>> further improving
>>>>>> the
>>>>>>>>         doc:
>>>>>>>>
>>>>>>>> <Nagendra> Yes, of course.
>>>>>>>>
>>>>>>>>         [global] s/an SF/a SF/ -- and similarly SFC/SFF
>>>>>>>>
>>>>>>>> <Nagendra> Other RFCs use "an SF/SFF", so the draft is updated
>>>>>>>> accordingly. If your suggestion is to change "a SF" to "an SF",
>>>>>>>> it is done.
>>>>>>>>
>>>>>>>>         [176] "OAM Controller" not defined
>>>>>>>>
>>>>>>>> <Nagendra>We can change it as below:
>>>>>>>>
>>>>>>>> OLD:
>>>>>>>> OAM controllers are assumed to be within the same administrative
>>>>>>>>        domain as the target SFC enabled domain.
>>>>>>>>
>>>>>>>> NEW:
>>>>>>>> OAM controllers are SFC-aware network devices that are capable of
>>>>>>>> generating OAM packets. They are assumed to be within the same
>>>>>>>> administrative domain as the target SFC enabled domain.
>>>>>>>>
>>>>>>>>         [202] Why just Virtual Machines and no containers? Suggest
>>>>>>>>         to make things generic and talk about virtual and physical
>>>>>>>>         entities.
>>>>>>>>
>>>>>>>> <Nagendra> We changed this to "virtual entities".
>>>>>>>>
>>>>>>>>               This comment applies throughout the document.
>>>>>>>>         [216] Ethernet OAM: Add reference. Do you refer to
>>>>>>>>         physical layer Ethernet OAM (802.3ah) or CFM (802.1ag)?
>>>>>>>>
>>>>>>>> <Nagendra> The response was provided in the above comment section.
>>>>>>>>
>>>>>>>> [243] s/uses the overlay network/uses the overlay
>>>>>>>>         network layer/
>>>>>>>>
>>>>>>>> <Nagendra> Done.
>>>>>>>>
>>>>>>>> [246] Could we add a few examples of "various overlay network
>>>>>>>>         technologies"? For the underlay network layer several
>>>>>>>>         examples are listed.
>>>>>>>>
>>>>>>>> <Nagendra> Ok.
>>>>>>>>
>>>>>>>>         [248] What does "mostly transparent" mean?
>>>>>>>>
>>>>>>>> <Nagendra> The data plane elements connecting the overlay layer
>>>>>>>> nodes may not always process the overlay header.
>>>>>>>
>>>>>>> ...FB: How about we explain this in the document?
>>>>>>>
>>>>>>>>
>>>>>>>> [254] What does "tight coupling"
>>>>>>>>         between the link layer and the physical technology mean?
>>>>>>>>
>>>>>>>> <Nagendra>I am not sure I understand the nit here. Do you see any
>>>>>>>> difficulty in parsing the sentence?
>>>>>>>
>>>>>>> ...FB: Not sure what "tight coupling" means here. Could you
>>>>>>> clarify what is "tight coupling" vs. "not tight coupling"?
>>>>>>>
>>>>>>>>
>>>>>>>> [255] Suggest to avoid
>>>>>>>>         terms like "popular" - popularity can change, standards
>>>>>>>> stay
>>>>>>>>
>>>>>>>> <Nagendra> Ok. This is changed to "Ethernet is one such choice..."
>>>>>>>>
>>>>>>>> [256] Acronyms
>>>>>>>>         "POS" and "DWDM" are not defined
>>>>>>>>
>>>>>>>> <Nagendra> Added.
>>>>>>>>
>>>>>>>> [274] Link start/end-points don't seem to
>>>>>>>>         always align with the underlay network in the diagram
>>>>>>>>
>>>>>>>> <Nagendra> Fixed it.
>>>>>>>>
>>>>>>>> [287] s/may comprise
>>>>>>>>         of/may consist of/
>>>>>>>>
>>>>>>>> <Nagendra>We fixed it as "may comprise".
>>>>>>>>
>>>>>>>> [288] s/but not shown/but is not shown/
>>>>>>>>
>>>>>>>> <Nagendra> We fixed this as "intermediate nodes not shown..."
>>>>>>>>
>>>>>>>> [307]
>>>>>>>>         s/devices/device/
>>>>>>>>
>>>>>>>> <Nagendra> Done.
>>>>>>>>
>>>>>>>> [308] What is a "controller"?
>>>>>>>>
>>>>>>>> <Nagendra> We discussed this in the above comment section.
>>>>>>>>
>>>>>>>> [314] s/includes/include/
>>>>>>>>
>>>>>>>> <Nagendra>Done.
>>>>>>>>
>>>>>>>> [319]
>>>>>>>>         Add hSFC to list of acronyms in section 1.2.1
>>>>>>>>
>>>>>>>> <Nagendra> This is expanded in the respective section. We added
>>>>>>>> it in the acronym section as well.
>>>>>>>>
>>>>>>>> [320] Add IBN to list of acronyms
>>>>>>>>         in section 1.2.1
>>>>>>>>
>>>>>>>> <Nagendra> Ok, Done.
>>>>>>>>
>>>>>>>> [325] s/includes/include/
>>>>>>>>
>>>>>>>> <Nagendra> Done.
>>>>>>>> [359] The function/term "controller"
>>>>>>>>         requires definition.
>>>>>>>>
>>>>>>>> <Nagendra> Done, as mentioned in the above comment section.
>>>>>>>>
>>>>>>>> [383] s/?./?/
>>>>>>>>
>>>>>>>> [398] s/get the got/got/
>>>>>>>>
>>>>>>>> <Nagendra> Done.
>>>>>>>>
>>>>>>>>      [461]
>>>>>>>>         s/devices/device/
>>>>>>>>
>>>>>>>> <Nagendra> Done.
>>>>>>>>
>>>>>>>>      [469] Does it have to be equal cost multipath at the service
>>>>>>>>         layer, or could unequal cost multipath also be an option
>>>>>>>>         for load-balancing?
>>>>>>>>
>>>>>>>> <Nagendra>I didn’t see any discussion specific to ECMP/UCMP in
>>>>>>>> the architecture RFC.
>>>>>>>
>>>>>>> ...FB: Hmm. I did not see that RFC7665 is only about equal cost multipath.
>>>>>>>>
>>>>>>>>      [521] Not sure whether the overlay network establishes the
>>>>>>>>         service plane. Isn't it that the overlay network establishes
>>>>>>>>         connectivity for the SFC-related functions in the service
>>>>>>>>         plane?
>>>>>>>>
>>>>>>>> <Nagendra> The service layer is established over the overlay
>>>>>>>> network layer. I am not sure if it is right to say that the overlay
>>>>>>>> network provides connectivity for the service layer.
>>>>>>>
>>>>>>> ...FB: Overlay network is one component of the service layer,
>>>>>>> isn't it? So it is required but not sufficient.
>>>>>>>
>>>>>>>>
>>>>>>>> [531] s/components/component/ [545] remove
>>>>>>>>         "underlay"
>>>>>>>>
>>>>>>>> <Nagendra> Done.
>>>>>>>>
>>>>>>>> [595] s/devices/device/
>>>>>>>>
>>>>>>>> <Nagendra> Done.
>>>>>>>>
>>>>>>>> [600] s/action/an action/
>>>>>>>>
>>>>>>>> <Nagendra> Done.
>>>>>>>>
>>>>>>>> [601] Expand on
>>>>>>>>         "TTL or other means" (TTL also needs to be added to
>>>>>>>>         acronyms in 1.2.1). Is this specific to NSH? Or specific to
>>>>>>>>         IPv4?
>>>>>>>>
>>>>>>>> <Nagendra> TTL is listed as a well-known abbreviation in
>>>>>>>> https://www.rfc-editor.org/materials/abbrev.expansion.txt and so
>>>>>>>> we left it as it is. TTL in this document refers to the NSH TTL
>>>>>>>> field.
>>>>>>>
>>>>>>> ...FB: Let's ensure we refer to NSH TTL in this case. Given that
>>>>>>> SFC can be done with other means than NSH, implicit reference to
>>>>>>> NSH might be a problem.
>>>>>>>>
>>>>>>>>      [630] Mention that for "approximation of
>>>>>>>>         packet loss for a given SFC can be derived" to be
>>>>>>>>         applicable, SFC OAM packets would need to be forwarded the
>>>>>>>>         same as live user traffic.
>>>>>>>>
>>>>>>>> <Nagendra> As it is intended to derive the approximate loss
>>>>>>>> value, I am not sure if we need this additional consideration
>>>>>>>> that the OAM packet would need to follow the live user traffic.
>>>>>>>> Let me know if you think otherwise.
>>>>>>>
>>>>>>> ...FB: IMHO we should - given that it is one potential complication.
>>>>>>>
>>>>>>>>
>>>>>>>>      [636] Is uppercase
>>>>>>>>         "MUST" applicable to an informational document? Especially
>>>>>>>>         given that RFC2119/RFC8174 is explicitly referenced by the
>>>>>>>>         draft.
>>>>>>>>
>>>>>>>> <Nagendra> Based on various reviewer comments, we removed the use
>>>>>>>> of any normative statement.
>>>>>>>>
>>>>>>>> [666] Add MPLS, TRILL to
>>>>>>>>         acronyms in 1.2.1
>>>>>>>>
>>>>>>>> <Nagendra> Ok. Done.
>>>>>>>>
>>>>>>>> [678] s/exhaustive/exhaustive./
>>>>>>>>
>>>>>>>> <Nagendra> Done.
>>>>>>>>
>>>>>>>> [720] Is uppercase "SHOULD" applicable to an informational document?
>>>>>>>>         Especially given that RFC2119/RFC8174 is explicitly
>>>>>>>>         referenced by the draft.
>>>>>>>>
>>>>>>>> <Nagendra> Based on various reviewer comments, we removed the use
>>>>>>>> of any normative statement.
>>>>>>>>
>>>>>>>> [722] Is uppercase "MAY" applicable to an informational document?
>>>>>>>>         Especially given that RFC2119/RFC8174 is explicitly
>>>>>>>>         referenced by the draft.
>>>>>>>>
>>>>>>>> <Nagendra> Based on various reviewer comments, we removed the use
>>>>>>>> of any normative statement.
>>>>>>>>
>>>>>>>> [754]
>>>>>>>>         s/packet/packets/
>>>>>>>>
>>>>>>>> [755] s/to next node/to the next node/
>>>>>>>>
>>>>>>>>      [771] How does this
>>>>>>>>         requirement align with the earlier paragraph, e.g. in case
>>>>>>>>         a node sends an ICMP reply? It would probably make sense to
>>>>>>>>         scope the statement to e.g. NSH.
>>>>>>>>
>>>>>>>> <Nagendra> As mentioned in the statement, the node that initiates
>>>>>>>> the OAM packet must set the marker and so this statement is
>>>>>>>> applicable for the initiating node.
>>>>>>>>
>>>>>>>> [806]
>>>>>>>>         s/function/functions/
>>>>>>>>
>>>>>>>> <Nagendra> Done
>>>>>>>>
>>>>>>>> [809] s/from relevant node/from the relevant node/
>>>>>>>>
>>>>>>>> <Nagendra> Done
>>>>>>>>
>>>>>>>> [810]
>>>>>>>>         s/generate ICMP/generate an ICMP/
>>>>>>>>
>>>>>>>> <Nagendra> Done
>>>>>>>>
>>>>>>>> [812] s/from last/from the last/
>>>>>>>>
>>>>>>>> <Nagendra> Done
>>>>>>>>
>>>>>>>> [830]
>>>>>>>>         s/perform continuity/perform the continuity/
>>>>>>>>
>>>>>>>> <Nagendra> Done
>>>>>>>>
>>>>>>>>      [834] s/with relevant/with the
>>>>>>>>         relevant/
>>>>>>>>
>>>>>>>> <Nagendra> Done
>>>>>>>>
>>>>>>>> [835] s/perform partial SFC availability./perform a partial SFC
>>>>>>>>         availability check./
>>>>>>>>
>>>>>>>> <Nagendra> Done
>>>>>>>>
>>>>>>>> [851] For "In-Situ OAM data fields" add a normative
>>>>>>>>         reference to draft-ietf-ippm-ioam-data
>>>>>>>>
>>>>>>>> [905] Add "CLI" to section 1.2.1
>>>>>>>>         acronyms
>>>>>>>>
>>>>>>>> <Nagendra> Done
>>>>>>>>
>>>>>>>> [920] Add a reference for NETCONF ->RFC6241
>>>>>>>>
>>>>>>>> <Nagendra> Done
>>>>>>>>
>>>>>>>> Once again, thanks a lot for the great comments.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Nagendra
>>>>>>>
>>>>>>> Thanks again for considering the comments in great detail. Much
>>>>>>> appreciated.
>>>>>>>
>>>>>>> Cheers, Frank
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>