Re: [Idr] AD Review of draft-ietf-idr-sla-exchange-08

"Shitanshu Shah (svshah)" <svshah@cisco.com> Sat, 25 June 2016 21:52 UTC

Return-Path: <svshah@cisco.com>
X-Original-To: idr@ietfa.amsl.com
Delivered-To: idr@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 2D24012D151; Sat, 25 Jun 2016 14:52:52 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -14.965
X-Spam-Level:
X-Spam-Status: No, score=-14.965 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HTML_FONT_FACE_BAD=0.981, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_HI=-5, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, RP_MATCHES_RCVD=-1.426, SPF_PASS=-0.001, USER_IN_DEF_DKIM_WL=-7.5] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (1024-bit key) header.d=cisco.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 9mxF0kYvsAE2; Sat, 25 Jun 2016 14:52:45 -0700 (PDT)
Received: from alln-iport-8.cisco.com (alln-iport-8.cisco.com [173.37.142.95]) (using TLSv1.2 with cipher DHE-RSA-SEED-SHA (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id CAF1212D16C; Sat, 25 Jun 2016 14:52:44 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=cisco.com; i=@cisco.com; l=127965; q=dns/txt; s=iport; t=1466891564; x=1468101164; h=from:to:cc:subject:date:message-id:mime-version; bh=i6NjaIPR1DMRTd1UywtUIWVdj8ldpRltO3a7BVBu8Fs=; b=SFUQwYrD2VW8Xoj7bU+0+egt4lNMq5NHXxxdntHmcxIlvc0VcWslts54 ftF6gmL6LrzKrVyT0TX0cs9bnZ3VG5ETK2i6bHX61jwcUVW7FG1JKC9KS Rtr88LloX2NhNKUOZD9p4DQ8CGDik2fcWOYySwsAne0NX9fpDeVpSLVZF Q=;
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: =?us-ascii?q?A0ASBQBb/G5X/5hdJa1SAQmCcE5WfQa6J?= =?us-ascii?q?IF7JIV0gSQ5EwEBAQEBAQFlJ4RNAgQaAUwEBwcSAQgtCwE/JwQBDQMdiBUOA8c?= =?us-ascii?q?OAQEBAQEBAQEBAQEBAQEBAQEBAQEBFwWJcoEDhBgBZoUcBY17hT0VhTQBhgeCN?= =?us-ascii?q?ESFN4FphFSDLnuEPoZUiSoBHwE0ggUDHIFMPy8BiAsrGH8BAQE?=
X-IronPort-AV: E=Sophos;i="5.26,528,1459814400"; d="scan'208,217";a="289909729"
Received: from rcdn-core-1.cisco.com ([173.37.93.152]) by alln-iport-8.cisco.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 25 Jun 2016 21:52:42 +0000
Received: from XCH-RTP-002.cisco.com (xch-rtp-002.cisco.com [64.101.220.142]) by rcdn-core-1.cisco.com (8.14.5/8.14.5) with ESMTP id u5PLqfAn022392 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=FAIL); Sat, 25 Jun 2016 21:52:41 GMT
Received: from xch-rtp-019.cisco.com (64.101.220.159) by XCH-RTP-002.cisco.com (64.101.220.142) with Microsoft SMTP Server (TLS) id 15.0.1210.3; Sat, 25 Jun 2016 17:52:40 -0400
Received: from xch-rtp-019.cisco.com ([64.101.220.159]) by XCH-RTP-019.cisco.com ([64.101.220.159]) with mapi id 15.00.1210.000; Sat, 25 Jun 2016 17:52:40 -0400
From: "Shitanshu Shah (svshah)" <svshah@cisco.com>
To: "Alvaro Retana (aretana)" <aretana@cisco.com>, "draft-ietf-idr-sla-exchange@ietf.org" <draft-ietf-idr-sla-exchange@ietf.org>
Thread-Topic: AD Review of draft-ietf-idr-sla-exchange-08
Thread-Index: AQHRzyvlvU+08NLC/kOdoAF0zVY3yg==
Date: Sat, 25 Jun 2016 21:52:40 +0000
Message-ID: <D3940AC5.1A281%svshah@cisco.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
user-agent: Microsoft-MacOutlook/14.6.5.160527
x-ms-exchange-messagesentrepresentingtype: 1
x-ms-exchange-transport-fromentityheader: Hosted
x-originating-ip: [10.24.26.4]
Content-Type: multipart/alternative; boundary="_000_D3940AC51A281svshahciscocom_"
MIME-Version: 1.0
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/ZshxKVVzf30TzgtfurlLxHU7VsQ>
Cc: "idr-chairs@ietf.org" <idr-chairs@ietf.org>, Susan Hares <shares@ndzh.com>, "idr@ietf.org" <idr@ietf.org>
Subject: Re: [Idr] AD Review of draft-ietf-idr-sla-exchange-08
X-BeenThere: idr@ietf.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Inter-Domain Routing <idr.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/idr>, <mailto:idr-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/idr/>
List-Post: <mailto:idr@ietf.org>
List-Help: <mailto:idr-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/idr>, <mailto:idr-request@ietf.org?subject=subscribe>
X-List-Received-Date: Sat, 25 Jun 2016 21:52:52 -0000

Dear Alvaro,

Thank you for your review and detailed review comments..

Please find response inline ##svshah . If any review comment is missed out with a response or have been marked as yet to be responded, will come back with a response.

Thanks
Shitanshu



Dear authors:

I just finished reading this document.

I have significant concerns with the content (see my detailed comments below), and with the fact that the we're producing this document on SLAs in a routing WG.  It is "ok" [*] to carry information in BGP that is not related to routing, but in this draft we're really defining how to represent an SLA.  The only BGP-specific work is related to propagating a new transitive attribute — we don't need 30 pages for that!!

Many of my comments are keyed off these statements in the Introduction: "The Inter-domain exchange of QoS SLA exchange policy described in this document…only requires that the provider wishes to send the QoS SLA policy via BGP…to a set of receivers…The SLA negotiation and assurance is outside the scope of this document."   My interpretation is that BGP is then used as a simple transport for the SLA information and that it can't (and shouldn't!) concern itself with the interpretation, validation or enforcement of the information it is transporting.

Given that, I want to first mention 3 overarching themes that I want to see addressed (I also tried to indicate which theme the comments below relate to):

T1. The SLA Definition.  As mentioned above, it concerns me that this WG has taken on the task of defining the description of an SLA.  At a minimum, this document should have been discussed in the TSV and OPS areas, for both sanity checking and coordination.  I found in the archive some exchanges with tsvwg (from 2013, when the –01 version was published) -- in the thread, the tsvwg Chair mentions the need for coordination as they have related work.  However, the communication abruptly ends [1] -- note that the last recommendation from tsvwg was against having multiple drop thresholds in DROP_THRESHOLD (Section 3.3.2.8), but the current version still hast that — it does seem like the conversation was concluded…   Please seek out review from both tsvwg and opsawg (or at least the OPS-Dir).

##svshah, We had presented the draft to the TSVWG at the IETF meeting in the past as well had discussion with D. Black. It was concluded then that the work in this draft is not related to some of the other work items that was thought might be related. Current specification also is an outcome with review comments incorporated as a result of those discussions (for example -  use of TSPEC in one of the SLA Service Type).


T2. Who does what?  (aka which functionality belongs in BGP?)

I then see the document as containing two distinct sets of specifications: (1) how does BGP transport some information in a transitive attribute (including how a BGP speaker validates the formatting), and (2) how the user of the SLA information (which is not the BGP speaker itself, but some other consumer) should deal with it.  I would really like to see a clear definition of who (the BGP Speaker or something else) is executing the action — keeping into account that it is out of scope what exactly is done with the information.

I would like to propose a simple framework/terminology for the process, which I think is something like this:
A. the SLA_Provider (which represents whoever/whatever comes up with the SLA definition) gives the formatted information to BGP for it to transport (associated with a prefix, destined to a list of ASNs, etc.);
B. BGP puts the formatted information in the QoS attribute and advertises it in an UPDATE;
C. the BGP receiver verifies that the UPDATE is formatted correctly (including the QoS Attribute);
D. a SLA_Consumer (which represents a local process that will make sense of the content and maybe even do something with it) further verifies that the content makes sense, keeps track of the SLA ID, etc..

Right now, the document doesn't clearly assign roles and it seems like everything is done by the BGP speaker, where conceptually it isn't because BGP is a routing protocol and not an SLA_Producer or SLA_Consumer.  For example (there are more cases below), Section 5.2. (SLA Attribute Handling at Receiver) says that "the receiving BGP peer SHOULD invalidate previous advertised SLA parameters if one exists for the same SLA id and source AS."  Following the conceptual model above, this action would be done by the SLA_Consumer, who may care about the content itself.

If you don't want to use the simple framework above and want to use other names, that is perfectly fine.  I just want the document to clearly reflect what is done by BGP and what is done by someone else.

##svshah, The intention is indeed to have attribute specific content processed by Attribute owner and content outside of attribute is processed by BGP speaker. It is fair enough to define explicit terminology and explicit mention at places where it may otherwise come out vague. One change I would like to propose is to call out QoS attribute producer and consumer roles instead of SLA producer and consumer roles since that would be specific to SLA SubType only.


T3. Clarity and Consistency…  I am very disappointed in the fact that the SLA itself is underspecified, that there isn't consistency on the names used (even within the definition of the fields!) and that some sections seem to be left to the readers' interpretation (guessing!).  Examples include simple things such as the units in which the length is indicated, and using consistent names.  Those may be easy to correct, but a Standards Track document should not be in this state at this point in the process.  There are many more examples below.



I'm returning the document to the WG for further processing.  I expect that proper coordination and review with TSV and OPS, including a new WGLC with the appropriate WGs in cc (tsvwg, for example), takes place before returning to the IESG.  I recommend that other comments be addressed before embarking in reviews beyond idr.

Thanks!

Alvaro.

[*] Given the fact that we already throw all kinds of stuff on top of BGP… :-(
[1] https://mailarchive.ietf.org/arch/msg/tsvwg/IjVdEF7KZyz0czm4YBXEr7KjmkU


M1. [T1] [T2] Introduction: "The Inter-domain exchange of QoS SLA exchange policy described in this document does not require any specific method for the provider in establishing SLAs.  It only requires that the provider wishes to send the QoS SLA policy via BGP UPDATE [RFC4271] messages from the provider to a set of receivers (BGP peers) who will enact the policy.  In reaction to, a receiving router may translate that to relevant QoS policy definition on the device…  The SLA negotiation and assurance is outside the scope of this document."
##svshah, “The SLA negotiation and assurance is outside of the scope of this document” was highlighted in a paragraph after next. I’ve incorporated the statement here now as suggested and removed from the other place.


M1.1. [T2] If I interpret this text correctly, the intent of this document is to propagate the new attribute (with SLA information in it), but it takes no position as to what the receivers do with the information.  Is that correct?
##svshah, We highlight use-case and intentions of receivers, why there is a need to receive SLAs via in-band method. We do not standardize implementation of any policy on that received SLAContent.

M1.2 [T2] If so, then the "enact the policy" part above is out of place because there's no assumption as to what will happen with the information.
##svshah, removed.


M2. [T3] Section 3.1. (SLA, QoS attribute sub-type, Definition)    SubType range is not defined correctly: s/0x0f/0xff    What is the  SubType length expressed in, octets?
##svshah, addressed. Length is expressed in octets.


M3. [T3] Section 3.2. (SLA SubType Value Field)

M3.1 "Source AS…Note that AS = 0, used in message outside of QoS attribute, is illegal in normal BGP operations.  AS = 0 inside the QoS attribute may be used simply as a flag…"
M3.1.1 [minor] Please put a reference to rfc7607.
M3.1.2 Given that this document specifies a new valid use of AS 0, I think it should be marked as Updating rfc7607.
M3.1.3 What is the purpose of this field?
##svshah, the purpose of source/destination AS here is to cover the case where need is to advertise SLA to a specific AS beyond peer. AS = 0  is used as a flag. It does not seem right reason to create reverse dependency on BGP protocol specification. We can add an explicit flag field, within SLA SubType, with a bit that can indicate the same reason.


Section 4. (Originating SLA Notification) does say that the "QoS attribute MUST only be added by the originator…", but it seems easy to forge the value here.
##svshah, QoS attribute MUST only be added by the originator, however "destination AS list” needs to be updated by every recipient of an attribute.


M3.2 For the "Destination AS list" field description, it should be clearly specified the contents if the "Destination AS count" is 0.  Also, please be specific and clear (word are free!), for example: "List of ASNs for which the SLA is relevant, each of which is a 32-bit number [rfcREFERENCE].  If the Destination AS count is set to 0, then this field SHOULD NOT (?) be included in this sub-TLV….”
##svshah, addressed


M3.3 "SLA Event": the field in Figure 3 is simply called "Event".  Please be consistent: either call both the same way or clarify which field is being referenced.  I know it may be obvious, but it doesn't make for a crisp specification — other parts of the text simply use "event" as well.  Please be specific and clear as to what this field indicates.
##svshah, In the figure there is not enough space to fit “SLA Event” words in the relevant size of bits. Keeping it to “SLAEvnt” ok? Addressed it in other places.


M3.4 SLA Id
M3.4.1 "A 16-bit field containing identifier which is unique in scope of source AS"  What happens if the Source AS is set to 0?
M3.4.2 [minor] "The SLA ID applies aggregate for all the traffic…"   I'm guessing you meant to talk about the "aggregate traffic"…??
##svshah, yes. I’ve reworded to “The SLA ID applies to aggregate traffic…"

M3.4.3 "The SLA identifier is not globally unique but it MUST be unique within the source AS (advertiser)."  What happens if the Source AS field is set to 0?  The text in the Source AS field description says that when the value is 0 then "ignore Source…".  What does this do to the content of the SLA ID.
##svshah, If Source AS in the QoS attribute is 0 then sender of BGP update message is considered as source AS. Thus there is always a source in the context.

M3.4.4 [minor]  BTW, please be consistent: SLA ID, SLA id, SLA Id, SLA identifier, etc
M3.5 What is the "SLA Length" field expressed in?
##svshah, addressed


M3.6 SLA-Content per SLA Event
M3.6.1 The name seems to hint at the fact that the content is "per Event", but it is not described anywhere how multiple events would be encoded, or if even possible (see my comment above about the Event field).
##svshah, current specification has only one defined SLA Event, that is “ADVERTISE". The specification allows to define more in the future if need be. And thus being specific on the SLAContent that goes with what type of SLA Event.

M3.6.2 Is the name of this field "SLA-Content per SLA Event" or "SLA-Content"?  Please be consistent.
M3.6.3 "If the SLA-Content field does not exist…then receiver MUST inherit the previously advertised SLA content for the same SLA id from the same Source AS."   The next two paragraphs try to cover for the fact that the order cannot be guaranteed (specially more than one AS hop away).  But this all means that there is a significant possibility that a number of UPDATES will not have any SLA information at all.  Then what is the point?  When would you want to take advantage of this feature?
##svshah, SLA-Content is optional in the sense they don’t need to be repeated every time when Advertiser would like to advertise more prefixes to be subjected to the same SLA. SLA ID as an identifier is enough to advertise in such case. We already have highlighted case of non-existence prior advertised SLA-content just to cover error case (to address one of the review comments).



M4 [T3] Section 3.3. (SLA-Content per Event Field)

M4.1 [nit] s/…and then optionally followed by for the opposite direction/…and then optionally followed by the specification for the opposite direction
##svshah, addressed

M4.2 [nit] The format of the field starts with the "dir" field — is the following text really necessary: "This list of Traffic Classes MUST be specified for one direction first…"  It seems obvious to me.
##svshah, I think it is important to highlight to discourage alternating set of classes (like incoming/outgoing/incoming/outgoing/.. and so forth). I think the text is necessary for the same reason.


M4.3 "dir…indicates Direction of the SLA."  I'm guessing direction from the senders point of view, is that correct?  Please be specific.
##svshah, it is highlighted in the definition of “incoming” and “outgoing"

M4.4 "Traffic Class (Classifier Group) count…value of zero (0x00)…invalidates previous advertised SLA…"  In this case, I'm assuming that all the other fields are not needed, is that correct?  If so, then they are optional and should be described that way.
M4.5 "Class Descr Len…length of the Traffic Class Description field"   Length in what units?
##svshah, addressed

M4.6 [minor] "Traffic Class Description"  A reference for UTF-8 (rfc3629 I believe) would be nice.
##svshah, addressed


M4.7 "Elements (Classifier) Count…value zero (0x00) means…traffic class is to classify rest of the traffic not captured otherwise by other traffic classes…RECOMMENDED that Traffic Class that has 0 elements is present last…If an advertised message has it positioned somewhere else, then the receiver MUST re-order it…"   This text (as is) implies that the 0x00 classifier SHOULD be present last — is that what you meant, for it to always be present?  If so, why not use MUST and eliminate the reordering logic?  Or was the intent that if present then it should go last?
##svshah, the text intends to imply a class with no elements (not classifier 0x00). Agree with your suggestion, we can use MUST and eliminate reordering logic.


M4.8 Classifier Element TLVs:
M4.8.1 The description talks about "Classifier Element TLVs", but Figure 4 calls them "Traffic Class Elements (TLVs)".  Please be consistent.
M4.8.2 "Size of Value field…length of the value field."   Length in octets, bytes, bits, ???
##svshah, addressed

M4.8.3 [T2] "Value…It is an error if the IPFix field does not contain the appropriate format"  This statement gets into interpretation of the contents of the attribute.  If the purpose of this document is to just transport information, then why would it be a function of BGP to figure out if the contents are correct?  Please be clear about whether the BGP Speaker or SLA_Consumer should perform a function.

M4.8.4 "If a receiver receives a BGP UPDATE message with QoS/SLA attribute for an NLRI that it does not support then the receiver MUST NOT install that advertised SLA and continue to forward this attribute as an optional transitive attribute."  What does "an NLRI that it does not support" mean?  Are you talking about an NLRI from a specific unsupported AFI/SAFI, or ??
##svshah, yes, it was already qualified in the statement before, and so AFI/SAFI was not repeated. Re-worded appropriately.


M4.9 Traffic Class Service TLVs:
M4.9.1 The units for the length are (again!) not specified.  I realize that it may be in bits (given the next couple of paragraphs), but the reader shouldn't have to guess.
M4.9.2 As long as you're thinking about the length, please check all the length descriptions — I think most of them are missing the units, but this is getting tedious…
##svshah, It is specified explicitly. The size of the length field is specified as 8-bits and what size of the value field should and expressed in what is described for each service-type. Currently, list of service-types and related size of fields is repeated twice at two different places. I’ve modified content to keep it only at one place, that way all relevant details at one place.


M4.9.3 Please note that the length of the value fields is not described consistently.  For example, it says that TRAFFIC_CLASS_TSPEC is 96 bits, but 3.3.2.3 talks about octets for MINRATE_IN_PROFILE_MARKING.
##svshah, addressed

M4.9.4 "Value…It is an error if this field does not contain the appropriate format (see BGP error handling section…."   Not only is this error appropriate for the SLA_Consumer, but the "appropriate format" is not clearly identified in the description of all the TLVs.


M5 [T1] Section 3.3.1. (Supported IPFIX values for Classifier Elements)

M5.1 [minor] The correct link to the IPFIX Registry is https://www.ietf.org/assignments/ipfix/ipfix.xml#ipfix-information-elements

M5.2 "Only the IPFIX attributes listed in Table 1 are supported by BGP SLA exchange."  Why?  If this document is just about carrying a new attribute that is opaque to BGP itself and that may or may not result in an action at the receiver (see above), then why not support any IPFIX element?  Maybe not all of the elements make sense, but some surely seem to — ipClassOfService, for example.
##svshah, we already have dot1q priority covered as well diffserv and mpls-exp . Diffserv already has a support for CS0..CS7 and thus need of ToS is redundant and not needed. ipClassOfService thus is not required given those 3 fields covered.


M5.3 "Any new attribute to be supported by SLA QOS MUST be added by a Standards Action."  To be added where?  The IPFIX registry has a different registration procedure (Expert Review) -- are you referring to adding elements there?  Or adding elements (already in the IPFIX registry) to Table 1?  Or ??
##svshah, yes in Table 1. It is mentioned in the IANA Consideration section.


M6 [T1] [T3] Section 3.3.2.1. (Traffic Class TSPEC TLV) The value of this TLV "consists of the (r), (b), (p) parameters as described in Invocation Information section of [RFC2212]…".  However, the definition in the document is not the same as the one in rfc2212.  For example, this document defined r as "measured in octets of Layer 2 (L2) datagrams per second", while rfc2212 says that it is "measured in bytes of IP datagrams per second".  If the intent is to (as stated in the document) use the definitions from rfc2212, then don't redefine the terms.
##svshah, The suggestion from tsvwg during review was to use RFC2212 TSPEC for the purpose of rate specification. However, specification of those parameters need to be in L2 datagram size for the purpose of this specification unlike IP datagram size in RFC2212. What is a recommendation?



M7 [T1] [T2] [T3] Section 3.3.2.2. (Traffic Class L2 Overhead):

M7.1 [T3] Section 3.3.2.2. (Traffic Class L2 Overhead)  How is the L2 overhead defined?
##svshah, added clearer definition of L2 overhead

M7.2 [T1] [T2] "When the receiver chooses to react to an advertised SLA and if the L2 Overhead service type is present in advertised SLA, the receiver MUST use the explicit advertised L2 overhead rather than the local L2 overhead."  In the context of this document, where the use of the SLA is out of  scope, what does "react to an SLA" mean?   IOW, we can't mandate what the receiver does, or how it does it if outside the scope.
##svshah, The intention of this statement is to define semantics of the field and what it means to the receiver from semantics point of view. The purpose of this statement is to mandate on a receiver on how to interpret it. Nothing beyond that.


M7.3 [T1] "If SLA is required to consider only IP packet size, the sender MAY advertise this service with a value of 0."  Again, the reason is out of scope.  The ability to advertise a value of 0 is not.
##svshah, reworded to define only semantics of the field. Though just re-iterate, use-case and intention why we need to advertise SLA is clearly highlighted. What is not in the scope of this specification is standardizing any policy and how to implement any policy on received SLA.


M8. [T1] Section 3.3.2.7. (Precedence between MINRATE and MAXRATE)  Back to the data being opaque..why is this section included in the document?
##svshah, This is important from the perspective of defining clear semantics of advertised fields. So that it does not remain vague even from the view point of advertiser.


M9. [T3] Section 3.3.2.9. (Traffic Class for Relative Priority)  "Length = 4 bits"  Unlike all the other length issues I've pointed at, this time you seem to be indicating the length of the Value field.  Does this mean that the Length field in they TLV should be set to 4 (as in "the length of the Value field in bits"), or that the Length field should not even be present because it is always 4 bits, or something else??
##svshah, yes the length field in the TLV to be set to 4. Since length of the value, for each service type, is clearly highlighted in earlier section it is not repeated here again. I’ve collapsed content from both the places to one now to make it clearer for the reading, as well addressed any consistency issues.


M10 Section 4.1.2. (SLA Advertisement for Destination AS Multiple Hops Away) "If a new prefix is added in an AS for which SLA parameters have already been advertised before for other existing prefixes, and if traffic to this new prefix is subject to the same SLA advertised earlier, then the BGP UPDATE for this new prefix may include QoS attribute containing just an SLA id…"

M10.1 Is this behavior also valid for directly connected peers?
##svshah, Semantically yes it is.

M10.2 The description above assumes that the receiver has the SLA description from a previous UPDATE.  See my comments above about Section 3.2.
##svshah, the description highlights to advertise whole SLA content if it is first time advertised to a set of recipients, or else advertising just SLA ID is sufficient for the subsequent additional prefixes for the same recipients.



M11. [T2] Section 5.1. (BGP Node Capable of Processing QoS Attribute)

M11.1 Back to the opaqueness of the attribute from the BGP point of view and how the use of the information is out of scope.  What does "processing" mean in this context?  I think you will need to separate the processing done by the BGP Speaker from what the SLA_Consumer should do.
##svshah,  I would have thought that processing of any content of an attribute (either QoS or any other attribute), is always attribute owner’s responsibility. I can see your point though that places with grey area needs explicit clarification. Let me re-work content to separate the processing done by BGP Speaker from that to be of QoS attribute.



M11.2 "If UPDATE has a QoS Attribute with a list of destination ASes, it MAY trim the list and adjust the count of the destination AS to exclude ones that are not required in further announcement of BGP UPDATE messages."  Just to make sure I'm looking at this from the right place: this text applies to a BGP speaker that received the QoS attribute from someone else and is now getting ready to propagate it, right?  If so, isn't changing the list of destination ASes basically an attack on the intent of the origin?
##svshah, would that not be true for any attributes supported today? It is clarified though in the Security consideration section.


M11.3 "BGP peer MUST drop SLA related sub-type from the QoS attribute, if there are no more ASes from the QoS Attribute's destination list in the forwarding path.  The rest of the QoS attribute contents MAY be Forwarded…”
M11.3.1 What is the "forwarding path"?  I'm guessing that means something like: if the AS is not connected to any other AS on the list (except the one it just received the UPDATE from)… What is it and how can the ingress BGP speaker in the AS figure it out?
##svshah, yet to respond


M11.3.2 [minor] Please be consistent with the naming.  I'm assuming that "SLA related sub-type" is really the SLA sub-type value 0x01…  Even though they are not defined yet, other sub-types may also refer to SLAs.
##svshah, correct

M11.3.3 The "MAY" above is normative in a "forward" manner: affects potential future SubTypes.  If new SubTypes are defined later, let them define the behavior.  s/MAY/may
##svshah, addressed

M11.3.4 [minor] Note that this whole paragraph is repetitive, please simplify it.
##svshah, I think you may be referring to beginning of the paragraph referring to dropping of SLA subtype from the QoS attribute and remaining of the paragraph referring to dropping entire QoS attribute. Both of them are separate functions and thus need to specify both. Or did you mean to point something else?



M12. [T1] [T2] [T3] Section 5.2. (SLA Attribute Handling at Receiver)

M12.1 "Reception of and processing of advertised QoS SLA content are optional for the receiver."  What do you mean that "reception" is optional?  What is "processing"?  Given that the use of the SLA is outside the scope, I can only guess that "processing" means at least parsing the contents (not actually doing anything with them).  If so, what do you mean by saying that "processing" is optional?   In general, if an attribute is sent inside an UPDATE it then has to be "received" (unless all the NLRI contents are filtered by policy) and "processed".  This part sounds like it belongs with the SLA_Consumer.
##svshah, ok

M12.2 "the receiving BGP peer SHOULD invalidate previous advertised SLA parameters if one exists for the same SLA id and source AS."  I think you really want the SLA_Consumer (not the BGP peer).
##svshah, ok

  What does "invalidate" mean in this context?  Maybe replace or an indication what the new version, etc.
##svshah, “invalidate” here means remove earlier advertised SLA content and context


M12.3 "If the new advertised SLA has a non-zero traffic count…"  "traffic count" is not defined anywhere in the document…  I guess you mean the "Traffic Class count" -- please be clear and specific.
##svshah, yeah its typo. The very next sentence is referred with Traffic Class count

M12.4 "…then the new advertised SLA SHOULD be installed"   Installed?  This sounds like you're specifying what to do with the SLA specification…outside the scope!   Even for the SLA_Consumer…

M12.5 "When BGP UPDATE messages are triggered only as a result of SLA policy change…the receiver MAY implement a policy to filter such messages based on the prefix and attribute."  I appreciate the intent, but what about changes in other attributes?
##svshah, This is purely administrative. What this suggest is that if administratively known that SLAs advertised from specific AS (or group of AS)  are only as a result of SLA policy change and thus BGP UPDATE messages, containing QOS attribute with SLA SubType, only means such. In this case BGP UPDATE messages are not expected to carry any other useful information or any other attributes besides QoS attribute with SLA subType. In this case a policy can filter for relevant prefixes and QoS attribute to drop such messages.


  IOW, the receiver doesn't know that the change was only due to an SLA policy change (which from the BGP point of view may be better called something like "an update in the QoS attribute").  Please clarify that if the filter is put in place, all the other attributes need to be checked as well.  BTW, I think that the "MAY" should be "may", but that is minor.

M12.6 "…the receiver MAY implement advertised SLA for the whole link…"  This is an indication of what to do with the SLA…out of scope!

M12.7 "…QoS attribute with the SLA…"  To be complete, I think you mean "with the SLA SubType field”.
##svshah, will address

M12.8 "…then receiver may establish advertised SLA for that specific prefix list under the relevant link."  Even thought not in a normative way, this text also talks about the use of the SLA…out of scope!

M12.9 "It is completely up to the receiver to decide for which prefixes it should accept advertised SLA and for which ones it will not accept."  Which receiver?  I'm guessing the SLA_Consumer, because the BGP speaker has no choice but to accept any attribute…  Are you implying here that the BGP speaker should be able to selectively filter QoS Attributes?
##svshah, will rephrase to make it clearer and with BGP speaker vs QoS attribute processing



M13. Section 6. (Error Handling)

M13.1 [T2] What is an "error condition"?  I counted 4 places where an error was explicitly identified — surely there are more cases that would result in an "error condition".  I'm assuming that a malformed attribute (as in the length not matching) is an obvious one (no need to explicitly call that out, but maybe a reference to the general attribute processing in rfc4271 would be good)…
##svshah, I understand reference to rfc7606 would imply reference to rfc4271?


but what about if the Source AS is set to 0 and the Destination AS count is not 0, is that an error?
##svshah, that is not an error as per current specification that suggests that when source AS is 0, ignore destination AS count and list.


M13.2 [T2] OTOH, errors on general attribute format checking (do the lengths/counts match, etc.) is where attribute-discard applies for the BGP Speaker…other errors (including some of the ones explicitly identified in the text) may be more in the domain of the SLA_Consumer.  Note that even though the SLA_Consumer may decide to not use the SLA (which is equivalent to attribute discard), I think some of the checking (formatting of the values, for example) doesn't belong to the BGP Speaker.
##svshah, sure, that was the thought process that in general for any attribute specific processing and error handling belongs to attribute owner. It may just so happen that most of the attributes are inherently part of BGP specification and thus may not be need for explicit about such separation. I’ll address making it appropriately clear in the specification.

M13.3 [nit] s/attribute-discard/attribute discard
##svshah, addressed



M14. Section 7. (Traffic Class Mapping)  What is done with the SLA is out of scope!  The mapping, or not, of policies is also out of scope — specially the normative language included in the section.  Recommendation: delete this section completely.
##svshah,  that may be fine. This may serve as a guidance though for implementations that may choose to implement a policy on received SLA. While we remove this section, shall we keep couple of sentences some-where on such?


M15. Section 11. (Security Considerations)

M15.1 "The QOS attribute defined in this document SHOULD be used by the managed networks for enforcing Quality of Service policies…"  The use of SLA information is out of scope!
##svshah, removed “enforcing Quality of Service policies”.

M15.2 "…so there should not be any risks for identity thefts."  What does that mean?  Identity theft?  Are you talking about the possibility of hijacking?  (I'm just guessing because the next sentence talks about Origin Validation.)
##svshah, some-one advertising SLA pretending to be specific source AS, and yeah thus Origin Validation.


M15.3 "To strengthen the security for the QoS attribute, RPKI based origin validation [RFC7115] MAY be used."  How does Origin Validation strengthen the security of the QoS attribute?  Origin Validation can help the receiver verify that the origin AS is in fact authorized to advertise an NLRI,
##svshah, yes, that is what it intends to achieve.

but not that the valid prefix origin did in fact also originate the QoS attribute.  Section 4. (Originating SLA Notification) does say that  the "QoS attribute…MUST NOT be added during BGP propagation", but there doesn't seem to be a way to detect non-compliance making it an important security consideration.
##svshah, yet to look at it and respond accordingly..

M15.4 "…BGP Path Validation (e.g., [I-D.ietf-sidr-bgpsec-protocol]) procedures could be used over BGP QoS attribute and its associated prefix in producing the digital signature that can be carried within the signature SLA for the messages."  What is the "signature SLA"?  This text is proposing to change BGPSec — as is, this proposal presents no assurances over the *current* risk of a MITM (until maybe someday the proposal is specified and implemented).  I would rather you clearly identify why there is a risk of MITM — and then say that there is no solution (which is the same as for other transitive attributes).
##svshah, ok

M15.5 Speaking of MITM, what about privacy?  Even if the MITM doesn't change the QoS attribute it can still glean a lot of useful information from just forwarding the UPDATES.  Please include some text about privacy.  Please take a look at RFC6973.
##svshah, ok



M16. [T3] Section 10. (IANA Considerations)

M16.1 There is no guidance about the QoS Attr flags.  I know they are not used in this document, but because it is the document that defines them it would be good to have a registration criteria.
##svshah, since there is no usage of any bits today and we have this in anticipation of possible future usage but since there is no awareness of any possible future usage, we thought it may not be something useful to track it as IANA registry yet. Do you think shall we still have it?


M16.2 [minor] I know the space is small, but the SubTypes and SLA Event Types have a registration procedure of Standards Action.  I wonder if having a value (maybe just one) for Private Use could be beneficial to allow people to put their own stuff in there…
##svshah, it is a great suggestion. that indeed can be very useful. addressed.

M16.3 [minor] This sentence seems out of place since it provides no guidance to IANA: "QoS SLA Traffic Class Element Types will be referring to existing IPFIX IANA types as listed in Table 1.”
##svshah, should I keep it here though for completeness to the reader?




Minor:

m1. [T1] The Introduction talks about the fact that other "protocols and data models are being created to standardize setting routing configuration parameters within networks.  YANG data models…the I2RS protocol", but that "some providers desire to exchange QoS parameters in-band utilizing BGP peering relationships."  Two related questions:

m1.1 Is there specific YANG/i2rs work in process that matches what is defined here?  How are the models similar/different?  Note that this is not a question about comparing the use of YANG models, the i2rs protocol and the mechanism using BGP  -- I just want to know if there are other similar efforts in progress and how they are in sync or not.  The underlying question is really about whether an operator will be able to use a NETCONF/i2rs to configure the SLA information received in a BGP session (as specified in this document).  I know that anything is possible, but is that the direction that is being taken today?
##svshah, QoS Yang Data Model is quite early in works which is submitted as individual draft as of now in the netmod wg. Though currently they seem to be aligned, it is quite early to say (I’ve been involved in the QoS Yang Data Model review). Though, the statement here regarding yang/i2rs work was added specifically as a result of review comment from the wg on this draft.

m1.2 Are there implementations?  It would be nice to add a section ala rfc6982 and point to the idr wiki from there.
##svshah, I am not aware of any implementation of Yang models.


m2. [T3] The new attribute is called by several different names in the document: "QoS Attribute", "QoS/SLA Attribute", "QoS BGP attribute", and simply "attribute" in a couple of cases.  Please be consistent!!!
##svshah, addressed


m3.  Section 3.3.2.3. (Traffic Class for MINRATE_IN_PROFILE_MARKING)  I'm not sure how to interpret this:  "Marking code-point type = 8 bits (1 octet) IPFIX Element Identifier…The marking code-point type of 0x00 is a drop identifier; the length in this case is zero."   IPFIX has Element ID 0 as reserved.  I'm guessing that you mean that if the code-point is 0x00, to better yet, if the length is 0, then drop.   ???
##svshah, it should read “marking code-point value in this case has no meaning and thus the value in this filed should be ignore”.

m3.1 Same comment for 3.3.2.4, 3.3.2.5 and 3.3.2.6
##svshah, addressed


m4. For all the sub-sections in 3.3.2, please put in a short explanation of what each TLV represents.
##svshah, ok


m5. Section 8. (Deployment Considerations)  I think this section should really be called "Use Cases" and moved earlier in the document, maybe even as part of the Introduction.  It would be very nice if the case for propagating the QoS Attribute through multiple AS hops was also illustrated.
##svshah, yet to be addressed.


m6. References

m6.1 RFC3270 should be Informative
m6.2 If RFC7115 is referenced as Normative, then so should I-D.ietf-sidr-bgpsec-protocol because you're offering them as solutions to security issues.  Please take a look at my comments on the Security Considerations Section above.
##svshah, addressed

m6.3 For the illustration in the document, I don't think the reference to RFC7674 is needed.
##svshah, it has been referred in the Introduction. That is why added in the reference.




Nits:

n1. s/…and then optionally followed by for the opposite direction/…and then optionally followed by the specification for the opposite direction
n2. Please expand on first use: TSpec
##svshah, addressed

n3. In 3.3.2.3, please be explicit that the Value contains the Marking code-point type and value.

n4. In 4.1, "next hop to CE (that is the next hop address from CE to PE)"…shouldn't that be "next hop to PE”?
##svshah, removed “next hop to CE” and used wording from the parentheses.