[Idr] AD Review of draft-ietf-idr-sla-exchange-08

"Alvaro Retana (aretana)" <aretana@cisco.com> Thu, 09 June 2016 23:43 UTC

Return-Path: <aretana@cisco.com>
X-Original-To: idr@ietfa.amsl.com
Delivered-To: idr@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id D7CD612D7E8; Thu, 9 Jun 2016 16:43:36 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -15.946
X-Spam-Level:
X-Spam-Status: No, score=-15.946 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_HI=-5, RCVD_IN_MSPIKE_H4=-0.01, RCVD_IN_MSPIKE_WL=-0.01, RP_MATCHES_RCVD=-1.426, SPF_PASS=-0.001, USER_IN_DEF_DKIM_WL=-7.5] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (1024-bit key) header.d=cisco.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 3bp5Zg8239J7; Thu, 9 Jun 2016 16:43:32 -0700 (PDT)
Received: from rcdn-iport-8.cisco.com (rcdn-iport-8.cisco.com [173.37.86.79]) (using TLSv1.2 with cipher DHE-RSA-SEED-SHA (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id AFB0112B057; Thu, 9 Jun 2016 16:43:31 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=cisco.com; i=@cisco.com; l=93783; q=dns/txt; s=iport; t=1465515811; x=1466725411; h=from:to:cc:subject:date:message-id:mime-version; bh=wvxhQi3pSLWW39ditma4kPI7aPFx7C5clEl5NYpKOk0=; b=XaWWm+o5q1Ed2abKWX8sfFslaf43lOaSEMTSTmMOzgjTk8BWg5Dn9Sv0 zFIayBKQEoM0ObiBwHxMLbdE/HxI6tqoXeGFz1G8bs2hTp72ZPgu36GP3 FtgPCK3n0s4OYuZZusrjq6YZx4dXWGaIWCRxzh0BYOmh3MKRGEzHQam2f E=;
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: A0AgBQDs/VlX/5FdJa1TAQmCcE5WfQa7FoF6JIVvgTo5EwEBAQEBAQFlHAuETBoBTAQHBxIBNQsBPycEDgMdiBQOA75DAQEBAQEBAQMBAQEBAQEBAQEZBYYniGUBZoUbBY1qhVKFGQGGAoJ4hSyBaYRSiGWGQIkkASABM4IEAxyBSz8vAYhFKxh/AQEB
X-IronPort-AV: E=Sophos;i="5.26,447,1459814400"; d="scan'208,217";a="111717769"
Received: from rcdn-core-9.cisco.com ([173.37.93.145]) by rcdn-iport-8.cisco.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 09 Jun 2016 23:43:29 +0000
Received: from XCH-ALN-004.cisco.com (xch-aln-004.cisco.com [173.36.7.14]) by rcdn-core-9.cisco.com (8.14.5/8.14.5) with ESMTP id u59NhTe5003334 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=FAIL); Thu, 9 Jun 2016 23:43:29 GMT
Received: from xch-aln-002.cisco.com (173.36.7.12) by XCH-ALN-004.cisco.com (173.36.7.14) with Microsoft SMTP Server (TLS) id 15.0.1104.5; Thu, 9 Jun 2016 18:43:28 -0500
Received: from xch-aln-002.cisco.com ([173.36.7.12]) by XCH-ALN-002.cisco.com ([173.36.7.12]) with mapi id 15.00.1104.009; Thu, 9 Jun 2016 18:43:28 -0500
From: "Alvaro Retana (aretana)" <aretana@cisco.com>
To: "draft-ietf-idr-sla-exchange@ietf.org" <draft-ietf-idr-sla-exchange@ietf.org>
Thread-Topic: AD Review of draft-ietf-idr-sla-exchange-08
Thread-Index: AQHRwqi5NZ7Eq+drE06N6iB8qKo/Lg==
Date: Thu, 09 Jun 2016 23:43:28 +0000
Message-ID: <D343FE1E.121ABD%aretana@cisco.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
user-agent: Microsoft-MacOutlook/14.6.2.160219
x-ms-exchange-messagesentrepresentingtype: 1
x-ms-exchange-transport-fromentityheader: Hosted
x-originating-ip: [10.117.15.4]
Content-Type: multipart/alternative; boundary="_000_D343FE1E121ABDaretanaciscocom_"
MIME-Version: 1.0
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/Qe8yJX3xtyQN3OhD-Zk5eprfxRw>
Cc: "idr-chairs@ietf.org" <idr-chairs@ietf.org>, Susan Hares <shares@ndzh.com>, "idr@ietf.org" <idr@ietf.org>
Subject: [Idr] AD Review of draft-ietf-idr-sla-exchange-08
X-BeenThere: idr@ietf.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Inter-Domain Routing <idr.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/idr>, <mailto:idr-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/idr/>
List-Post: <mailto:idr@ietf.org>
List-Help: <mailto:idr-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/idr>, <mailto:idr-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 09 Jun 2016 23:43:37 -0000

Dear authors:

I just finished reading this document.

I have significant concerns with the content (see my detailed comments below), and with the fact that the we're producing this document on SLAs in a routing WG.  It is "ok" [*] to carry information in BGP that is not related to routing, but in this draft we're really defining how to represent an SLA.  The only BGP-specific work is related to propagating a new transitive attribute — we don't need 30 pages for that!!

Many of my comments are keyed off these statements in the Introduction: "The Inter-domain exchange of QoS SLA exchange policy described in this document…only requires that the provider wishes to send the QoS SLA policy via BGP…to a set of receivers…The SLA negotiation and assurance is outside the scope of this document."   My interpretation is that BGP is then used as a simple transport for the SLA information and that it can't (and shouldn't!) concern itself with the interpretation, validation or enforcement of the information it is transporting.

Given that, I want to first mention 3 overarching themes that I want to see addressed (I also tried to indicate which theme the comments below relate to):

T1. The SLA Definition.  As mentioned above, it concerns me that this WG has taken on the task of defining the description of an SLA.  At a minimum, this document should have been discussed in the TSV and OPS areas, for both sanity checking and coordination.  I found in the archive some exchanges with tsvwg (from 2013, when the –01 version was published) -- in the thread, the tsvwg Chair mentions the need for coordination as they have related work.  However, the communication abruptly ends [1] -- note that the last recommendation from tsvwg was against having multiple drop thresholds in DROP_THRESHOLD (Section 3.3.2.8), but the current version still hast that — it does seem like the conversation was concluded…   Please seek out review from both tsvwg and opsawg (or at least the OPS-Dir).


T2. Who does what?  (aka which functionality belongs in BGP?)

I then see the document as containing two distinct sets of specifications: (1) how does BGP transport some information in a transitive attribute (including how a BGP speaker validates the formatting), and (2) how the user of the SLA information (which is not the BGP speaker itself, but some other consumer) should deal with it.  I would really like to see a clear definition of who (the BGP Speaker or something else) is executing the action — keeping into account that it is out of scope what exactly is done with the information.

I would like to propose a simple framework/terminology for the process, which I think is something like this:
A. the SLA_Provider (which represents whoever/whatever comes up with the SLA definition) gives the formatted information to BGP for it to transport (associated with a prefix, destined to a list of ASNs, etc.);
B. BGP puts the formatted information in the QoS attribute and advertises it in an UPDATE;
C. the BGP receiver verifies that the UPDATE is formatted correctly (including the QoS Attribute);
D. a SLA_Consumer (which represents a local process that will make sense of the content and maybe even do something with it) further verifies that the content makes sense, keeps track of the SLA ID, etc..

Right now, the document doesn't clearly assign roles and it seems like everything is done by the BGP speaker, where conceptually it isn't because BGP is a routing protocol and not an SLA_Producer or SLA_Consumer.  For example (there are more cases below), Section 5.2. (SLA Attribute Handling at Receiver) says that "the receiving BGP peer SHOULD invalidate previous advertised SLA parameters if one exists for the same SLA id and source AS."  Following the conceptual model above, this action would be done by the SLA_Consumer, who may care about the content itself.

If you don't want to use the simple framework above and want to use other names, that is perfectly fine.  I just want the document to clearly reflect what is done by BGP and what is done by someone else.


T3. Clarity and Consistency…  I am very disappointed in the fact that the SLA itself is underspecified, that there isn't consistency on the names used (even within the definition of the fields!) and that some sections seem to be left to the readers' interpretation (guessing!).  Examples include simple things such as the units in which the length is indicated, and using consistent names.  Those may be easy to correct, but a Standards Track document should not be in this state at this point in the process.  There are many more examples below.



I'm returning the document to the WG for further processing.  I expect that proper coordination and review with TSV and OPS, including a new WGLC with the appropriate WGs in cc (tsvwg, for example), takes place before returning to the IESG.  I recommend that other comments be addressed before embarking in reviews beyond idr.

Thanks!

Alvaro.

[*] Given the fact that we already throw all kinds of stuff on top of BGP… :-(
[1] https://mailarchive.ietf.org/arch/msg/tsvwg/IjVdEF7KZyz0czm4YBXEr7KjmkU



Major:

M1. [T1] [T2] Introduction: "The Inter-domain exchange of QoS SLA exchange policy described in this document does not require any specific method for the provider in establishing SLAs.  It only requires that the provider wishes to send the QoS SLA policy via BGP UPDATE [RFC4271] messages from the provider to a set of receivers (BGP peers) who will enact the policy.  In reaction to, a receiving router may translate that to relevant QoS policy definition on the device…  The SLA negotiation and assurance is outside the scope of this document."

M1.1. [T2] If I interpret this text correctly, the intent of this document is to propagate the new attribute (with SLA information in it), but it takes no position as to what the receivers do with the information.  Is that correct?

M1.2 [T2] If so, then the "enact the policy" part above is out of place because there's no assumption as to what will happen with the information.



M2. [T3] Section 3.1. (SLA, QoS attribute sub-type, Definition)    SubType range is not defined correctly: s/0x0f/0xff    What is the  SubType length expressed in, octets?



M3. [T3] Section 3.2. (SLA SubType Value Field)

M3.1 "Source AS…Note that AS = 0, used in message outside of QoS attribute, is illegal in normal BGP operations.  AS = 0 inside the QoS attribute may be used simply as a flag…"
M3.1.1 [minor] Please put a reference to rfc7607.
M3.1.2 Given that this document specifies a new valid use of AS 0, I think it should be marked as Updating rfc7607.
M3.1.3 What is the purpose of this field?  Section 4. (Originating SLA Notification) does say that the "QoS attribute MUST only be added by the originator…", but it seems easy to forge the value here.

M3.2 For the "Destination AS list" field description, it should be clearly specified the contents if the "Destination AS count" is 0.  Also, please be specific and clear (word are free!), for example: "List of ASNs for which the SLA is relevant, each of which is a 32-bit number [rfcREFERENCE].  If the Destination AS count is set to 0, then this field SHOULD NOT (?) be included in this sub-TLV…."

M3.3 "SLA Event": the field in Figure 3 is simply called "Event".  Please be consistent: either call both the same way or clarify which field is being referenced.  I know it may be obvious, but it doesn't make for a crisp specification — other parts of the text simply use "event" as well.  Please be specific and clear as to what this field indicates.

M3.4 SLA Id
M3.4.1 "A 16-bit field containing identifier which is unique in scope of source AS"  What happens if the Source AS is set to 0?
M3.4.2 [minor] "The SLA ID applies aggregate for all the traffic…"   I'm guessing you meant to talk about the "aggregate traffic"…??
M3.4.3 "The SLA identifier is not globally unique but it MUST be unique within the source AS (advertiser)."  What happens if the Source AS field is set to 0?  The text in the Source AS field description says that when the value is 0 then "ignore Source…".  What does this do to the content of the SLA ID.
M3.4.4 [minor]  BTW, please be consistent: SLA ID, SLA id, SLA Id, SLA identifier, etc

M3.5 What is the "SLA Length" field expressed in?

M3.6 SLA-Content per SLA Event
M3.6.1 The name seems to hint at the fact that the content is "per Event", but it is not described anywhere how multiple events would be encoded, or if even possible (see my comment above about the Event field).
M3.6.2 Is the name of this field "SLA-Content per SLA Event" or "SLA-Content"?  Please be consistent.
M3.6.3 "If the SLA-Content field does not exist…then receiver MUST inherit the previously advertised SLA content for the same SLA id from the same Source AS."   The next two paragraphs try to cover for the fact that the order cannot be guaranteed (specially more than one AS hop away).  But this all means that there is a significant possibility that a number of UPDATES will not have any SLA information at all.  Then what is the point?  When would you want to take advantage of this feature?



M4 [T3] Section 3.3. (SLA-Content per Event Field)

M4.1 [nit] s/…and then optionally followed by for the opposite direction/…and then optionally followed by the specification for the opposite direction

M4.2 [nit] The format of the field starts with the "dir" field — is the following text really necessary: "This list of Traffic Classes MUST be specified for one direction first…"  It seems obvious to me.

M4.3 "dir…indicates Direction of the SLA."  I'm guessing direction from the senders point of view, is that correct?  Please be specific.

M4.4 "Traffic Class (Classifier Group) count…value of zero (0x00)…invalidates previous advertised SLA…"  In this case, I'm assuming that all the other fields are not needed, is that correct?  If so, then they are optional and should be described that way.

M4.5 "Class Descr Len…length of the Traffic Class Description field"   Length in what units?

M4.6 [minor] "Traffic Class Description"  A reference for UTF-8 (rfc3629 I believe) would be nice.

M4.7 "Elements (Classifier) Count…value zero (0x00) means…traffic class is to classify rest of the traffic not captured otherwise by other traffic classes…RECOMMENDED that Traffic Class that has 0 elements is present last…If an advertised message has it positioned somewhere else, then the receiver MUST re-order it…"   This text (as is) implies that the 0x00 classifier SHOULD be present last — is that what you meant, for it to always be present?  If so, why not use MUST and eliminate the reordering logic?  Or was the intent that if present then it should go last?

M4.8 Classifier Element TLVs:
M4.8.1 The description talks about "Classifier Element TLVs", but Figure 4 calls them "Traffic Class Elements (TLVs)".  Please be consistent.
M4.8.2 "Size of Value field…length of the value field."   Length in octets, bytes, bits, ???
M4.8.3 [T2] "Value…It is an error if the IPFix field does not contain the appropriate format"  This statement gets into interpretation of the contents of the attribute.  If the purpose of this document is to just transport information, then why would it be a function of BGP to figure out if the contents are correct?  Please be clear about whether the BGP Speaker or SLA_Consumer should perform a function.
M4.8.4 "If a receiver receives a BGP UPDATE message with QoS/SLA attribute for an NLRI that it does not support then the receiver MUST NOT install that advertised SLA and continue to forward this attribute as an optional transitive attribute."  What does "an NLRI that it does not support" mean?  Are you talking about an NLRI from a specific unsupported AFI/SAFI, or ??

M4.9 Traffic Class Service TLVs:
M4.9.1 The units for the length are (again!) not specified.  I realize that it may be in bits (given the next couple of paragraphs), but the reader shouldn't have to guess.
M4.9.2 As long as you're thinking about the length, please check all the length descriptions — I think most of them are missing the units, but this is getting tedious…
M4.9.3 Please note that the length of the value fields is not described consistently.  For example, it says that TRAFFIC_CLASS_TSPEC is 96 bits, but 3.3.2.3 talks about octets for MINRATE_IN_PROFILE_MARKING.
M4.9.4 "Value…It is an error if this field does not contain the appropriate format (see BGP error handling section…."   Not only is this error appropriate for the SLA_Consumer, but the "appropriate format" is not clearly identified in the description of all the TLVs.



M5 [T1] Section 3.3.1. (Supported IPFIX values for Classifier Elements)

M5.1 [minor] The correct link to the IPFIX Registry is https://www.ietf.org/assignments/ipfix/ipfix.xml#ipfix-information-elements

M5.2 "Only the IPFIX attributes listed in Table 1 are supported by BGP SLA exchange."  Why?  If this document is just about carrying a new attribute that is opaque to BGP itself and that may or may not result in an action at the receiver (see above), then why not support any IPFIX element?  Maybe not all of the elements make sense, but some surely seem to — ipClassOfService, for example.

M5.3 "Any new attribute to be supported by SLA QOS MUST be added by a Standards Action."  To be added where?  The IPFIX registry has a different registration procedure (Expert Review) -- are you referring to adding elements there?  Or adding elements (already in the IPFIX registry) to Table 1?  Or ??



M6 [T1] [T3] Section 3.3.2.1. (Traffic Class TSPEC TLV) The value of this TLV "consists of the (r), (b), (p) parameters as described in Invocation Information section of [RFC2212]…".  However, the definition in the document is not the same as the one in rfc2212.  For example, this document defined r as "measured in octets of Layer 2 (L2) datagrams per second", while rfc2212 says that it is "measured in bytes of IP datagrams per second".  If the intent is to (as stated in the document) use the definitions from rfc2212, then don't redefine the terms.



M7 [T1] [T2] [T3] Section 3.3.2.2. (Traffic Class L2 Overhead):

M7.1 [T3] Section 3.3.2.2. (Traffic Class L2 Overhead)  How is the L2 overhead defined?

M7.2 [T1] [T2] "When the receiver chooses to react to an advertised SLA and if the L2 Overhead service type is present in advertised SLA, the receiver MUST use the explicit advertised L2 overhead rather than the local L2 overhead."  In the context of this document, where the use of the SLA is out of  scope, what does "react to an SLA" mean?   IOW, we can't mandate what the receiver does, or how it does it if outside the scope.

M7.3 [T1] "If SLA is required to consider only IP packet size, the sender MAY advertise this service with a value of 0."  Again, the reason is out of scope.  The ability to advertise a value of 0 is not.



M8. [T1] Section 3.3.2.7. (Precedence between MINRATE and MAXRATE)  Back to the data being opaque..why is this section included in the document?



M9. [T3] Section 3.3.2.9. (Traffic Class for Relative Priority)  "Length = 4 bits"  Unlike all the other length issues I've pointed at, this time you seem to be indicating the length of the Value field.  Does this mean that the Length field in they TLV should be set to 4 (as in "the length of the Value field in bits"), or that the Length field should not even be present because it is always 4 bits, or something else??



M10 Section 4.1.2. (SLA Advertisement for Destination AS Multiple Hops Away) "If a new prefix is added in an AS for which SLA parameters have already been advertised before for other existing prefixes, and if traffic to this new prefix is subject to the same SLA advertised earlier, then the BGP UPDATE for this new prefix may include QoS attribute containing just an SLA id…"

M10.1 Is this behavior also valid for directly connected peers?

M10.2 The description above assumes that the receiver has the SLA description from a previous UPDATE.  See my comments above about Section 3.2.



M11. [T2] Section 5.1. (BGP Node Capable of Processing QoS Attribute)

M11.1 Back to the opaqueness of the attribute from the BGP point of view and how the use of the information is out of scope.  What does "processing" mean in this context?  I think you will need to separate the processing done by the BGP Speaker from what the SLA_Consumer should do.

M11.2 "If UPDATE has a QoS Attribute with a list of destination ASes, it MAY trim the list and adjust the count of the destination AS to exclude ones that are not required in further announcement of BGP UPDATE messages."  Just to make sure I'm looking at this from the right place: this text applies to a BGP speaker that received the QoS attribute from someone else and is now getting ready to propagate it, right?  If so, isn't changing the list of destination ASes basically an attack on the intent of the origin?

M11.3 "BGP peer MUST drop SLA related sub-type from the QoS attribute, if there are no more ASes from the QoS Attribute's destination list in the forwarding path.  The rest of the QoS attribute contents MAY be Forwarded…"
M11.3.1 What is the "forwarding path"?  I'm guessing that means something like: if the AS is not connected to any other AS on the list (except the one it just received the UPDATE from)… What is it and how can the ingress BGP speaker in the AS figure it out?
M11.3.2 [minor] Please be consistent with the naming.  I'm assuming that "SLA related sub-type" is really the SLA sub-type value 0x01…  Even though they are not defined yet, other sub-types may also refer to SLAs.
M11.3.3 The "MAY" above is normative in a "forward" manner: affects potential future SubTypes.  If new SubTypes are defined later, let them define the behavior.  s/MAY/may
M11.3.4 [minor] Note that this whole paragraph is repetitive, please simplify it.



M12. [T1] [T2] [T3] Section 5.2. (SLA Attribute Handling at Receiver)

M12.1 "Reception of and processing of advertised QoS SLA content are optional for the receiver."  What do you mean that "reception" is optional?  What is "processing"?  Given that the use of the SLA is outside the scope, I can only guess that "processing" means at least parsing the contents (not actually doing anything with them).  If so, what do you mean by saying that "processing" is optional?   In general, if an attribute is sent inside an UPDATE it then has to be "received" (unless all the NLRI contents are filtered by policy) and "processed".  This part sounds like it belongs with the SLA_Consumer.

M12.2 "the receiving BGP peer SHOULD invalidate previous advertised SLA parameters if one exists for the same SLA id and source AS."  I think you really want the SLA_Consumer (not the BGP peer).  What does "invalidate" mean in this context?  Maybe replace or an indication what the new version, etc.

M12.3 "If the new advertised SLA has a non-zero traffic count…"  "traffic count" is not defined anywhere in the document…  I guess you mean the "Traffic Class count" -- please be clear and specific.

M12.4 "…then the new advertised SLA SHOULD be installed"   Installed?  This sounds like you're specifying what to do with the SLA specification…outside the scope!   Even for the SLA_Consumer…

M12.5 "When BGP UPDATE messages are triggered only as a result of SLA policy change…the receiver MAY implement a policy to filter such messages based on the prefix and attribute."  I appreciate the intent, but what about changes in other attributes?  IOW, the receiver doesn't know that the change was only due to an SLA policy change (which from the BGP point of view may be better called something like "an update in the QoS attribute").  Please clarify that if the filter is put in place, all the other attributes need to be checked as well.  BTW, I think that the "MAY" should be "may", but that is minor.

M12.6 "…the receiver MAY implement advertised SLA for the whole link…"  This is an indication of what to do with the SLA…out of scope!

M12.7 "…QoS attribute with the SLA…"  To be complete, I think you mean "with the SLA SubType field".

M12.8 "…then receiver may establish advertised SLA for that specific prefix list under the relevant link."  Even thought not in a normative way, this text also talks about the use of the SLA…out of scope!

M12.9 "It is completely up to the receiver to decide for which prefixes it should accept advertised SLA and for which ones it will not accept."  Which receiver?  I'm guessing the SLA_Consumer, because the BGP speaker has no choice but to accept any attribute…  Are you implying here that the BGP speaker should be able to selectively filter QoS Attributes?



M13. Section 6. (Error Handling)

M13.1 [T2] What is an "error condition"?  I counted 4 places where an error was explicitly identified — surely there are more cases that would result in an "error condition".  I'm assuming that a malformed attribute (as in the length not matching) is an obvious one (no need to explicitly call that out, but maybe a reference to the general attribute processing in rfc4271 would be good)…but what about if the Source AS is set to 0 and the Destination AS count is not 0, is that an error?

M13.2 [T2] OTOH, errors on general attribute format checking (do the lengths/counts match, etc.) is where attribute-discard applies for the BGP Speaker…other errors (including some of the ones explicitly identified in the text) may be more in the domain of the SLA_Consumer.  Note that even though the SLA_Consumer may decide to not use the SLA (which is equivalent to attribute discard), I think some of the checking (formatting of the values, for example) doesn't belong to the BGP Speaker.

M13.3 [nit] s/attribute-discard/attribute discard



M14. Section 7. (Traffic Class Mapping)  What is done with the SLA is out of scope!  The mapping, or not, of policies is also out of scope — specially the normative language included in the section.  Recommendation: delete this section completely.



M15. Section 11. (Security Considerations)

M15.1 "The QOS attribute defined in this document SHOULD be used by the managed networks for enforcing Quality of Service policies…"  The use of SLA information is out of scope!

M15.2 "…so there should not be any risks for identity thefts."  What does that mean?  Identity theft?  Are you talking about the possibility of hijacking?  (I'm just guessing because the next sentence talks about Origin Validation.)

M15.3 "To strengthen the security for the QoS attribute, RPKI based origin validation [RFC7115] MAY be used."  How does Origin Validation strengthen the security of the QoS attribute?  Origin Validation can help the receiver verify that the origin AS is in fact authorized to advertise an NLRI, but not that the valid prefix origin did in fact also originate the QoS attribute.  Section 4. (Originating SLA Notification) does say that  the "QoS attribute…MUST NOT be added during BGP propagation", but there doesn't seem to be a way to detect non-compliance making it an important security consideration.

M15.4 "…BGP Path Validation (e.g., [I-D.ietf-sidr-bgpsec-protocol]) procedures could be used over BGP QoS attribute and its associated prefix in producing the digital signature that can be carried within the signature SLA for the messages."  What is the "signature SLA"?  This text is proposing to change BGPSec — as is, this proposal presents no assurances over the *current* risk of a MITM (until maybe someday the proposal is specified and implemented).  I would rather you clearly identify why there is a risk of MITM — and then say that there is no solution (which is the same as for other transitive attributes).

M15.5 Speaking of MITM, what about privacy?  Even if the MITM doesn't change the QoS attribute it can still glean a lot of useful information from just forwarding the UPDATES.  Please include some text about privacy.  Please take a look at RFC6973.



M16. [T3] Section 10. (IANA Considerations)

M16.1 There is no guidance about the QoS Attr flags.  I know they are not used in this document, but because it is the document that defines them it would be good to have a registration criteria.

M16.2 [minor] I know the space is small, but the SubTypes and SLA Event Types have a registration procedure of Standards Action.  I wonder if having a value (maybe just one) for Private Use could be beneficial to allow people to put their own stuff in there…

M16.3 [minor] This sentence seems out of place since it provides no guidance to IANA: "QoS SLA Traffic Class Element Types will be referring to existing IPFIX IANA types as listed in Table 1."




Minor:

m1. [T1] The Introduction talks about the fact that other "protocols and data models are being created to standardize setting routing configuration parameters within networks.  YANG data models…the I2RS protocol", but that "some providers desire to exchange QoS parameters in-band utilizing BGP peering relationships."  Two related questions:

m1.1 Is there specific YANG/i2rs work in process that matches what is defined here?  How are the models similar/different?  Note that this is not a question about comparing the use of YANG models, the i2rs protocol and the mechanism using BGP  -- I just want to know if there are other similar efforts in progress and how they are in sync or not.  The underlying question is really about whether an operator will be able to use a NETCONF/i2rs to configure the SLA information received in a BGP session (as specified in this document).  I know that anything is possible, but is that the direction that is being taken today?

m1.2 Are there implementations?  It would be nice to add a section ala rfc6982 and point to the idr wiki from there.



m2. [T3] The new attribute is called by several different names in the document: "QoS Attribute", "QoS/SLA Attribute", "QoS BGP attribute", and simply "attribute" in a couple of cases.  Please be consistent!!!



m3.  Section 3.3.2.3. (Traffic Class for MINRATE_IN_PROFILE_MARKING)  I'm not sure how to interpret this:  "Marking code-point type = 8 bits (1 octet) IPFIX Element Identifier…The marking code-point type of 0x00 is a drop identifier; the length in this case is zero."   IPFIX has Element ID 0 as reserved.  I'm guessing that you mean that if the code-point is 0x00, to better yet, if the length is 0, then drop.   ???

m3.1 Same comment for 3.3.2.4, 3.3.2.5 and 3.3.2.6



m4. For all the sub-sections in 3.3.2, please put in a short explanation of what each TLV represents.



m5. Section 8. (Deployment Considerations)  I think this section should really be called "Use Cases" and moved earlier in the document, maybe even as part of the Introduction.  It would be very nice if the case for propagating the QoS Attribute through multiple AS hops was also illustrated.



m6. References

m6.1 RFC3270 should be Informative

m6.2 If RFC7115 is referenced as Normative, then so should I-D.ietf-sidr-bgpsec-protocol because you're offering them as solutions to security issues.  Please take a look at my comments on the Security Considerations Section above.

m6.3 For the illustration in the document, I don't think the reference to RFC7674 is needed.




Nits:

n1. s/…and then optionally followed by for the opposite direction/…and then optionally followed by the specification for the opposite direction

n2. Please expand on first use: TSpec

n3. In 3.3.2.3, please be explicit that the Value contains the Marking code-point type and value.

n4. In 4.1, "next hop to CE (that is the next hop address from CE to PE)"…shouldn't that be "next hop to PE"?