[secdir] Secdir last call review of draft-ietf-ccamp-rsvp-te-bandwidth-availability-13

Sandra Murphy <sandy@tislabs.com> Sat, 02 February 2019 04:02 UTC

From: Sandra Murphy <sandy@tislabs.com>
To: secdir@ietf.org
Cc: ccamp@ietf.org, ietf@ietf.org, draft-ietf-ccamp-rsvp-te-bandwidth-availability.all@ietf.org
Message-ID: <154908012600.28521.5714645741731093529@ietfa.amsl.com>
Date: Fri, 01 Feb 2019 20:02:06 -0800
Archived-At: <https://mailarchive.ietf.org/arch/msg/secdir/YUE5KwLMVd62rfJY8PXH0cy1_IU>

Reviewer: Sandra Murphy
Review result: Has Issues

I have reviewed this document draft-ietf-ccamp-rsvp-te-bandwidth-availability
as part of the security directorate’s ongoing effort to review all IETF
documents being processed by the IESG. These comments were written primarily
for the benefit of the security area directors.  Document editors and WG chairs
should treat these comments just like any other last call comments.

This draft provides a new TLV for the Ethernet SENDER_TSPEC object that will
carry availability requirements for RSVP-TE signaling of GMPLS LSPs.

The draft is thorough.  I do have some comments.  I reviewed the normative
references RFCs 2205, 3209, 3473, 6003, as well as RFC3945 and RFC5920.  I
can’t claim that I understood everything in that stack, so the following could
easily be wrong.

Computing the LSP path:

Page 4, section 2 discusses obtaining network topology, calculating the LSP
route, and RFC8330’s extensions for availability in OSPF TE LSA messages.  Does
this draft assume that this extension will always be used with an
EXPLICIT_ROUTE object?  Is this draft not applicable without that explicit LSP
route calculation?

Availability TLV vs CLASSTYPE objects:

The definition in RFC6003 of the Bandwidth Profile TLV has certain constraints
on the values of the Index:
                         The Index field value MUST correspond to at least one
      of the Class-Type values included either in the CLASSTYPE object
      [RFC4124] or in the EXTENDED_CLASSTYPE object [MCOS].

I am not certain if this means that the presence of an Ethernet SENDER_TSPEC
Object with a Bandwidth Profile TLV means there must be a CLASSTYPE object in
the RSVP-TE message as well, or that the Index field values are taken from the
set of defined Class-Type values.

But if the first, does this induce requirements by inclusion of the
Availability TLV that these other CLASSTYPE objects must be included as well? 
Or are you intending to update RFC6003 to eliminate that constraint?  If the
second, does the RFC6003 constraint also constrain the index values used in the
Availability TLV?  Should that be mentioned?  (Or am I just confused?)

Bandwidth TLV to Availability TLV association:

Page 4, Section 3.1 says

      When the Availability TLV is included, it MUST be present along
      with the Ethernet Bandwidth Profile TLV. If the bandwidth
      requirements in the multiple Ethernet Bandwidth Profile TLVs have
      different Availability requirements, multiple Availability TLVs
      SHOULD be carried. In such a case, the Availability TLV has a one
      to one correspondence with the Ethernet Bandwidth Profile TLV by
      having the same value of Index field. If all the bandwidth
      requirements in the Ethernet Bandwidth Profile have the same
      Availability requirement, one Availability TLV SHOULD be carried.
      In this case, the Index field is set to 0.

I find that the description is not clear in all cases.  I found a message in
the working group discussion saying that the association is either “n:n” or
“n:1”.  I think this description sounds more like either n 1:1 associations or
one n:1 association.  Is that what is intended?  Can the associations be mixed
in the same message?  Suppose there were 3 Bandwidth TLVs that needed the same
availability and one that had a different availability need.  Could there be 3
Bandwidth TLVs with index 0 and one Availability TLV with index 0, and also a
Bandwidth TLV / Availability TLV pair with matching index values?
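
To make the question concrete, here is a minimal sketch of one possible reading
of the Section 3.1 text.  The function name and the representation of TLVs as
(index, availability) tuples are hypothetical, and the handling of the mixed
case is an assumption, not something the draft states.

```python
from typing import List, Optional, Tuple

def associate(bw_indexes: List[int],
              avail_tlvs: List[Tuple[int, float]]
              ) -> List[Tuple[int, Optional[float]]]:
    """Pair each Bandwidth Profile TLV index with an availability value."""
    # n:1 reading: exactly one Availability TLV, carrying Index 0,
    # applies to every Bandwidth Profile TLV in the message.
    if len(avail_tlvs) == 1 and avail_tlvs[0][0] == 0:
        shared = avail_tlvs[0][1]
        return [(idx, shared) for idx in bw_indexes]
    # 1:1 reading: pair TLVs that carry the same Index value.
    # A Bandwidth Profile TLV with no partner maps to None.
    by_index = dict(avail_tlvs)
    return [(idx, by_index.get(idx)) for idx in bw_indexes]

# n:1 case: one Availability TLV with Index 0 covers three bandwidth TLVs.
shared_case = associate([1, 2, 3], [(0, 0.9999)])

# Mixed case from the question above: three bandwidth TLVs with Index 0,
# plus one bandwidth/availability pair with a matching non-zero Index.
mixed_case = associate([0, 0, 0, 5], [(0, 0.9999), (5, 0.99999)])
```

Under this reading the mixed case happens to resolve by plain index matching,
but nothing in the quoted text says whether a receiver is expected to accept it.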

error checking:

Other documents in the references (RFC2205, 3209, 3473, 6003, etc) have made a
point of explicitly describing the error handling - when PathErr and ResvErr
and Notify messages are sent, to whom, the error codes, the error value
sub-codes, etc.  I don’t see that here for the
bandwidth-tlv-to-availability-tlv associations.

Is a mix of index-zero and index-non-zero bandwidth-tlv-to-availability-tlv
associations (like above) an error?  Is the message dropped?  Is an error sent?
If the message is not dropped, are any of the bandwidth-tlv to availability-tlv
associations retained?

If there are availability-tlvs with non-zero indexes with no matching index
value among the bandwidth-tlvs, that surely is an error?  Is the message
dropped?  Or is the availability tlv dropped?  Is a PathErr/ResvErr message
sent?

Suppose all availability-tlvs have a matching (zero or non-zero) index value
among the bandwidth-tlvs, but there are extra bandwidth-tlvs (no
availability-tlv with a matching index value).  Is that an error?  Are the
extra bandwidth-tlvs dropped? ignored? propagated?

(RFC3209 has several cases where there might be extra objects or sub-objects
and the language is “can be/MAY be/SHOULD be/are ignored and SHOULD NOT be /are
not/need not be propagated” )
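
The three cases above can be detected mechanically, as in the rough sketch
below.  It only classifies a message; what a node should then do (drop, ignore,
propagate, send PathErr) is exactly what the draft leaves open.  The function
name and result keys are hypothetical, and treating a lone index-0 Availability
TLV as covering every Bandwidth TLV is an assumption.

```python
from typing import Dict, List, Union

def classify(bw_indexes: List[int],
             avail_indexes: List[int]) -> Dict[str, Union[bool, List[int]]]:
    """Flag the questionable index combinations discussed in the review."""
    bw, av = set(bw_indexes), set(avail_indexes)
    # Assumption: a single Availability TLV with Index 0 covers everything.
    shared_only = av == {0}
    return {
        # Index-zero and index-non-zero Availability TLVs in one message.
        "mixed_zero_and_nonzero": 0 in av and len(av) > 1,
        # Non-zero Availability indexes with no matching Bandwidth TLV.
        "orphan_availability": sorted(i for i in av if i != 0 and i not in bw),
        # Bandwidth TLVs with no Availability TLV of the same index.
        "uncovered_bandwidth": [] if shared_only else sorted(bw - av),
    }

# Availability index 7 has no partner; bandwidth indexes 2 and 3 are extra.
report = classify([1, 2, 3], [1, 7])
```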

multiplicity:

RFC3209 says it does not apply to multicast, but it does talk about multiple
parallel LSP tunnels between two nodes, and about multipoint-to-point LSPs for
WF and SE style reservations when there are multiple senders, and about the
merging rules of WF reservations.  Does availability work with those
reservation styles?

availability vs “variable discrete bandwidth”:

I believe I understood the discussion of the need to signal availability
requirements in order for the system to determine when an LSP was feasible.  I
can dimly understand that there might be links that have “variable discrete
bandwidth”.  Section 2 says “The Availability TLV can be applicable to any kind
of physical links with variable discrete bandwidth, such as microwave or DSL.”
Why not other link types? Do only “variable discrete bandwidth” links support
availability?

calculating availability:

On page 9, Appendix A:

Perhaps I don’t understand how the availability metric is used.  In the
following:

   On a sunny day, the modulation level 3 can be used to achieve 400
   Mbps link bandwidth.

   A light rain with X mm/h rate triggers the system to change the
   modulation level from level 3 to level 2, with bandwidth changing
   from 400 Mbps to 200 Mbps. The probability of X mm/h rain in the
   local area is 52 minutes in a year. Then the dropped 200 Mbps
   bandwidth has 99.99% availability.

I would say that the 400Mbps bandwidth is available whenever it is not raining.
It lightly rains 52 min a year, which means it is not raining 99.99% of the
time, so the 400Mbps availability is 99.99%.  The 200Mbps is available during
that 52 min, so 99.99% is not the 200Mbps availability.  Right?

The analogous comment applies to the next two paragraphs.

Does that explain why the table shows the 100Mbps bandwidth having two
different availabilities?
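
For reference, the arithmetic behind the 99.99% figure can be checked with a
few lines.  This is only a sketch of the calculation as read in this review (a
365-day year is assumed), not of how the draft or the ITU documents define
availability.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525600, ignoring leap years

def fraction_of_year(minutes: float) -> float:
    """Share of a year taken up by the given number of minutes."""
    return minutes / MINUTES_PER_YEAR

# Appendix A: rain at X mm/h is expected for 52 minutes in a year.
rain = fraction_of_year(52)
no_rain = 1 - rain

print(f"raining:     {rain:.4%}")
print(f"not raining: {no_rain:.4%}")
```

So 52 minutes a year is roughly 0.01% of the time, and the remaining time is
roughly 99.99%.  The open question is which bandwidth that 99.99% belongs to.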

security:

The draft’s (*) security considerations section points to RSVP-TE, but without
an RFC reference, and to RFC5920.  Because this is a GMPLS related feature, it should
refer to the GMPLS extensions to RSVP-TE in RFC3473.  As an extension to
RFC6003, it could refer to that RFC’s security considerations section, but that
only gets the reader to RFC3473, RFC3209, and RFC5920.

The security considerations for RSVP-TE itself (RFC3209) point to RFC2205.
RFC2205 defines an Integrity object (defined in RFC2747) that carries a keyed
cryptographic digest based on a shared key, providing hop-by-hop protection
between two RSVP nodes.  However, PATH messages are directed toward the traffic
destination address, not the next RSVP node.  There could be clouds of non-RSVP
nodes between two RSVP nodes that the PATH encounters.  This makes it difficult
to share a key between individual pairs of RSVP nodes, and could motivate
operators to configure the same key in large numbers of RSVP nodes.
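
The consequence of a widely shared key can be illustrated with a small sketch.
It is not the RFC2747 wire format: the message bytes, the key, and the node
names are hypothetical, and SHA-256 is used only for illustration rather than
as a statement of what RFC2747 specifies.

```python
import hashlib
import hmac

def keyed_digest(key: bytes, message: bytes) -> bytes:
    """Keyed digest over a message (hash choice is illustrative only)."""
    return hmac.new(key, message, hashlib.sha256).digest()

def accepts(key: bytes, message: bytes, digest: bytes) -> bool:
    """Receiver-side check: recompute the digest, compare in constant time."""
    return hmac.compare_digest(keyed_digest(key, message), digest)

# Hypothetical deployment: nodes A, B and C are all configured with one key.
group_key = b"same-key-on-many-nodes"

# Node C builds a message that claims to come from A and signs it with the
# group key.  Node B verifies with the same key and has no way to tell
# that A did not send it.
forged = b"PATH from=A availability=0.99999"
b_accepts = accepts(group_key, forged, keyed_digest(group_key, forged))
```

With per-pair keys the same check would fail for C's message, which is why the
granularity of key configuration matters.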

RFC3473 points to RFC2747’s protection of RSVP messages.  It also notes that it
introduces a Notify message that is not sent to the traffic destination address
but instead to a node that requested notification.  One transmission option is
that the NOTIFY is encapsulated in an IP packet and forwarded directly to the
requesting node.  That complicates the Integrity object protection, unless the
shared key is widely shared.

RFC3945 notes that authentication in GMPLS systems may use the authentication
mechanisms of the component protocols, pointing to RFC2747 (as well as others
for LDP, LMP, etc that don’t apply here).

RFC5920 discusses threats, attacks, and protections for MPLS/GMPLS data and
control planes.  Section 7.1.2 in particular talks about “Control-Plane
Protection with RSVP-TE”, and could be mentioned here explicitly.  It talks
about network border configuration to limit external attacks, and mentions
RFC2747 authentication protections, making some of the same points about
non-RSVP clouds and shared keys configuration.  It also points to RFC4230,
which is a very detailed look at RSVP security, and probably deserves to be
mentioned here.

So all told, at the end of all the reference chains, the only defined
authentication and integrity protection in 2205, 3209, and 3473 is based on
shared keys that are very difficult to configure with fine granularity.

However, as was said in reply to a different MPLS related draft review
yesterday:

    The MPLS network is often considered to be a closed network such that
    insertion, modification, or inspection of packets by an outside party is
    not possible.

So maybe that is accepted as sufficient in deployment.

MPLS documents are also typically granted an exception from more rigorous
security requirements because MPLS is used only within one routing domain / ISP
/ provider / etc, under a single administrative control, so errors made would
not be global in impact.  In particular, errors that might result from one
legitimate but faulty/mis-configured/subverted/malicious MPLS node should not
propagate out to the general Internet.  (**)

Nits:

floating numbers

Page 5, Section 3.1, says “a 32-bit floating number”.  I believe you mean a
floating-point number.  I checked other IETF RFCs (e.g., RFC8330), and it is
common to mention the IEEE 754-2008 standard when including a floating point
value in the spec.

But is a floating point value needed?  The draft says that the values are
typically in a small set of known values.  The intro sounds like a small set of
classes are used for “efficient planning”.  Just curious.  OTOH, RFC8330 uses
floating point, and the ITU documents’ calculation of availability makes it seem
like full floating point is needed.

the Availability TLV format:

page 5, section 3.1 says:

                                               The Availability TLV has
   the following format:

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |    Index      |                 Reserved                      |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                          Availability                         |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                          Figure 1: Availability TLV

I presume that this is just the Value portion of the TLV format that is defined
for the Ethernet SENDER_TSPEC Object in Section 4 of RFC6003.
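
If that presumption is right, the 8 octets in Figure 1 could be encoded as in
the sketch below.  Several things here are assumptions rather than text from
the draft: network byte order, a Reserved field of zero, and an IEEE 754
single-precision encoding for Availability.  The function names are
hypothetical, and the Type and Length fields of the enclosing TLV are
deliberately left out.

```python
import struct

def pack_availability_value(index: int, availability: float) -> bytes:
    """Encode the Value portion shown in Figure 1 (8 octets)."""
    if not 0 <= index <= 0xFF:
        raise ValueError("Index must fit in 8 bits")
    # Index occupies the top 8 bits of the first word; Reserved (24 bits) is 0.
    first_word = index << 24
    # "!" = network byte order, "I" = 32-bit unsigned, "f" = 32-bit float.
    return struct.pack("!If", first_word, availability)

def unpack_availability_value(data: bytes):
    """Decode 8 octets back into (index, availability)."""
    first_word, availability = struct.unpack("!If", data)
    return first_word >> 24, availability

encoded = pack_availability_value(3, 0.5)
decoded = unpack_availability_value(encoded)
```

Values such as 0.9999 are not exactly representable in a 32-bit float, so a
decoder comparing against a small set of known classes would need a tolerance
or a rounding rule.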

Page 1, Abstract:

   typically used for describing these links when during network
   planning

“when during” - is that deliberate?  It sounds redundant, maybe due to editing.
Or maybe it was supposed to be “when doing”?

   signaling. This extension can be used to set up a Generalized Multi-
   Protocol Label Switching (GMPLS) Label Switched Path (LSP) using the
   Ethernet SENDER_TSPEC object.

not sure - what is using the SENDER_TSPEC - the LSP or this extension?

Page 3, Section 1:

   bandwidth availability. For example, the bandwidth with 99.999%
   availability of a link is 100 Mbps; the bandwidth with 99.99%
   availability is 200 Mbps.

maybe:

   bandwidth availability. Suppose, for example, the bandwidth with 99.999%

Page 5, section 3.2:

   TLVs and one or more Availability TLVs. Each Ethernet Bandwidth
   Profile TLV corresponds to an availability parameter in the
   Availability TLV.

… “in an Availability TLV”? or “in the associated Availability TLV”? There’s
more than one.

Page 6, section 3.2

        link), it SHOULD reserve the bandwidth resource from each

“it” -> “the node”

       this LSP. Optionally, the higher availability bandwidth can be

“the higher” -> “a higher”  (there’s more than one, right?)

        request cannot be satisfied, it SHOULD generate PathErr message

“it” -> “the node”

   generate PathErr message with the error code "Extended Class-Type

“PathErr” -> “a PathErr” or “PathErr messages”

postscripts:

(**) [[[ I will note that RFC3209 includes an AS number subobject among the
subobjects of the EXPLICIT_ROUTE object, with the idea that you might set up
explicit routes that go through multiple ASNs.  Ouch.  I know there are
providers who have different ASNs under single administrative control, from
acquisitions or business use cases, but this just makes it possible for an
explicit route for an LSP to be misconfigured to include your (external)
neighbor ASN.  And RFC5920 talks about “ASBR-ASBR communication for inter-AS
LSPs”.  Better have good outbound filters on your border routers. ]]]

(*)As is typical for specifications that extend other published RFCs, this
draft says it “does not introduce any new security considerations”.

<begin soapbox> In general, I am skeptical of extension drafts that make such
claims.  Surely the existing security considerations should be examined to see
how they apply to this new feature or object being introduced?  Do current
protections adequately protect the new feature/object?  Does the new
feature/object carry new information, produce new behaviors?  etc. But this is
so very common I could hardly request that more be said here. Just saying. <end
soapbox>
