Re: [Gen-art] Gen-art telechat review of draft-ietf-mpls-tp-psc-itu-03.txt

Hi Elwyn,

> > I understand from the document editor that there is a revision in waiting to be
> > posted that clears up some remaining nits from your review.
> I haven't seen this as yet, nor the psc-updates new version.

PSC-updates just posted (got stuck in a tools snafu).
Update to this I-D is pending until after IESG evaluation completes.

> I was trying to capture essentially two points:
> - As it stands this doc both introduces capabilities/modes and provides
> a definition of one new mode as well as redefining RFC 6378 as providing
> a basic mode.  In its introductory function I felt it needed to be
> explicit that if you in future wanted a new combination of capabilities
> maybe using some additional extra capabilities then you need to specify
> some extra mode in another doc that tells you the combination of bits in
> the TLV for the new mode.  OK, there might never be any other modes but
> I was told that wasn't the original plan, but I don't understand why it
> isn't helpful to be explicit here.

OK. I was distracted by the words in your proposed text :-)
> > >    Only combinations of capabilities specified by modes will be
> > >    supported by implementations.
I read this to mean that implementations *of*this*specification* would reject
combinations of capabilities that were not those specified in the two modes
described here.
My reaction was to say: but the text describes this in great detail.

Now I see what you are saying with respect to capabilities, so we could update
the last three paragraphs of the Introduction:
OLD
   This document introduces capabilities and modes.  A capability is an
   individual behavior.  The capabilities of a node are advertised using
   the method given in this document.  A mode is a particular
   combination of capabilities.  Two modes are defined in this document:
   PSC mode and Automatic Protection Switching (APS) mode.

   This document describes the behavior, the priority logic, and the
   state machine of the PSC protocol when all the capabilities
   associated with the APS mode are enabled.  The PSC protocol behavior
   for the PSC mode is as defined in [RFC6378].

   This document updates [RFC6378] by adding a capability advertisement
   mechanism.  It is recommended that existing implementations of the
   PSC protocol be updated to support this capability.  Backward
   compatibility with existing implementations is described in
   Section 9.2.1.
NEW
   This document introduces capabilities and modes.  A capability is an
   individual behavior.  The capabilities of a node are advertised using
   the method given in this document.  A mode is a particular
   combination of capabilities.  Two modes are defined in this document:
   PSC mode and Automatic Protection Switching (APS) mode.  Other modes
   may be defined as new combinations of the capabilities defined in
   this document or through the definition of additional capabilities. 
   In either case, the specification defining a new mode will be 
   responsible for documenting the behavior, the priority logic, and the
   state machine of the PSC protocol when the set of capabilities in the
   new mode are enabled.

   This document describes the behavior, the priority logic, and the
   state machine of the PSC protocol when all the capabilities
   associated with the APS mode are enabled.  The PSC protocol behavior
   for the PSC mode is as defined in [RFC6378].

   This document updates [RFC6378] by adding a capability advertisement
   mechanism.  It is recommended that existing implementations of the
   PSC protocol be updated to support this mechanism.  Backward
   compatibility with existing implementations that do not support this
   mechanism is described in Section 9.2.1.

   Implementations are expected to be configured to support a specific
   set of capabilities (a mode) and to reject messages that indicate the
   use of a different set of capabilities (a different mode). Thus, the
   capabilities advertisement is not a negotiation, but a verification
   that peers are using the same mode.
END
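
To illustrate the last NEW paragraph, here is a minimal sketch of "verification,
not negotiation". It is not text for the draft. The names (verify_peer_mode,
Verdict) and the flag values are assumptions made for illustration; the real bit
assignments are whatever the Capabilities TLV definition in the I-D says.

```python
from enum import Enum

# Hypothetical flag values, for illustration only. The actual bit
# assignments are defined by the Capabilities TLV in the I-D.
PSC_MODE_FLAGS = 0x00000000  # assumed: no capability bits set
APS_MODE_FLAGS = 0xF8000000  # assumed: five capability bits, all set


class Verdict(Enum):
    MATCH = "match"
    MISMATCH = "mismatch"


def verify_peer_mode(configured_flags: int, received_flags: int) -> Verdict:
    """Compare the peer's advertised capabilities with the local mode.

    There is no fallback to a common subset: anything other than an
    exact match is a mismatch, whether it comes from a different mode,
    an unknown future mode, or a corrupted bit.
    """
    if received_flags == configured_flags:
        return Verdict.MATCH
    return Verdict.MISMATCH
```

A node configured for the (assumed) APS mode value treats a peer advertising the
PSC mode value exactly as it treats a peer whose flags differ by a single bit:
both are mismatches, handled as described in Section 9.1.1.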

> - I had understood from an email exchange with Huub that the authors had
> the concept that PSC messages with duff values (such as wrong lengths)
> would be picked off as 'invalid messages' and never make it to the main
> protocol engine (I assume that this will be addressed in the
> psc-updates).  Huub seemed to imply that a message with a set of
> capability bits that did not match a mode understood by the node would
> be treated as an invalid message rather than triggering the operator
> intervention.  This seemed sensible so that the alarm would only be
> triggered if an operator accidentally reconfigured a different mode.

Implementations can fiddle with their alarm thresholds. It is likely that an
implementation that has never talked will raise a flag at once; that an
implementation might soak an individual event; and that an implementation could
track soak-avoiding flip-flops. However, I don't believe in telling people how
to write good code if it doesn't change the bits on the wire. (Well, I do
believe in telling them, but I also believe in being paid to tell them :-)
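
For what it is worth, one possible shape for such alarm handling is sketched
below. This is purely an implementation choice and not something either draft
specifies; the class name, the soak_count parameter, and its default are all
hypothetical. It flags at once if the peer has never matched, and otherwise
soaks isolated mismatches. Tracking flip-flops is left out.

```python
class MismatchAlarm:
    """Hypothetical soak logic for Capabilities TLV mismatch alarms."""

    def __init__(self, soak_count: int = 3):
        # Consecutive mismatches tolerated before alerting, once the
        # peer has matched at least once (assumed, tunable threshold).
        self.soak_count = soak_count
        self.ever_matched = False
        self.consecutive_mismatches = 0

    def observe(self, matched: bool) -> bool:
        """Record one received message; return True if the operator
        should be alerted."""
        if matched:
            self.ever_matched = True
            self.consecutive_mismatches = 0
            return False
        self.consecutive_mismatches += 1
        if not self.ever_matched:
            # Never talked successfully: raise the flag at once.
            return True
        # Previously healthy peer: soak isolated events.
        return self.consecutive_mismatches >= self.soak_count
```

None of this changes the bits on the wire, which is why it stays out of the
specification.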

The latest revision of the PSC-updates draft does not mention behaviour on
receiving a "malformed" or "unknown" TLV. We planned to raise this point as a
last call comment to ensure it is properly discussed, and I have done so just
now that last call has started.

> I think Stephen Farrell has picked up on the DoS aspect of this in his
> tracker comment.

I don't think Stephen was talking about DoS. I think he referred to fat fingers,
and my response was that this is not something dynamic in the configuration
within a network. You'll pick one mode for all your nodes, or you'll pick the/an
other mode. So if you get it wrong on a new box/interface it just won't come up
properly and you'll fix it.

The random splat is covered by sensible implementation and rarity.
A subverted node has far better things to do with its subversion.
A MitM attacker could tamper with these bits, but again, they could do far more
interesting things to packets if they are able to catch and modify them and
reconcile the lower layer, link-level security.

Cheers,
Adrian

> 
> Regards,
> Elwyn
> >
> > > Summary:
> > > Almost ready.  There are a couple of points which I raised at Last Call
> > > and discussed with the authors and others both by email and f2f in
> > > London that are not resolved.  These points revolve around being rigorous
> > > about wire encoding, clarifying error behaviour and being definite that
> > > implementations support modes as specific combinations of capabilities so
> > > that arbitrary capability combinations are not allowed and result in
> > > invalid protocol messages.
> > >
> > > Major Issues:
> > >
> > > Minor Issues:
> > > s1: From my discussions with the authors and others associated with this
> > > document, it is my understanding that the intention is that only
> > > combinations of capabilities specified by modes should be legal and
> > > hence that implementations would support modes rather than arbitrary
> > > sets of capabilities. I think it would be worth being more explicit about
> > > this.  This would answer my comments at Last Call that it was unclear
> > > whether other combinations were allowed and would make it clear that a
> > > message that arrived with a corrupted bit in the flags field was
> > > definitely malformed.  I suggest adding the following text to para 16 of
> > > s1 (starts "This document introduces capabilities and modes.") before
> > > the last sentence:
> > >    Only combinations of capabilities specified by modes will be
> > >    supported by implementations.
> >
> > While this is true, it is also not helpful!
> > Any combination of capabilities (these five and any of the future
> > nearly-infinite number of capabilities that can be represented in the bit field)
> > could be specified as supported (i.e. as a mode) in the future.
> > There are two points of note:
> > 1. Only two modes are currently defined
> > 2. Any future mode must be specified in combination with the state machines for
> > the mode.
> >
> > A message that is received containing a set of capabilities (i.e. a mode) not
> > supported by the receiver would be rejected. See Section 9.1.1. That is, this is
> > not a negotiation. This is a verification that both speakers are operating in
> > the same mode.
> >
> > For future compatibility, there is no distinction between a corrupted set of
> > capability bits and an unknown mode.
> >
> > > Nits/Editorial Comments:
> > >
> > > s4.4, para 1:
> > > OLD:
> > > When the modified priorities specified in this document is in use,..
> > > NEW:
> > > When the modified priorities specified in this document are in use,..
> > > (or maybe better:)
> > > When the modified priority order specified in this document is in use,..
> > >
> > > s7.3 et seq: The term "selector bridge" is introduced without
> > > definition.  I suspect it is a piece of jargon I am supposed to know but
> > > I think a reference would help.
> >
> > Yes, it is a piece of standard terminology in protection switching. I'm sure the
> > authors can find a reference.
> >
> > > s9.1: RFC 6378 doesn't define the encoding of the TLV type and TLV
> > > length fields, so it needs to be done here (Unsigned integers). It also
> > > doesn't define encoding of the overall TLV length field in
> > > the PSC header.  This may be thought to be 'obvious' but there is no
> > > default specified in IETF documents.
> >
> > This is being fixed in draft-ietf-mpls-psc-updates that updates 6378. New
> > revision about to be posted before IETF last call.
> >
> > > s9.1: Both RFC6378 and this document are incomplete as regards
> > > specifying what constitutes an invalid protocol message.  In particular
> > > there is no discussion of behaviour if correctly formed but unrecognized
> > > TLVs are received.  Do these make the message invalid or should they be
> > > ignored?
> >
> > This should be included in draft-ietf-mpls-psc-updates as well.
> >
> > > s9.1.1 and s12:
> > > In s12 it is stated (similar wording in s9.1.1):
> > > >    o  If the Capabilities TLV mismatches, the node MUST alert the
> > > >       operator and MUST NOT perform any protection switching until the
> > > >       operator resolves the mismatch in the Capabilities TLV.
> > > Having discussed the situation with the authors and others, I understand
> > > that there are circumstances, depending on the underlying transport,
> > > that bit errors might not be detected and hence that there is a small
> > > probability that corrupt PSC messages may be propagated up to the
> > > protocol machine.  At present there is no explicit statement that a
> > > corrupted flag word would be trapped as an invalid protocol message
> > > (this seems to be the intention) rather than triggering this operator
> > > alert.  I think that the best that can be done is specify that a PSC
> > > protocol message MUST have the flags for a recognized mode set exactly
> > > and otherwise it will be treated as an invalid message.  The wording in
> > > s9.1.1 and s12 would then catch an inadvertent reconfiguration.  I
> > > suggest adding the following to s9.1.1:
> > >    Any PSC message that has a combination of capability bits set that
> > >    does not correspond to a defined mode will be treated as an invalid
> > >    message and ignored.
> >
> > This is plain wrong!
> > The receiving device is set to operate in a single mode.
> > If the received message is not identical to that mode, it cannot operate.
> > Section 9.1.1 already explains how this is handled.
> > To restate: this is not a negotiation.
> > It is an announcement.
> >
> > The possibility of a corrupt message does exist. Neutrinos are remarkably
> > unpredictable beasts. And it is remotely possible that the error will arise
> > without the underlying transport detecting it. And it is further possible that
> > the error will take out a single bit in the capabilities. The result is
> > indistinguishable from the sender deciding to tweak its capabilities. That would
> > cause a mode mismatch and the process in 9.1.1 would be invoked. Given that it
> > is indistinguishable, why would this be a cause for any different behaviour?
> >
> > BTW, Stewart was asking some time back whether there was any record anywhere of
> > an MPLS packet that had been misdelivered because the label had had a corruption
> > event on the wire. We didn't come up with anything and the general feeling was
> > that hardware memory was far more vulnerable.
> >
> > _______________________________________________
> > Gen-art mailing list
> > Gen-art@ietf.org
> > https://www.ietf.org/mailman/listinfo/gen-art