Re: [Gen-art] Gen-art telechat review of draft-ietf-mpls-tp-psc-itu-03.txt

Elwyn,

Thanks for the review & raising this important issue. Adrian's text suggestion looks like a good way to address the main concern (to me at least, and without going through the details myself). Your thoughts?

Jari

On Mar 26, 2014, at 10:39 PM, Adrian Farrel <adrian@olddog.co.uk> wrote:

> Hi Elwyn,
> 
>>> I understand from the document editor that there is a revision in waiting to be
>>> posted that clears up some remaining nits from your review.
>> I haven't seen this as yet, nor the psc-updates new version.
> 
> PSC-updates just posted (got stuck in a tools snafu).
> Update to this I-D is pending until after IESG evaluation completes.
> 
>> I was trying to capture essentially two points:
>> - As it stands this doc both introduces capabilities/modes and provides
>> a definition of one new mode as well as redefining RFC 6378 as providing
>> a basic mode.  In its introductory function I felt it needed to be
>> explicit that if you in future wanted a new combination of capabilities
>> maybe using some additional extra capabilities then you need to specify
>> some extra mode in another doc that tells you the combination of bits in
>> the TLV for the new mode.  OK, there might never be any other modes but
>> I was told that wasn't the original plan, but I don't understand why it
>> isn't helpful to be explicit here.
> 
> OK. I was distracted by the words in your proposed text :-)
>>>>   Only combinations of capabilities specified by modes will be
>>>>   supported by implementations.
> I read this to mean that implementations *of*this*specification* would reject
> combinations of capabilities that were not those specified in the two modes
> described here.
> My reaction was to say: but the text describes this in great detail.
> 
> Now I see what you are saying with respect to capabilities, so we could update
> the last three paragraphs of the Introduction:
> OLD
>   This document introduces capabilities and modes.  A capability is an
>   individual behavior.  The capabilities of a node are advertised using
>   the method given in this document.  A mode is a particular
>   combination of capabilities.  Two modes are defined in this document:
>   PSC mode and Automatic Protection Switching (APS) mode.
> 
>   This document describes the behavior, the priority logic, and the
>   state machine of the PSC protocol when all the capabilities
>   associated with the APS mode are enabled.  The PSC protocol behavior
>   for the PSC mode is as defined in [RFC6378].
> 
>   This document updates [RFC6378] by adding a capability advertisement
>   mechanism.  It is recommended that existing implementations of the
>   PSC protocol be updated to support this capability.  Backward
>   compatibility with existing implementations is described in
>   Section 9.2.1.
> NEW
>   This document introduces capabilities and modes.  A capability is an
>   individual behavior.  The capabilities of a node are advertised using
>   the method given in this document.  A mode is a particular
>   combination of capabilities.  Two modes are defined in this document:
>   PSC mode and Automatic Protection Switching (APS) mode.  Other modes
>   may be defined as new combinations of the capabilities defined in
>   this document or through the definition of additional capabilities. 
>   In either case, the specification defining a new mode will be 
>   responsible for documenting the behavior, the priority logic, and the
>   state machine of the PSC protocol when the set of capabilities in the
>   new mode are enabled.
> 
>   This document describes the behavior, the priority logic, and the
>   state machine of the PSC protocol when all the capabilities
>   associated with the APS mode are enabled.  The PSC protocol behavior
>   for the PSC mode is as defined in [RFC6378].
> 
>   This document updates [RFC6378] by adding a capability advertisement
>   mechanism.  It is recommended that existing implementations of the
>   PSC protocol be updated to support this mechanism.  Backward
>   compatibility with existing implementations that do not support this
>   mechanism is described in Section 9.2.1.
> 
>   Implementations are expected to be configured to support a specific
>   set of capabilities (a mode) and to reject messages that indicate the
>   use of a different set of capabilities (a different mode). Thus, the
>   capabilities advertisement is not a negotiation, but a verification
>   that peers are using the same mode.
> END
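
The verification-not-negotiation rule in the last NEW paragraph can be sketched as below. This is only an illustration: the bit positions, the names CAP_A to CAP_E, and the assumption that PSC mode advertises no capability bits while APS mode advertises all five are placeholders, not values taken from the draft.

```python
from enum import IntFlag


class Capability(IntFlag):
    """Five illustrative capability bits (names and positions are assumptions)."""
    CAP_A = 1 << 31
    CAP_B = 1 << 30
    CAP_C = 1 << 29
    CAP_D = 1 << 28
    CAP_E = 1 << 27


# Assumed encodings: PSC mode advertises no capabilities, APS mode all five.
PSC_MODE = Capability(0)
APS_MODE = (Capability.CAP_A | Capability.CAP_B | Capability.CAP_C
            | Capability.CAP_D | Capability.CAP_E)


def verify_peer_mode(configured: Capability, received_flags: int) -> bool:
    """Return True only when the peer advertises exactly the configured set.

    There is no fallback to a common subset of capabilities: the check is
    a verification that both ends use the same mode, not a negotiation.
    """
    return int(configured) == received_flags
```

A node configured for the assumed APS mode would accept only a peer advertising the identical flag word, and would reject a peer advertising a subset just as it rejects one advertising nothing.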
> 
>> - I had understood from an email exchange with Huub that the authors had
>> the concept that PSC messages with duff values (such as wrong lengths)
>> would be picked off as 'invalid messages' and never make it to the main
>> protocol engine (I assume that this will be addressed in the
>> psc-updates).  Huub seemed to imply that a message with a set of
>> capability bits that did not match a mode understood by the node would
>> be treated as an invalid message rather than triggering the operator
>> intervention.  This seemed sensible so that the alarm would only be
>> triggered if an operator accidentally reconfigured a different mode.
> 
> Implementations can fiddle with their alarm thresholds. It is likely that an
> implementation that has never talked will raise a flag at once; that an
> implementation might soak an individual event; and that an implementation could
> track soak-avoiding flip-flops. However, I don't believe in telling people how
> to write good code if it doesn't change the bits on the wire. (Well, I do
> believe in telling them, but I also believe in being paid to tell them :-)
> 
> The latest revision of the PSC-updates draft does not mention behaviour on
> receiving a "malformed" or "unknown" TLV. We planned to raise this point as a
> last call comment to ensure it is properly discussed, and I have done so just
> now that last call has started.
> 
>> I think Stephen Farrell has picked up on the DoS aspect of this in his
>> tracker comment.
> 
> I don't think Stephen was talking about DoS. I think he referred to fat fingers,
> and my response was that this is not something dynamic in the configuration
> within a network. You'll pick one mode for all your nodes, or you'll pick the/an
> other mode. So if you get it wrong on a new box/interface it just won't come up
> properly and you'll fix it.
> 
> The random splat is covered by sensible implementation and rarity.
> A subverted node has far better things to do with its subversion.
> A MitM attacker could tamper with these bits, but again, they could do far more
> interesting things to packets if they are able to catch and modify them and
> reconcile the lower layer, link-level security.
> 
> Cheers,
> Adrian
> 
>> 
>> Regards,
>> Elwyn
>>> 
>>>> Summary:
>>>> Almost ready.  There are a couple of points which I raised at Last Call
>>>> and discussed with the authors and others both by email and f2f in
>>>> London that are not resolved.  These points revolve around being rigorous
>>>> about wire encoding, clarifying error behaviour and being definite that
>>>> implementations support modes as specific combinations of capabilities so
>>>> that arbitrary capability combinations are not allowed and result in
>>>> invalid protocol messages.
>>>> 
>>>> Major Issues:
>>>> 
>>>> Minor Issues:
>>>> s1: From my discussions with the authors and others associated with this
>>>> document, it is my understanding that the intention is that only
>>>> combinations of capabilities specified by modes should be legal and
>>>> hence that implementations would support modes rather than arbitrary
>>>> sets of capabilities. I think it would be worth being more explicit about
>>>> this.  This would answer my comments at Last Call that it was unclear
>>>> whether other combinations were allowed and would make it clear that a
>>>> message that arrived with a corrupted bit in the flags field was
>>>> definitely malformed.  I suggest adding the following text to para 16 of
>>>> s1 (starts "This document introduces capabilities and modes.") before
>>>> the last sentence:
>>>>   Only combinations of capabilities specified by modes will be
>>>>   supported by implementations.
>>> 
>>> While this is true, it is also not helpful!
>>> Any combination of capabilities (these five and any of the future
>>> nearly-infinite number of capabilities that can be represented in the bit field)
>>> could be specified as supported (i.e. as a mode) in the future.
>>> There are two points of note:
>>> 1. Only two modes are currently defined
>>> 2. Any future mode must be specified in combination with the state machines for
>>> the mode.
>>> 
>>> A message that is received containing a set of capabilities (i.e. a mode) not
>>> supported by the receiver would be rejected. See Section 9.1.1. That is, this is
>>> not a negotiation. This is a verification that both speakers are operating in
>>> the same mode.
>>> 
>>> For future compatibility, there is no distinction between a corrupted set of
>>> capability bits and an unknown mode.
>>> 
>>>> Nits/Editorial Comments:
>>>> 
>>>> s4.4, para 1:
>>>> OLD:
>>>> When the modified priorities specified in this document is in use,..
>>>> NEW:
>>>> When the modified priorities specified in this document are in use,..
>>>> (or maybe better:)
>>>> When the modified priority order specified in this document is in use,..
>>>> 
>>>> s7.3 et seq: The term "selector bridge" is introduced without
>>>> definition.  I suspect it is a piece of jargon I am supposed to know but
>>>> I think a reference would help.
>>> 
>>> Yes, it is a piece of standard terminology in protection switching. I'm sure the
>>> authors can find a reference.
>>> 
>>>> s9.1: RFC 6378 doesn't define the encoding of the TLV type and TLV
>>>> length fields, so it needs to be done here (Unsigned integers). It also
>>>> doesn't define encoding of the overall TLV length field in
>>>> the PSC header.  This may be thought to be 'obvious' but there is no
>>>> default specified in IETF documents.
>>> 
>>> This is being fixed in draft-ietf-mpls-psc-updates that updates 6378. New
>>> revision about to be posted before IETF last call.
>>> 
>>>> s9.1: Both RFC6378 and this document are incomplete as regards
>>>> specifying what constitutes an invalid protocol message.  In particular
>>>> there is no discussion of behaviour if correctly formed but unrecognized
>>>> TLVs are received.  Do these make the message invalid or should they be
>>>> ignored?
>>> 
>>> This should be included in draft-ietf-mpls-psc-updates as well.
>>> 
>>>> s9.1.1 and s12:
>>>> In s12 it is stated (similar wording in s9.1.1):
>>>>>   o  If the Capabilities TLV mismatches, the node MUST alert the
>>>>>      operator and MUST NOT perform any protection switching until the
>>>>>      operator resolves the mismatch in the Capabilities TLV.
>>>> Having discussed the situation with the authors and others, I understand
>>>> that there are circumstances, depending on the underlying transport,
>>>> that bit errors might not be detected and hence that there is a small
>>>> probability that corrupt PSC messages may be propagated up to the
>>>> protocol machine.  At present there is no explicit statement that a
>>>> corrupted flag word would be trapped as an invalid protocol message
>>>> (this seems to be the intention) rather than triggering this operator
>>>> alert.  I think that the best that can be done is specify that a PSC
>>>> protocol message MUST have the flags for a recognized mode set exactly
>>>> and otherwise it will be treated as an invalid message.  The wording in
>>>> s9.1.1 and s12 would then catch an inadvertent reconfiguration.  I
>>>> suggest adding the following to s9.1.1:
>>>>   Any PSC message that has a combination of capability bits set that
>>>>   does not correspond to a defined mode will be treated as an invalid
>>>>   message and ignored.
>>> 
>>> This is plain wrong!
>>> The receiving device is set to operate in a single mode.
>>> If the received message is not identical to that mode, it cannot operate.
>>> Section 9.1.1 already explains how this is handled.
>>> To restate: this is not a negotiation.
>>> It is an announcement.
>>> 
>>> The possibility of a corrupt message does exist. Neutrinos are remarkably
>>> unpredictable beasts. And it is remotely possible that the error will arise
>>> without the underlying transport detecting it. And it is further possible that
>>> the error will take out a single bit in the capabilities. The result is
>>> indistinguishable from the sender deciding to tweak its capabilities. That would
>>> cause a mode mismatch and the process in 9.1.1 would be invoked. Given that it
>>> is indistinguishable, why would this be a cause for any different behaviour?
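
To make the indistinguishability argument concrete, here is a minimal sketch of a receiver that reacts to any mismatch in the way the quoted s12 text requires (alert the operator, stop protection switching until the operator resolves it). The Receiver class, the flag value 0xF8000000, and the operator_resolved() hook are hypothetical and exist only for illustration; the draft defines the real behaviour in Section 9.1.1.

```python
APS_FLAGS = 0xF8000000  # hypothetical flag word for the configured mode


class Receiver:
    """Toy receiver: holds one configured flag word and a mismatch latch."""

    def __init__(self, configured_flags):
        self.configured_flags = configured_flags
        self.switching_enabled = True
        self.alerts = []

    def on_capabilities(self, received_flags):
        # The receiver sees only the received bits; it has no way to know
        # whether a difference came from corruption or from reconfiguration.
        if received_flags != self.configured_flags:
            self.switching_enabled = False
            self.alerts.append(
                f"capabilities mismatch: got {received_flags:#010x}")

    def operator_resolved(self):
        # Assumed hook: the operator clears the mismatch condition.
        self.switching_enabled = True


def flip_bit(flags, bit):
    """Model an undetected single-bit error in the flag word."""
    return flags ^ (1 << bit)


corrupted = Receiver(APS_FLAGS)
corrupted.on_capabilities(flip_bit(APS_FLAGS, 31))   # bit error on the wire

reconfigured = Receiver(APS_FLAGS)
reconfigured.on_capabilities(0x00000000)             # peer set to another mode
```

Both receivers end up in the same state (switching suspended, one alert raised), which is the point being made: with only the received bits to go on, the two causes cannot be handled differently.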
>>> 
>>> BTW, Stewart was asking some time back whether there was any record anywhere of
>>> an MPLS packet that had been misdelivered because the label had had a corruption
>>> event on the wire. We didn't come up with anything and the general feeling was
>>> that hardware memory was far more vulnerable.
>>> 
>>> _______________________________________________
>>> Gen-art mailing list
>>> Gen-art@ietf.org
>>> https://www.ietf.org/mailman/listinfo/gen-art