Re: [mpls] Spencer Dawkins' Discuss on draft-ietf-mpls-tp-shared-ring-protection-05: (with DISCUSS and COMMENT)

Spencer Dawkins at IETF <spencerdawkins.ietf@gmail.com> Thu, 25 May 2017 12:58 UTC

MIME-Version: 1.0
In-Reply-To: <b342ad77-1cd2-ebc9-4c84-337eeb4a00e8@gmail.com>
References: <149565660910.8641.739437988075507213.idtracker@ietfa.amsl.com> <b342ad77-1cd2-ebc9-4c84-337eeb4a00e8@gmail.com>
From: Spencer Dawkins at IETF <spencerdawkins.ietf@gmail.com>
Date: Thu, 25 May 2017 07:58:40 -0500
Message-ID: <CAKKJt-ewdvUik3qhC3zsSAREUOtKfGUEjwa2-69jes1pma94Fw@mail.gmail.com>
To: huubatwork@gmail.com
Cc: The IESG <iesg@ietf.org>, draft-ietf-mpls-tp-shared-ring-protection@ietf.org, Eric Gray <Eric.Gray@ericsson.com>, "mpls-chairs@ietf.org" <mpls-chairs@ietf.org>, "mpls@ietf.org" <mpls@ietf.org>
Content-Type: multipart/alternative; boundary="001a1143e3bea9d3aa055058c73c"
Archived-At: <https://mailarchive.ietf.org/arch/msg/mpls/DYpikvLY01eDaWJ9Y5BevgIQVMo>
Subject: Re: [mpls] Spencer Dawkins' Discuss on draft-ietf-mpls-tp-shared-ring-protection-05: (with DISCUSS and COMMENT)
Precedence: list

To everyone else except Huub, because he shouldn't see this e-mail before
the telechat anyway, being on vacation :D

On Thu, May 25, 2017 at 5:02 AM, Huub van Helvoort <huubatwork@gmail.com>
wrote:

> Hello Spencer,
>
> Thank you for your review of our draft.
> Please find my response in-line [Huub]
>
>> Spencer Dawkins has entered the following ballot position for
>> draft-ietf-mpls-tp-shared-ring-protection-05: Discuss
>>
>> The document, along with other ballot positions, can be found here:
>> https://datatracker.ietf.org/doc/draft-ietf-mpls-tp-shared-ring-protection/
>>
>> ----------------------------------------------------------------------
>> DISCUSS:
>> ----------------------------------------------------------------------
>>
>> I want to thank the authors for a very readable draft. It was a pleasure
>> to review, and that's a high bar for the subject.
>>
>
> [Huub] thank you!
>
>> I have loads of questions, but my first set of questions is an expansion
>> of Alvaro's comment that I think rises to the level of a Discuss. Please
>> note that I'm asking questions, not proposing text changes, so I really
>> do want to discuss it.
>>
>
> [Huub] OK, understood.
>
>> ---------- my first set of questions
>>
>> In this text,
>>
>>     Three typical ring protection mechanisms are described in this
>>     section: wrapping, short wrapping and steering.  All nodes on the
>>     same ring MUST use the same protection mechanism.
>>
>> I would like to understand what happens if they aren't - and I'm asking,
>> mostly as a way of encouraging guidance for operators in debugging cases
>> where they're not all using the same mechanism. I'm not asking for a full
>> mesh of possible misconfigurations, only for a sentence or two ("If they
>> aren't all using the same protection mechanism, the following things may
>> happen").
>>
>
> [Huub] if the MRPS protocol in any node detects an RPS message with a
> mode that was not provisioned in that node, a failure of protocol will
> be reported, and the protection mechanism will not be activated.


This sentence, alone, would be enough to answer this part of my Discuss, if
it were in the document.
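
As a rough illustration of the behavior Huub describes (not text from the
draft), a mode check on a received RPS message might look like the sketch
below. The mode values are taken from the M field table quoted further down
in this message; the names `SwitchingMode` and `check_rps_mode` are
hypothetical, and the returned string simply stands in for whatever
"failure of protocol" report an implementation would raise.

```python
from enum import IntEnum
from typing import Tuple


class SwitchingMode(IntEnum):
    """Values of the 2-bit M field, as listed in the draft's table."""
    RESERVED = 0b00
    WRAPPING = 0b01
    SHORT_WRAPPING = 0b10
    STEERING = 0b11


def check_rps_mode(provisioned: SwitchingMode, received_m_bits: int) -> Tuple[bool, str]:
    """Return (may_activate_protection, report) for one received RPS message.

    Hypothetical helper: the draft does not define this function.
    """
    # Only the two low-order bits carry the mode.
    received = SwitchingMode(received_m_bits & 0b11)
    if received != provisioned:
        # Mismatch: report a failure of protocol, do not activate protection.
        return False, (
            "failure of protocol: provisioned "
            f"{provisioned.name}, received {received.name}"
        )
    return True, "mode matches"
```

For example, a node provisioned for wrapping that receives M bits "1 1"
(steering) would report the failure and leave protection inactive.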

>> More broadly, I'd like to understand why wrapping and short wrapping are
>> both defined. It seems like the only functional difference is that short
>> wrapping doesn't give you as much latency. Is that right?
>>
>> 24 pages in, I see this:
>>
>>     o  In rings utilizing the wrapping protection, each node detects the
>>        failure or receives the RPS request as the destination node MUST
>>        perform the switch from/to the working ring tunnels to/from the
>>        protection ring tunnels if it has no higher priority active RPS
>>        request.
>>
>>     o  In rings utilizing the short wrapping protection, each node
>>        detects the failure or receives the RPS request as the destination
>>        node MUST perform the switch only from the working ring tunnels to
>>        the protection ring tunnels.
>>
>> so I'm pretty sure there are differences beyond what I was seeing,
>> earlier in the document.
>>
>
> [Huub] wrapping is a mechanism that can be used in case an LSP is dropped
> in several nodes (p-2-mp application). In this case the traffic will still
> have to reach every node in the ring.
> Short wrapping can be used only in p-2-p application. Now the traffic can
> be dropped at the moment the egress node is reached.
> Steering has the least additional propagation delay, but during a short
> time the traffic may be duplicated, which may not be desirable in some
> applications.


This explanation would be sufficient to answer this part of my Discuss, if
it were in the document.

It is somewhere between possible and likely that there's actually a
reference that describes the mechanisms in more detail, so if you included
something like "the functional differences between these protection
mechanisms are described in [WhatEver]", that would be fine, instead.


>
>> And, of course, I'm not sure what the effect of choosing steering over
>> wrapping/short wrapping would be, for my users, but that can wait until
>> we talk about wrapping and short wrapping ...
>>
>> At a minimum, I'd like to see guidance for operators in choosing among
>> the three protection mechanisms. Why would they choose any one of the
>> three?
>>
>
> [Huub] more explanatory text can be added, it could be in the introduction
> proposed by Alvaro.


I only saw Alvaro saying that he also wanted operational guidance included,
but it's very likely that whatever you do to answer Alvaro's Comment would
answer this part of my Discuss. We were complaining about the same thing,
it's just that I was more concerned about it ;-)


>
>> I also note that this MUST seems to be repeated using different words in
>> section 5.1, as
>>
>>     All nodes in the same ring MUST use the same protection mechanism,
>>     Wrapping, steering or short-wrapping.
>>
>> If that's saying the same thing, one MUST is all you need.
>>
>
> [Huub] OK, point taken.


Thanks. I wasn't sure whether it was saying the same thing in different
words, or saying something different. This sounds like it's saying the same
thing in different words, and can be safely deleted.

So - from 10,000 meters up, it looks like we know how to clear my Discuss.
Please let me know when you (get back from vacation and) submit the next
version, and I'll clear.

I look forward to chatting about my Comments, but we won't be Discussing
them, only chatting.

Spencer


>
> ----------------------------------------------------------------------
>> COMMENT:
>> ----------------------------------------------------------------------
>>
>> ---------- all the other questions
>>
>
> [Huub] I will answer your questions later
> (I am currently on holiday and risk a divorce if I don't stop now :-(  )
>
> Best regards, Huub.
>
>
>
>> In this text,
>>
>>     When the service LSP passes through the interconnected rings, the
>>     direction of the working ring tunnels used on both rings SHOULD be
>>     the same.  For example, if the service LSP uses the clockwise working
>>     ring tunnel on Ring1, when the service LSP leaves Ring1 and enters
>>     Ring2, the working ring tunnel used on Ring2 SHOULD also follow the
>>     clockwise direction.
>>
>> I'm not understanding why this is a SHOULD, and not a MUST. If the
>> direction of the working ring tunnels used on both rings is not the same,
>> does this still work?
>>
>> If it still works, why does this matter? But, either way, you might
>> usefully say something about why this isn't always the right thing to do,
>> even if you just give one example. The point of SHOULD is that
>> implementers make their own informed decisions, so providing information
>> that will inform those decisions seems important.
>>
>> I wanted to call out
>>
>>     Ring switches MUST be preempted by higher priority RPS requests.  For
>>     example, consider a protection switch that is active due to a manual
>>     switch request on the given link, and another protection switch is
>>     required due to a failure on another link.  Then an RPS request MUST
>>     be generated, the former protection switch MUST be dropped, and the
>>     latter protection switch established.
>>
>>     MSRP mechanism SHOULD support multiple protection switches in the
>>     ring, resulting in the ring being segmented into two or more separate
>>     segments.  This may happen when several RPS requests of the same
>>     priority exist in the ring due to multiple failures or external
>>     switch commands.
>>
>> as really good examples of the kind of text I think would help the places
>> in this document ("For example", "This may happen when") where no
>> examples are given. Thanks for providing those examples!
>>
>> Ouch. Do I understand from
>>
>>     o  Protection Switching Mode (M): This 2-bit field indicates the
>>        protection switching mode used by the sending node of the RPS
>>        message.  This can be used to check that the ring nodes on the
>>        same ring use the same protection switching mechanism.  The
>>        defined values of the M field are listed as below:
>>
>>               +------------------+-----------------------------+
>>               |  Bits (MSB-LSB)  |  Protection Switching Mode  |
>>               +------------------+-----------------------------+
>>               |       0 0        |         Reserved            |
>>               |       0 1        |         Wrapping            |
>>               |       1 0        |       Short Wrapping        |
>>               |       1 1        |         Steering            |
>>               +------------------+-----------------------------+
>>
>> that you already have three protection mechanisms, and have only one
>> possible codepoint to allocate for any future optimizations? Assuming
>> that "0 0" can be unReserved ...
>>
>> Could you clarify what "anyway" means in this text?
>>
>>     When multiple MS RPS requests exist at the same time addressing
>>     different links and there is no higher priority request on the ring,
>>     no switch SHOULD be executed and existing switches MUST be dropped.
>>     The nodes MUST signal, anyway, the MS RPS request code.
>>
>> I'm seeing that the commands like LP described in section 5.2.1.1  are
>> used in the document before these (I'm serious) helpful and clear
>> explanations appear. If it's possible to move section 5.2.1.1 up in the
>> document, that would be great, but if it isn't possible, a forward
>> pointer would be helpful to readers who don't already know what the
>> command abbreviations mean.
>>
>> I'm really confused by this SHOULD:
>>
>>     The PSC protocol [RFC6378] is designed for point-to-point LSPs, on
>>     which the protection switching can only be performed on one or both
>>     of the end points of the LSP.  The RPS protocol is designed for ring
>>     tunnels, which consist of multiple ring nodes, and the failure could
>>     happen on any segment of the ring, thus RPS SHOULD be capable of
>>     identifying and handling the different failures on the ring, and
>>     coordinating the protection switching behavior of all the nodes on
>>     the ring.
>>
>> I suspect that's because it's not a 2119 SHOULD, but if people think it
>> is, I wouldn't mind understanding why.
>>
>> Section 5.3, "RPS and PSC Comparison on Ring Topology" is really helpful,
>> but it appears 43 pages in. Given that I'd expect people to be asking why
>> they should implement a new protection switching protocol when they've
>> already implemented PSC, I'd think this would be much more useful, early
>> in the document.
>>
>> I'm somewhat confused about the code point allocation strategy in this
>> text:
>>
>>     The RPS Request Field is 8 bits, the allocated values are as follows:
>>
>>         Value       Description               Reference
>>        -------  --------------------------- ---------------
>>           0     No Request (NR)             this document
>>           1     Reverse Request (RR)        this document
>>           2     unassigned
>>           3     Exercise (EXER)             this document
>>           4     unassigned
>>           5     Wait-To-Restore (WTR)       this document
>>           6     Manual Switch (MS)          this document
>>          7-10   unassigned
>>          11     Signal Fail (SF)            this document
>>          12     unassigned
>>          13     Forced Switch (FS)          this document
>>          14     unassigned
>>          15     Lockout of Protection (LP)  this document
>>        16-254   unassigned
>>          255    Reserved
>>
>> My first question is, why the highest priority RPS value is 15, given
>> that the field is 8 bits wide. If anyone ever needs to add a code point
>> higher than the highest priority code point, will that work well? I can
>> imagine code that says "if operation_priority is greater than
>> highest_priority, it's an error", for example.
>>
>> I may have other questions depending on your answer, but let's start
>> there.
>>
>>
>>
>
> --
> ================================================================
> Always remember that you are unique...just like everyone else...
>
>
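
To make the concern about code points above 15 concrete, here is a minimal
sketch, assuming (as the question above does, though the quoted text does
not state it) that a larger request value means a higher priority. The
names `RPS_REQUESTS` and `preempts` are hypothetical. The check is a lookup
in the assigned-values table rather than a comparison against a hard-coded
highest value, so a later assignment above 15 would only need a new table
entry.

```python
# Assigned RPS Request Field values, copied from the table quoted above.
RPS_REQUESTS = {
    0: "NR",     # No Request
    1: "RR",     # Reverse Request
    3: "EXER",   # Exercise
    5: "WTR",    # Wait-To-Restore
    6: "MS",     # Manual Switch
    11: "SF",    # Signal Fail
    13: "FS",    # Forced Switch
    15: "LP",    # Lockout of Protection
}


def preempts(new_code: int, active_code: int) -> bool:
    """Return True if the new request would preempt the active one.

    Assumption (not stated in the quoted text): larger value means
    higher priority. Unassigned or reserved values are rejected.
    """
    for code in (new_code, active_code):
        if code not in RPS_REQUESTS:
            raise ValueError(f"unassigned or reserved RPS request value: {code}")
    return new_code > active_code
```

With this table, `preempts(11, 6)` corresponds to the draft's own example
of a Signal Fail on one link displacing an active Manual Switch on another.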
