[mpls] AD review of draft-ietf-mpls-gach-adv

"Adrian Farrel" <adrian@olddog.co.uk> Fri, 11 January 2013 22:29 UTC

Return-Path: <adrian@olddog.co.uk>
X-Original-To: mpls@ietfa.amsl.com
Delivered-To: mpls@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id DFDC021F8A99 for <mpls@ietfa.amsl.com>; Fri, 11 Jan 2013 14:29:31 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.556
X-Spam-Level:
X-Spam-Status: No, score=-2.556 tagged_above=-999 required=5 tests=[AWL=0.043, BAYES_00=-2.599]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id u9YtQaEULhPo for <mpls@ietfa.amsl.com>; Fri, 11 Jan 2013 14:29:30 -0800 (PST)
Received: from asmtp4.iomartmail.com (asmtp4.iomartmail.com [62.128.201.175]) by ietfa.amsl.com (Postfix) with ESMTP id 5282E21F8A52 for <mpls@ietf.org>; Fri, 11 Jan 2013 14:29:29 -0800 (PST)
Received: from asmtp4.iomartmail.com (localhost.localdomain [127.0.0.1]) by asmtp4.iomartmail.com (8.13.8/8.13.8) with ESMTP id r0BMTS3K024880; Fri, 11 Jan 2013 22:29:28 GMT
Received: from 950129200 (089144192060.atnat0001.highway.a1.net [89.144.192.60]) (authenticated bits=0) by asmtp4.iomartmail.com (8.13.8/8.13.8) with ESMTP id r0BMTPoD024858 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO); Fri, 11 Jan 2013 22:29:27 GMT
From: Adrian Farrel <adrian@olddog.co.uk>
To: draft-ietf-mpls-gach-adv.all@tools.ietf.org
Date: Fri, 11 Jan 2013 22:29:25 -0000
Message-ID: <01ec01cdf04b$1ee32af0$5ca980d0$@olddog.co.uk>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Mailer: Microsoft Outlook 14.0
Thread-Index: Ac3wSxV3qBP+BoqVSkm3NwGCMJyhBQ==
Content-Language: en-gb
Cc: mpls@ietf.org
Subject: [mpls] AD review of draft-ietf-mpls-gach-adv
X-BeenThere: mpls@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
Reply-To: adrian@olddog.co.uk
List-Id: Multi-Protocol Label Switching WG <mpls.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/mpls>, <mailto:mpls-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/mpls>
List-Post: <mailto:mpls@ietf.org>
List-Help: <mailto:mpls-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/mpls>, <mailto:mpls-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 11 Jan 2013 22:29:32 -0000

Hi,

I have conducted my usual AD review of your document as part of the
publication request processing. The aim of my review is to catch any
issues that might show up in IETF last call, directorate reviews, or
IESG evaluation. The intent is to resolve the issues by text changes
or through email discussion to smooth the passage of the document
through the later stages of the process.

I have reviewed this document jointly with draft-ietf-mpls-tp-
ethernet-addressing and since the author team is the same for the two
documents, I recommend that you process the comments for the two 
documents at the same time.

It will become clear to you as you read this review that I am not
overly impressed with the quality of this protocol specification. It is
possible that the issue lies with documentation details and the level of
review that the working group gave. However, I have not often seen a 
better example of why it is valuable to have interworking 
implementations of a protocol specification before requesting
publication.

As usual, all my comments are open for disagreement and discussion. I
think that you will probably want to produce a revision of this 
document to address my comments, so I have put it into "Revised I-D
Needed" state in the data tracker.

Thanks for the work,
Adrian

===

Please update the captions to the figures as "Figure n : blah, blah"

---

Section 1

   It provides an auxiliary logical data channel
   associated with an MPLS Label Switched Path (LSP), a pseudowire, or a
   section (link) over which a variety of protocols may flow.

Marginally ambiguous because of clause ordering. It is not "a section
over which a variety of protocols may flow." Suggest re-wording...

   It provides an auxiliary logical data channel over which a variety of
   protocols may flow. Each such data channel is associated with an MPLS
   Label Switched Path (LSP), a pseudowire, or a section (link).

---

Section 1

   ...Operations, Administration, and Maintenance (OAM)
   capabilities associated with the underlying LSP, pseudowire, or
   section.

I don't think "underlying" is right, is it?
How about:

   ...Operations, Administration, and Maintenance (OAM)
   capabilities for the associated LSP, pseudowire, or section.

---

Section 1

   The main principle guiding the design of the MPLS G-ACh advertisement
   protocol (GAP) is simplicity.

Capitalise, please.

---

Section 1.1

After...
   The G-ACh advertisement protocol presented in this document
   thus allows LSRs to exchange information of a similar sort to that
   supported by LLDP for Ethernet links.
...could you please a simple sentence explaining why it was not enough
to provide a way to encapsulate LLDP in the G-ACh. [Hint: I think the
answer is that you envisage the GAP being used for a larger set of 
MPLS-specific features than is currently defined in LLDP and for which
you do not consider it would be appropriate to extend LLDP.]

This is particularly important given that:
- the only use of the GAP mentioned anywhere is [I-D.ietf-mpls-tp-
  ethernet-addressing]
- you say...
   Where
   it is anticipated that the sole purpose of the GAP will be to provide
   Ethernet MAC address learning, the use of LLDP SHOULD be considered.
>From where I am sitting, that looks like "This protocol is not needed
because LLDP does everything we want."

So, the question you need to answer (for me and in the I-D) is why have
you invented a new protocol when LLDP does everything that is needed?

And, BTW, I would like to understand the meaning of "SHOULD" in a 2119
context in this document. Since it is a protocol spec, and since it
references 2119, the implication is that implementations of 
[I-D.ietf-mpls-tp-ethernet-addressing] should not be made unless 
another I-D using the GAP is published as an RFC *and* implemented by
the same implementation. 

Please don't think that deleting this text will get you off the hook!
You have already made the point (and with WG consensus). Now you need to
explain to me why there is a need for either document.    

I guess you should also say why you consider LMP to be inappropriate.

---

It is not really necessary to include LSR in Section 1.2 as the term
is only used before that section.

---

Section 2

   Although one GAP message can contain data for several applications,
   the receiver maintains the data associated with each application
   separately. This enables the sender to transmit a targeted update
   that refreshes the data for a subset of applications without
   affecting the data of other applications.

How the receiver maintains the data is entirely an implementation    
issues. I suggest you rephrase this as:

   Each GAP message can contain data for several applications. A
   sender may transmit a targeted update that refreshes the data for a
   subset of applications without affecting the data of other
   applications sent on a previous message.
                                                           
---

Section 3

After "XXXX" please note "(TBD by IANA)" so that the RFC editor
spots this, correlates it with the IANA section, and updates it
accordingly.

In Section 9.1, please mark the value to be assigned as XXXX
                                                                               
---

Section 3

s/A Gap message/A GAP message/

---

Section 3

Can two ADB elements for the same application be present in the
same message? If so what processing rules apply? If not, how is the
error handled?

Can two TLVs of the same type be present in the same ADB element? If
so what processing rules apply? If not, how is the error handled?

And the combination of cases, if two ADB elements for the same
application can be present in the same message, can a TLV of the same
type be present in each? If so what processing rules apply? If not,
how is the error handled?

---

Section 3 and 5

Although Section 5.1 tells us how the Message Identifier can be set, it
is ambiguous wrt Section 3 since in Section 3 we have:

      Message Identifier: Unique identifier of this message

And in 5.1 we have:

   The Message Identifier (MI) uniquely identifies this message and is
   set at the sender's discretion.

What does "at the sender's discretion" mean? It seems to say: "If the 
sender doesn't fancy setting it, that's OK."

Additionally, it is not clear what the uniqueness scope of the Message
Identifier is supposed to be. Presumably not globally unique! Is it per
sender or per data-channel per sender?
                                 
The sole purpose of the MI seems to be contained in the final paragraph 
of Section 5.1. If that is the case, you seem to have some confusion 
between rapid retransmission (same message) and lifetime avoiding
retransmission (different message). All this needs to be cleaned up and
explained. Noting that the last sentence of 5.1 belongs in 5.2.

I have to wonder whether you really need the field.

---

Sections 2, 3, and 5

   The Lifetime field specifies how long,
   in seconds, the receiver should retain the data in this message.  If
   the lifetime is zero the data is immediately marked as expired.

Have you considered offering a lifetime value meaning "retain until    
replaced"? This would *considerably* simplify the job of the sender 
which would not need to run multiple timers to ensure that the 
information is updated in a timely manner and doesn't accidentally
expire.

Note also that 2*16 seconds is not a very long time. things like the
MAC address of an interface can safely be considered "stable".

In rather a topsy-turvy way you say in Section 5.1...
   Lifetimes SHOULD be set in such a way that at least three updates
   will be sent prior to Lifetime expiration.  For example, if updates
   are sent at least every 60 seconds, a Lifetime of 185 seconds may be
   used.
...Isn't it the other way around? Shouldn't updates be sent according to
the lifetime value that has been set?

Discarding the information when the timer expires needs to be clarified.
I think you mean "revert to the configuration-based state that was held 
before the information was received." But you might mean "transition to
the complete absence of this information."

In Section 2 should you discuss message loss and the consequences? Any
mitigation you need to add? Is the retransmission to protect against 
lifetime expiry supposed to handle this? If so, it doesn't seem
frequent enough to install the initial state? Do you need a two-speed
timer: rapid on first use and dropping down to slow for state retention?
Or would an Ack be easier? Probably, the final paragraph of Section 5.1
can be adapted to handle this case, and then you can put a forward 
pointer in Section 2.

But note that the final paragraph of 5.1 is a bit of a mess!...
   In some cases additional reliability may be desired for the delivery
   of a GAP message.  When this is the case, the RECOMMENDED procedure
   is to send three instances of the message in succession, separated by
   a delay appropriate to the application.  This procedure SHOULD be
   used, if at all, only for messages that are in some sense
   exceptional; for example when sending a flush instruction following
   device reset.  The MI may be used to detect and discard duplicate
   messages.
...What is a "delay appropriate to the application?" Maybe it means 
"Each specification of a GAP application must state the appropriate
delay to use for such fast retransmission." Which would be fine, but
would require this document to make the statement for Application 0x0000
and would require [] to make a statement for Application 0x0001.
... Also "The procedure SHOULD be used..." reads like the procedure is
intended to be used. But "...if at all" really means "The procedure 
SHOULD NOT be used except..."

I can't understand what is meant by "If the lifetime is zero the data is
immediately marked as expired." Are you trying to say that the data 
should be processed, potentially replacing any stored information, and
then should be "discarded" (noting my previous comment about "discarded"
not being clear)? Or are you trying to say, that the ADB element should
be discarded unprocessed?

Lastly, the recommended multiplier ("at least three updates will be sent
prior to Lifetime expiration") cuts it too fine! You need to allow for
out-queues, transmission, and processing. The normal approach to this 
(probably going back to Van Jacobson) is to add a unit of half. Thus,
in your case, Lifetime = 3.5 * Update.

---

Section 3

Are TLV Value fields padded up to a multiple of 4 octets?

---

Section 3

It is clear that each ADB element must contain at least one TLV. What 
is the error processing for an empty ADB element?

---

Section 4

Since the Application ID 0x0000 ADB contains "metadata and processing
instructions rather than static data that is meant to be retained" 
please explain how the Lifetime field is to be set/interpreted.

---

Does the TLV in Section 4.1 support the identifiers described in 
[I-D.ietf-mpls-tp-itu-t-identifiers]? The chain of references for
"non-IP MPLS-TP networks" appears to lead through RFC 6428 to RFC
6370 which (I think) does not include the non-IP identifiers.

I see that you are adding new values to the Address Family registry in
Section 9.2. It is not clear to me where the definitive reference for
those types live. I would like to see more clarity in the text about
which address (and where it is specified) is intended for each new
code point that you are asking IANA to allocate.

---

Section 2 says that this is a one-way protocol, however, Section 4.2
defines an Application ID 0x0000 TLV that forms part of a two-way
exchange and is actually a direct request for information. 

Additionally, it seems a little odd to hide this in a TLV when what it
really is is a separate message type. By placing it in the TLVs you 
open up the prospect of message thrash as two implementations send
each other updates without clearing out this TLV.

The problem might be amusingly exacerbated by placing 0x0000 in the
list of Application ID for which a refresh is requested.

In any case, please explicitly clarify that setting a length of zero 
intends retransmission of the 0x0000 Application data.

---

Section 4.3

Does the flush apply to the 0x0000 Application Data previously sent?

Why do you not allow selective flushing per Application?

---

Section 4.4

   The request is strictly advisory: the
   receiver SHOULD accept and act on the request, but MAY override it at
   any time.

The "SHOULD" that you have used is stronger than "advisory". What is 
more you don't mean "advisory" you mean "a request".

I suggest you change this to

   ...the receiver MAY accept and act on the request, MAY ignore the 
   request, or MAY resume transmissions at any time according to 
   implementation or configuration choices, and depending on local
   pragmatics.

This section is another example of a TLV that violates the statement
in Section 2 that this is a one-way protocol.

Lastly, you need to say what it means if the Duration leaves the 
information as expired (i.e. the Duration takes us beyond the end of
the Lifetime). Obviously the sender (of the original information) can
decide that that would be a bad thing and retransmit anyway, but it is
not clear what the receiver (of the information) would do if the
transmission is not made.

Frankly, however, it is not clear why you have this TLV. Having gone
through the hoops of deciding that you need to be able to send the 
data, that the data can time out, and that you need to retransmit the
data, I think you need to give better motivation for this TLV.

---

I wish Sections 4.1 through 4.5 would end up containing the actual 
TLV Type codes to be used for each TLV. This could be achieved by
simply including the values in the figures. (This is safe because 
these are the first allocations from the new registry and there is
no question of IANA not allocating them as requested).

---

5.1 I like that operation of GAP SHOULD be configurable per data
channel. But you appear to have written "SHALL".  This precludes 
making an implementation that automatically and always runs GAP. 
(I think it still allows implementations that don't support GAP at
all because they don't implement this spec).

---

Paragraph 3 of section 5.2 talks about retaining objects for 
application data that can be handled. But this is implementation-
specific. I think it is only the information that has to be retained.
Try to talk about function rather than constrain implementation.

---

The last paragraph of 5.2 is a bit confusing.

   The receiver MAY make use of the application data contained in a GAP
   message to perform some level of autoconfiguration, for example if
   the application is an OAM protocol.  The implementation SHOULD,
   however, take care to prevent cases of oscillation resulting from
   each endpoint attempting to adjust its configuration to match the
   other.  Any such autoconfiguration based on GAP information MUST be
   disabled by default.

I *think* you are really trying to place constraints on future
specifications rather than on implementations (there is nothing in this
specification that could be used for auto-conf). But I note that
[I-D.ietf-mpls-tp-ethernet-addressing] looks like auto-conf and is not
disabled by default.

---

In Section 6.1, you should discuss the operation of key exchange 
protocols in the absence of IP protocols in the network. After all, one
of the use-cases (oh, the only use case :-) for GAP is to operate in
networks that don't have IP available. You might also note that GAP is
described as a protocol for exchanging capabilities and configuration:
keys might be described by some as exactly such information.

If your fall-back is that keys have to be manually configured in the
absence of IP, then you desperately need to call this out in 
[I-D.ietf-mpls-tp-ethernet-addressing] because manual configuration of
keys seems a whole lot harder than manual configuration of MAC
addresses.

---

Section 6.1

I believe that RFC 4107 says that you need to discuss mechanisms for
dynamic key update, to make a recommendation as to whether such 
mechanisms are needed, and to state how often keys need to be updated.

---

The problem I see with processing of the Timestamp described in 6.2 is
that it does not allow for a sender correcting its clock. Can you add
text to help clarify how this case is handled.

---

The final paragraph of Section 6.2 is a bit of a throw-away. What you
are saying, I think is that "Full algorithm flexibility is provided by
the protocol's use of the Authentication Key Identifier which acts as 
an indirect reference to the algorithm and key data in use. Implementors
SHOULD consider the use of [I-D.ietf-karp-crypto-key-table] for key 
management."

You might move this text to the end of 6.1 where it will sit with the
discussion of the Key ID.

---

How do you distinguish "device discovery" in Section 7 from talking to
a device whose MAC address you don't know in [I-D.ietf-mpls-tp-ethernet-
addressing] ?

Very unclear which MAC address to use when.

---

Maybe Section 8 should point at Section 6 since Section 6 seems to be
all about Security mechanisms.

---
                                                                              
You have used [I-D.ietf-karp-crypto-key-table] in a normative way. Not 
a big risk as it is currently in KARP WG last call. Please move it to
Section 10.1.

---

Trivial point, but there are precisely three authors, all marked as 
editors. You could safely dispense with the "editor" designation (on
the front page and in the Authors' Addresses section).