Re: [mpls] AD review of draft-ietf-mpls-gach-adv - encore

"Adrian Farrel" <adrian@olddog.co.uk> Tue, 05 February 2013 22:41 UTC

Return-Path: <adrian@olddog.co.uk>
X-Original-To: mpls@ietfa.amsl.com
Delivered-To: mpls@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 1557321F8939 for <mpls@ietfa.amsl.com>; Tue, 5 Feb 2013 14:41:35 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.572
X-Spam-Level:
X-Spam-Status: No, score=-2.572 tagged_above=-999 required=5 tests=[AWL=0.027, BAYES_00=-2.599]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id BALl+pu-8dan for <mpls@ietfa.amsl.com>; Tue, 5 Feb 2013 14:41:34 -0800 (PST)
Received: from asmtp1.iomartmail.com (asmtp1.iomartmail.com [62.128.201.248]) by ietfa.amsl.com (Postfix) with ESMTP id 8ECC021F8934 for <mpls@ietf.org>; Tue, 5 Feb 2013 14:41:33 -0800 (PST)
Received: from asmtp1.iomartmail.com (localhost.localdomain [127.0.0.1]) by asmtp1.iomartmail.com (8.13.8/8.13.8) with ESMTP id r15MfShR032729; Tue, 5 Feb 2013 22:41:28 GMT
Received: from 950129200 (89-26-111-90.bruck.stat.salzburg-online.at [89.26.111.90]) (authenticated bits=0) by asmtp1.iomartmail.com (8.13.8/8.13.8) with ESMTP id r15MfP6R032709 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO); Tue, 5 Feb 2013 22:41:26 GMT
From: Adrian Farrel <adrian@olddog.co.uk>
To: stbryant@cisco.com
References: <01ec01cdf04b$1ee32af0$5ca980d0$@olddog.co.uk> <51114EC1.5030100@cisco.com>
In-Reply-To: <51114EC1.5030100@cisco.com>
Date: Tue, 05 Feb 2013 22:41:25 -0000
Message-ID: <027601ce03f1$eee46430$ccad2c90$@olddog.co.uk>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Mailer: Microsoft Outlook 14.0
Thread-Index: AQIwrSV4jn7XBQrAgiYmYiDcjeHnQwHsGjGWl5bv2hA=
Content-Language: en-gb
Cc: mpls@ietf.org, draft-ietf-mpls-gach-adv.all@tools.ietf.org
Subject: Re: [mpls] AD review of draft-ietf-mpls-gach-adv - encore
X-BeenThere: mpls@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
Reply-To: adrian@olddog.co.uk
List-Id: Multi-Protocol Label Switching WG <mpls.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/mpls>, <mailto:mpls-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/mpls>
List-Post: <mailto:mpls@ietf.org>
List-Help: <mailto:mpls-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/mpls>, <mailto:mpls-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 05 Feb 2013 22:41:35 -0000

Many thanks for the hard work that has gone into this.

A number of responses in line. I have tried to weed out the things where "I
would not have done it like that" and reduce to issues where I think the
document still needs work or where I am worried that the protocol may have
problems. The purpose of my comments is solely to make sure that the document is
solid, and that implementers (especially of future applications using GAP) will
know what is expected.

Cheers,
Adrian

> > Section 3

I raised questions about the presence of multiple instances of things like ADB
elements for the same application, TLVs, etc. 

You made some changes (e.g., allowing zero TLVs in an ADB element), but I don't
think you caught all the cases. I do believe that some of this is specific to
the core protocol and not an issue that can be devolved to the application.

Can I suggest:
OLD
   A GAP message consists of a fixed header followed by a GAP payload.
   The payload of a GAP message is an Application Data Block (ADB)
   consisting of one or more block elements.  Each block element
   contains an application identifier, a lifetime, and a series of TLV
   objects for the application it describes.
NEW
   A GAP message consists of a fixed header followed by a GAP payload.
   The payload of a GAP message is an Application Data Block (ADB)
   consisting of one or more block elements.  Each block element
   contains an application identifier, a lifetime, and a series of zero
   or more TLV objects for the application it describes.
END

NEW
   Malformed GAP messages MUST be discarded by the receiver,
   although an error MAY be logged.
END

OLD
   In this format, the Application ID identifies the application this
   element describes; an IANA registry has been created to track the
   values for this field.  The Element Length field specifies the total
   length in octets of this block element (including the Application ID
   and Element Length fields).
NEW
   In this format, the Application ID identifies the application this
   element describes; an IANA registry has been created to track the
   values for this field.  More than one block element with the same
   Application ID may be present in the same ADB, and block
   elements with different Application IDs may also be present in 
   the same ADB.

   The Element Length field specifies the total length in octets of
   this block element (including the Application ID and Element 
   Length fields).
END

NEW
   The protocol rules for the mechanism including what ADB
   elements are present and which TLVs are contained in an ADB
   element are to be defined in the document that specifies the 
   application-specific usage.
END

> > Sections 2, 3, and 5
[snip]
> > In rather a topsy-turvy way you say in Section 5.1...
> >     Lifetimes SHOULD be set in such a way that at least three updates
> >     will be sent prior to Lifetime expiration.  For example, if updates
> >     are sent at least every 60 seconds, a Lifetime of 185 seconds may be
> >     used.
> > ...Isn't it the other way around? Shouldn't updates be sent according to
> > the lifetime value that has been set?
> 
> In this protocol, the lifetime sets the death date, which is consistent
> with the biological definition, and is thus intuitive. Perhaps you are
> confusing lifetime with refresh time.

Well, obviously I'm confused. I am not sure this is important, but where you
have:
     Lifetimes SHOULD be set in such a way that at least three updates
     will be sent prior to Lifetime expiration.  For example, if updates
     are sent at least every 60 seconds, a Lifetime of at least 210 seconds
     would typically used.
I would have expected:
     Refresh times SHOULD be set in such a way that at least three updates
     will be sent prior to Lifetime expiration.  For example, if the Lifetime
     is set to 210 seconds, then updates should be sent at least once every 
     60 seconds.

It probably isn't important. Just looked really odd that lifetime was driven by
refresh interval, rather than refresh existing to keep things alive.

BTW, stylistically "at least every 60 seconds" may be confused with "60 seconds
is the least time you should use".
It's probably language mutating from "at least once every 60 seconds".

NB s/typically used/typically be used/

> > Discarding the information when the timer expires needs to be clarified.
> > I think you mean "revert to the configuration-based state that was held
> > before the information was received." But you might mean "transition to
> > the complete absence of this information."
> 
> This is application and configuration specific.

I can see why you say that, but later you said 

> The former, it is a way of saying forget the data. I.e. replace the data
> with nothing and forget you ever heard it.

I think you may have said too much when you said "a lifetime that informs the
receiver how long to wait before discarding the data for that application"

How about you lose the "discarding" business because this is the domain of the
application. And say...

   a lifetime value that informs the receiver how
   long the lifetime of the application data is. What 
   an application does after the end of the lifetime
   is application-specific.

Actually, you have introduced another instance of this into 5.1 with 

   A sender deletes previously sent data by using the TLV
   lifetime mechanism as previously described in Section 3 

I don't like this "delete" business because it is up to the application how it
reacts. Maybe:

  A sender can inform the receiver that the data it previously
  sent is no longer current by using the TLV lifetime mechanism 
  as previously described in Section 3.

> > In Section 2 should you discuss message loss and the consequences? Any
> > mitigation you need to add? Is the retransmission to protect against
> > lifetime expiry supposed to handle this? If so, it doesn't seem
> > frequent enough to install the initial state? Do you need a two-speed
> > timer: rapid on first use and dropping down to slow for state retention?
> > Or would an Ack be easier? Probably, the final paragraph of Section 5.1
> > can be adapted to handle this case, and then you can put a forward
> > pointer in Section 2.
> 
> Again this is application specific.

What I am trying to get at is what assumptions an application can make, and what
it has to do for itself.

Clearly you don't want to include an Ack in what is fundamentally a
unidirectional protocol. But you do need to call out for the application writer
what level of reliability they can expect. That feels like:

  In normal circumstances, GAP's refresh mechanism will 
  ensure the delivery of application data except when the
  connection is so poor that OAM will likely announce it as
  having failed. However, applications that require reliable
  delivery may need to take their own additional measures
  for example, using rapid retransmission as described in
  Section 5.1.

But section 5.1 still gives me a problem...

     In some cases additional reliability may be desired for the delivery
     of a GAP message.  When this is the case, the RECOMMENDED procedure
     is to send three instances of the message in succession, separated by
     a delay appropriate to the application.  This procedure SHOULD be
     used, if at all, only for messages that are in some sense
     exceptional; for example when sending a flush instruction following
     device reset.

Is the "three instances" of the message rapid retransmission of the same MI (in
which it is a GAP function and not an application function) or three messages
with identical content but different MI (in which case it is an application
function)? 

The mention of using the MI to detect duplicate messages implies that this is
GAP function. If so, how does the GAP implementation know when to send fast and
when not? Note that the application generates ADB elements not GAP messages!

Additionally "SHOULD only be used, if at all" is an inversion of 2119 language.
I think there is a SHOULD NOT or a MUST NOT that you can write instead to
achieve cleaner guidance.

> > Section 3
> >
> > Are TLV Value fields padded up to a multiple of 4 octets?
> 
> This is application specific and we have not seen a need to
> enforce data alignment. Remember that L is in bytes and
> that packets may get aligned to an arbitrary byte boundary
> by the datalink.

No, I am quite happy with not having data alignment. I just wanted you to make
it clear in the document.

How about:

OLD
   The Length field specifies the length in octets of the
   Value field.
NEW
   The Length field specifies the length in octets of the
   Value field and may be any number of octets from 
   0 to xxxx
END

The value of xxxx is amusing, isn't it?

> > Section 4.4
> >
> >     The request is strictly advisory: the
> >     receiver SHOULD accept and act on the request, but MAY override it at
> >     any time.
> >
> > The "SHOULD" that you have used is stronger than "advisory". What is
> > more you don't mean "advisory" you mean "a request".
> >
> > I suggest you change this to
> >
> >     ...the receiver MAY accept and act on the request, MAY ignore the
> >     request, or MAY resume transmissions at any time according to
> >     implementation or configuration choices, and depending on local
> >     pragmatics.
> 
> We think that the texts are equivalent and the intent is clear.

I don't mean to appear difficult, but I find them very different. Some of my
issues are:

How can a request be advisory? I am just advising you that I am making a
request? How is "SHOULD" aligned with "advisory"? And under what circumstances
is the "MAY" allowed? If you really mean "at any time" then surely it is not a
"SHOULD".

Perhaps if you think the texts are equivalent you won't mind changing them?

> > In Section 6.1, you should discuss the operation of key exchange
> > protocols in the absence of IP protocols in the network.
> 
> We are are not aware of such a protocol.

This is precisely my point. It is very important for how this protocol can be
secured in the long term with the need for key exchange in the absence of IP
protocols. And to be clear, there *are* proposed use cases in the absence of IP.

Now, you may claim that having a pool of keys already configured (see 6.1) is
enough. And you might claim that key rotation does not need to be dynamic. And
all of that comes into my comment below about the need to explicitly address RFC
4107. But either you have to say that stuff, or you have to point out that
operation in the absence of IP reduces key agility.

> > Section 6.1
> >
> > I believe that RFC 4107 says that you need to discuss mechanisms for
> > dynamic key update, to make a recommendation as to whether such
> > mechanisms are needed, and to state how often keys need to be updated.
> 
> The first para of 6.1 says:
> 
> Multiple Key IDs may be active on the sending and receiving nodes
> simultaneously, in which case the sender locally selects a Key ID from
> this set to use in an outbound message. This capability facilitates key
> migration in the network.

Yes it does.
And 4107 goes a step further by asking:
How often do keys need to be migrated? (Answer "very infrequently" may be
sufficient)
How are new Key IDs introduced? (Answer "by manual configuration" may be
sufficient)

My observation is not to criticise what you have in the protocol, but to get you
to call out what you have and say that you are happy with it (and probably why
everyone else should be happy).