[NSIS] AD review: draft-ietf-nsis-ntlp-09

Hi,

I have finally finished my review of GIST. I have a number of 
comments and questions with potentially serious implications, so let's 
start with them. Please remember that I have only read the document once, 
plus some spot checks while writing this up. So there might be cases where 
a simple reference to where things are defined can resolve the issue.

I will also be on vacation 3rd-11th of June so do not expect any 
response from me during this period.

Major issues
------------

M1. Congestion Control

This affects text not only in 5.3.3 but also in 7.1.3 and possibly other 
places. My general concern is that the congestion control mechanisms 
described in the specification are underspecified.

First, in 5.3.3 I don't see any normative minimum values, or even 
recommended values, for T1 and T2 that will be safe to deploy on the 
Internet. I don't find it acceptable that the developer needs to 
investigate which values are safe to use and which are not. There 
should also be some documented criteria for when it is acceptable to go 
beyond these values. My concern here is that retransmission runs 
wild and causes far too many packets to be sent.
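
To make the concern concrete, here is a minimal Python sketch of how the 
choice of T1 and T2 drives the number of packets sent for a single 
unanswered message. It assumes binary exponential backoff starting at T1 
and capped at T2, which is an assumption about the intended scheme; the 
function name and the numeric values are hypothetical and are not taken 
from the draft.

```python
def retransmission_times(t1, t2, horizon):
    """Return the send times (in seconds) of one message that is never
    answered, assuming binary exponential backoff from t1 capped at t2.

    t1, t2 and horizon are illustrative inputs, not values from the draft.
    """
    times = [0.0]          # the initial transmission
    interval = t1          # wait before the first retransmission
    now = 0.0
    while now + interval <= horizon:
        now += interval
        times.append(now)
        interval = min(interval * 2, t2)  # double the wait, but never exceed t2
    return times


if __name__ == "__main__":
    # Hypothetical values: compare a conservative and an aggressive T1.
    for t1 in (0.5, 0.05):
        sent = retransmission_times(t1, 4.0, 10.0)
        print(f"T1={t1}s T2=4.0s -> {len(sent)} packets in 10s")
```

Without a normative floor for T1, nothing stops an implementation from 
picking a value that multiplies the packet count in the first seconds 
after a loss.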

Secondly, there are also no values documented for the rate control. I 
think it is necessary to document what Internet-safe values are here so 
that one does not cause problems. In addition, it seems a bit simplistic 
to use a token bucket with parameters selected based on the local 
link, as GIST clearly sends messages beyond the local link. Thus one 
might have to consider being a bit smarter and more adaptive to what is 
seen for different flows.
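
For reference, a token bucket of the kind discussed here can be sketched 
in a few lines of Python. This is a generic illustration, not the 
draft's algorithm: the class name, the rate and burst parameters, and 
the explicit `now` argument are all assumptions made for the example.

```python
class TokenBucket:
    """Generic token bucket: `rate` tokens are added per second, up to a
    maximum of `burst`. Each message sent consumes one token.

    Time is passed in explicitly so the behaviour is deterministic.
    """

    def __init__(self, rate, burst):
        self.rate = rate        # tokens added per second (hypothetical value)
        self.burst = burst      # bucket depth, i.e. largest allowed burst
        self.tokens = burst     # start with a full bucket
        self.last = 0.0         # time of the previous call to allow()

    def allow(self, now):
        """Return True if a message may be sent at time `now` (seconds)."""
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Note that both parameters are static. Nothing in such a limiter reacts 
to loss or delay on the path beyond the local link, which is the 
limitation raised above.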

Third, the implications and congestion issues of local repair seem to 
have been brushed over. Section 7.1.3 does indicate that you need to take 
care, but nothing more. Is there some potential for aggregation of the 
queries to minimize the load and achieve quicker convergence?

M2. Security functions for d-mode

When reading the security considerations and the specification of 
security properties, I get a bit frightened that there is no 
authentication mechanism specified for d-mode. It seems that there 
is a need for authentication of d-mode queries etc. to ensure that 
an NSIS node is talking to someone that really has the right to 
perform the query and association creation. Not having such a mechanism 
seems to prevent building a system where the domain's edge nodes handle 
the primary security and the internal nodes only accept GIST messages 
that are authenticated by nodes belonging to the domain.

Section 8.6: I am also concerned that if no secured MA exists 
between two GIST nodes, an on-path man-in-the-middle can prevent a 
secured MA from ever being established.

M3. NAT traversal for GIST.
Section 7.2:

What plans are there for clearly defining NAT behavior for GIST and the 
relevant NSLPs? I think NSIS should avoid repeating the common mistake 
of not specifying NAT behavior sufficiently. Thus I think there is a 
need to specify in great detail what is going to happen at the translation.

M4. IANA consideration section

I think the IANA consideration section is lacking in several ways.

A. Decision criteria for IESG, Expert review or other policies are not 
written. The WG should provide some guidelines for what is acceptable to 
be registered and what is not.

B. There is no explanation for the large blocks of reserved codes that 
exist.

C. There are no rules for what must be included in an application for 
assignment in any of these spaces.

D. The NSLP ID policy. Is it really envisioned that GIST will only be 
used for NSLPs for which it is suitable that the IESG makes the decision 
on them? Wouldn't a better policy be expert review or something else 
with a somewhat lower bar? Having all proprietary users of GIST run to 
the IESG for approval seems strange.

Minor Issues
------------

L1. The lack of a collected ABNF makes it hard to verify. Has the 
present ABNF been run through any of the syntax checkers available at:
http://tools.ietf.org/inventory/verif-tools ?

L2. Section 3.3, page 12, second paragraph:
"  In addition, it should be noted that NAT traversal almost certainly
    requires transformation of the MRI field in GIST messages (see
    Section 7.2).  Although the transformation does not have to be
    defined as part of the standard, the impact on existing GIST-aware
    NAT implementations should be considered."

It seems that not defining the NAT behavior for GIST can potentially 
cause interop problems and make it very hard to use. What are the plans 
for addressing this? Are there already existing GIST-aware NATs?

L3. Section 4.1.2:
"Reliability: This attribute may be 'true' or 'false'.  For the case
       'true', messages MUST be delivered to the signaling application in
       the peer exactly once or not at all; if there is a chance that the
       message was not delivered, an error MUST be indicated to the local
       signaling application identifying the routing information for the
       message in question."

How does one evaluate "if there is a chance that the message was not 
delivered"? I think there is a need for a clear definition of when the 
normative requirement applies.

L4. Has anyone considered whether one can run draft-ietf-pmtud-method-06 
style MTU discovery with GIST D-mode?

L5. Section 5.1, page 33:
"It MUST echo the
    MRI (with inverted direction), SID, and Responder-Cookie if the
    Response carried one;"

I don't think "with inverted direction" is really clear here. What does 
it mean to invert the direction of the MRI?

L6. Section 5.2.1 and A.1:

I have a problem with Section 5.2.1 first declaring that these 
fields are present without defining them fully, e.g. the number of bits. 
Section A.1 is also not a complete definition of the fields in the 
message. I think an informative reference in 5.2.1 is okay, provided 
that A.1 then actually contains a complete definition.

I would also like to point out that unless you always write out the 
number of bits a field has, and list the fields in the order they appear 
in the diagram, you will confuse people reading it with text-to-speech 
synthesis, like Sam Hartman.

Do consider that these comments also apply to other protocol elements 
defined in Appendix A, such as A.3.3.

L7. Section 5.2.2:
"If the message is downstream, the IP-TTL MUST be set to the TTL
          that will be set in the IP header for the message (if this can
          be determined), or else set to 0."

How is a message "downstream"? I think you need to specify whether this 
refers to the message being sent in the downstream direction.

L8. Section 5.3.2:

"   GIST may send messages addressed as {flow sender, flow receiver}
    which could make their way to the flow receiver even if that receiver
    were GIST-unaware.  These should be rejected (with an ICMP message)
    rather than delivered to the user application (which would be unable
    to use the source address to identify it as not being part of the
    normal data flow).  Therefore, a "well-known" port is required.
"

I am having difficulty understanding how the end-peer could distinguish 
between GIST messages that go the whole way to the end-peer by 
design and those that should be rejected by the end-peer. Please explain 
what the intention is here.

L9. Section 5.4.1: DCCP:
The note on DCCP forgets one important factor. DCCP does not provide any 
reliability.

L10. Section 5.7.1:
"However, the object
    contents MUST be retained only for the duration of the Query/Response
    exchange and any following association setup, and afterwards
    discarded."

Is "afterwards" referring only to completed or failed association setup?

L11. Section 5.7.1:
"The responding node MUST verify this to ensure that no bidding-down
    attack has occurred; see Section 8.6 for further discussion."

As Section 8.6 states, this is not sufficient for detecting an on-path 
man-in-the-middle attack. So I think the wording of the above sentence is 
too strong.

L12. Section 5.7.2:

What address is used in the TCP connection? The port is specified but 
not the address.

L13. Section 5.7.3:

Why is TLS 1.0 made mandatory and not TLS 1.1?

L14. Section 5.8.1.2:
"In this case, it MUST use the signaling
    source address for the IP source address in order to receive the ICMP
    error."

How can the sender of a message, sending a query using d-mode with the 
flow source address, determine that it is a too small MTU that causes the 
messages to be lost, rather than anything else? I think the use of MUST 
in this sentence is wrong. Is it anything more than an informative 
statement saying: to be able to receive ICMP messages, the message sender 
needs to use its own address?

L15. Section 5.8.1.3:

Can someone please explain why an upstream request which has traversed a 
non-GIST-aware router can be trusted less than a downstream one?

L16. Section 6.4, Rule 4:

    Rule 4:

    send MA-Hello message
    restart NoHello timer

Why does the timeout of the send-Hello timer reset the NoHello 
timer? Shouldn't the NoHello timer be based only on receiving Hello 
messages from the other side?

Also what is the recommended Hello frequency? It would be good to have 
some discussion on what reasonable values are. This is also part of the 
congestion discussion for GIST.

Section 7.1.2: GIST probing:

One more missing recommendation on values for protocol functionality.

L17. Section 7.1.4:

Bullet 1: "The signaling
        application at E1 MAY begin local repair immediately, or MAY
        propagate a notification upstream to D1 that re-routing has
        occurred."

There seems to be an interaction between GIST and the signalling 
application regarding notification of the change. Are there any 
recommendations for signalling applications on implementing the necessary 
semantics to perform that notification?

L18. Section 7.2, page 73, last paragraph:

    A NAT will intercept datagram mode messages with the normal
    encapsulation containing such echoed NAT-Traversal objects.

It is not obvious that a NAT will do this for d-mode. Isn't a NAT which 
has not been made GIST-aware simply going to translate the IP/UDP headers 
and forward the packet despite the router alert?

L19. Section 7.2, page 74:

"Note that Confirm messages are not translated."

Why aren't they? Also, where is that specified? And what is really meant 
by "not translated"?

L20. Section A.2:
The "r" bits are not explicitly written to fall under the rules in the 
last paragraph of that section.

L21. Section A.2.1:

What is the treatment of the "reserved" AB=11 for a GIST node?

L22. Section A.3.1:

The "MRM-ID" field does not have the same name as in the IANA 
considerations section.

L23. Section A.3.1.2:
"D - Direction (always 0 for "downstream")"

The use of "always" in the above makes it unclear what is meant.

L24. Section A.3.8:
"the number of GIST payloads translated by the NAT"

The NAT? There might be several NATs on the path and it is not clear how 
that is handled. One concern would be if different NATs translate 
different fields.

L25. Section 4.1

The Info Count field and Info fields. I think the rules regarding the 
ordering of these are too weak. As the order and existence of fields 
depend on the error code and sub-error code, it is important to 
clarify that with normative strength. So please state the rules that 
apply for resolving the additional data fields, and for each error code 
be clearer about the additional data fields. Explicitly stating "none" 
for those that have none is a good idea.

Nits
----

N1. There is quite a list of abbreviations used in this document without 
being spelled out at their first usage. Please correct. I have spotted 
the following ones:
- NSLPID
- RAO
- SCD
- N-SM
- C-mode
- Q-mode

N2. Section 3.4, last paragraph:

"A consequence of this
    design is that signaling applications should choose SIDs so that they
    are cryptographically random, and should not use several SIDs for the
    same flow unless strictly necessary, to avoid additional load from
    routing state maintenance."

I think there is a need for a reference to a document on how to generate 
cryptographically random numbers. Is RFC 4086 a useful reference?

N3. section 6.1:

"   Rule 4 (rx_Data):

    if (node policy will only process Data messages with matching
        routing state) then
      send "No Routing State" error message
    else
      pass directly to NSLP"

It seems to me that the expression within the IF is wrongly defined. I 
think it should read:

if (routing state exists) then
    pass to NSLP
else
    Send error msg.
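
One way to reconcile the two readings is to test both the node policy 
and the presence of routing state. The Python sketch below only 
illustrates that combined condition and is an assumption about the 
intent, not text from the draft; the function and parameter names are 
hypothetical, and the returned strings stand in for the actions named 
in the rule.

```python
def rx_data_action(has_routing_state: bool, policy_requires_state: bool) -> str:
    """Decide what to do with a received Data message.

    An error is sent only when the node policy insists on matching routing
    state AND no such state exists; otherwise the message goes to the NSLP.
    """
    if policy_requires_state and not has_routing_state:
        return 'send "No Routing State" error message'
    return "pass to NSLP"
```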

N4. Figure 7: The Established-to-Established arrow has 
"[!confirmRequired]", and it is confusing which of the cases 
above it applies to.

N5. Section 7.1.2:
Please include a reference for OSPF.

N6. Section 7.2:

It uses "public" and "private" when referring to the different sides of 
a NAT. The BEHAVE WG uses "external" and "internal" instead. The 
reason is that the external side of a NAT may still be in a private 
address space.

N7. Section 11:
Refs 6 and 8 need to be updated.

N8. Section A.3.5:
"MUST not" -> "MUST NOT"

N9. Section A.4.4.9: subcode 3:

"Invalid Object" -> "Invalid Object Type"?

Cheers

Magnus Westerlund

Multimedia Technologies, Ericsson Research EAB/TVA/A
----------------------------------------------------------------------
Ericsson AB                | Phone +46 8 4048287
Torshamsgatan 23           | Fax   +46 8 7575550
S-164 80 Stockholm, Sweden | mailto: magnus.westerlund@ericsson.com
