Re: Gen-ART LC review of draft-ietf-ccamp-inter-domain-rsvp-te-06.txt

"Adrian Farrel" <adrian@olddog.co.uk> Thu, 16 August 2007 19:05 UTC

Message-ID: <042601c7e036$e0ed1c80$0300a8c0@your029b8cecfe>
Reply-To: Adrian Farrel <adrian@olddog.co.uk>
From: Adrian Farrel <adrian@olddog.co.uk>
To: "Eric Gray (LO/EUS)" <eric.gray@ericsson.com>, JP Vasseur <jvasseur@cisco.com>, Arthi Ayyangar <arthi@nuovasystems.com>
Cc: Ross Callon <rcallon@juniper.net>, gen-art@ietf.org, "Brungard, Deborah A, ALABS" <dbrungard@att.com>, ccamp@ops.ietf.org
References: <941D5DCD8C42014FAF70FB7424686DCF01696137@eusrcmw721.eamcs.ericsson.se>
Subject: Re: Gen-ART LC review of draft-ietf-ccamp-inter-domain-rsvp-te-06.txt
Date: Thu, 16 Aug 2007 19:52:52 +0100
MIME-Version: 1.0
Content-Type: text/plain; format="flowed"; charset="iso-8859-1"; reply-type="original"
Content-Transfer-Encoding: 7bit
Sender: owner-ccamp@ops.ietf.org
Precedence: bulk

Hi Eric,

Many thanks.

> Summary:
> =======
>
> This document is not ready for publishing as a Proposed Standard.
> Some points require clarifications.
>
> Comments:
> ========
>
> The following paragraph (section 2) seems inconsistent with the
> text in subsequent sections: it seems that it is saying that any
> combination of the three methods (contiguous, nested, stitched)
> could be used in a _single_ LSP - yet at least one of these is
> defined as "end-to-end" in subsequent text (contiguous).

[SNIP]

Point taken.

What we were trying to convey is that, domain by domain, you may apply any 
one of the techniques.

We can re-work this text to make it clearer and possibly include a (rather 
trite) example.

> --------------------------------------------------------------------
> Still related to the above, but as a side note, the fact that it
> is not immediately clear which of these is the intention in this
> document is an indication of a potential for some very serious
> interoperability issues.  Some of the subsequent text mentions
> the potential for a "policy failure" that SHOULD fail the setup.
>
> In addition to not being absolutely clear (when would it be okay
> not to fail, is there a way to indicate that an alternative is
> allowed, etc?), this implies that it is possible for a domain to
> have a policy (for example, against contiguous LSP setup) that is
> going to result in complete non-interoperation with a domain (or
> implementation) asking for contiguous LSP setup.
>
> This sort of policy conflict may easily arise if one implementer,
> or operator, interprets the specification one way and another (or
> others) interprets it another in terms of expected "end to end"
> behavior and combinations of the types of "spanning" LSPs.

The ability to fail to interoperate at the domain level is intended to be an 
operator choice. If, for whatever reason, the operator only wants to allow 
one of the mechanisms, and the ingress domain insists on another mechanism, 
then it is important to provide for the rejection of the LSP setup.

The most obvious example (to me) is that an ingress may request a 
contiguous LSP because it wishes to exert maximal control over the LSP's 
path and (in particular) to control when re-optimization takes place. OTOH, 
the operator of a transit domain may decide that only (for example) LSP 
stitching is allowed for exactly the reason that it gives the operator the 
chance to reoptimize their own domain under their own control. Now there is 
a stand-off. The ingress operator has a choice:
- find another path
- relax his requirements
- fail to provide the service.
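
To make the stand-off concrete, here is a rough sketch of the check a transit 
domain border might apply. It is an illustration only: the names Mechanism, 
TransitPolicy, and admit are hypothetical and are not taken from the I-D, and 
the "insisted" argument simply models whether the ingress is willing to relax 
its requirements.

```python
from dataclasses import dataclass
from enum import Enum


class Mechanism(Enum):
    CONTIGUOUS = "contiguous"
    NESTED = "nested"
    STITCHED = "stitched"


@dataclass(frozen=True)
class TransitPolicy:
    # Mechanisms the transit domain operator is willing to use (assumed model).
    allowed: frozenset


def admit(requested, insisted, policy):
    """Return the mechanism to use, or None if the LSP setup is rejected.

    requested: mechanism the ingress asked for.
    insisted:  True if the ingress will not accept an alternative.
    """
    if requested in policy.allowed:
        return requested
    if insisted or not policy.allowed:
        return None  # stand-off: the transit domain rejects the setup
    # The ingress relaxed its requirement: use a mechanism the domain allows.
    return sorted(policy.allowed, key=lambda m: m.value)[0]


stitch_only = TransitPolicy(allowed=frozenset({Mechanism.STITCHED}))
print(admit(Mechanism.CONTIGUOUS, True, stitch_only))
print(admit(Mechanism.CONTIGUOUS, False, stitch_only))
```

With a stitching-only transit domain, an ingress that insists on a contiguous 
LSP is rejected, and one that relaxes its requirement gets a stitched LSP.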

But the point about implementation is more important.
The choice is not very functional if the ingress LSR is not able to support 
all modes. Fortunately, that is easy to implement (it is only a flag).
The choice is, however, fully functional if the LSRs at the borders of the 
transit domain only support one mechanism (since that must be the mechanism 
that the domain operator is deploying).

We can clarify this choice, but I don't believe any functional change is 
required.

> ---------------------------------------------------------------------
> The last two paragraphs in section 2 (preceding the heading for
> section 2.1) are confusing and (potentially) contradictory: one
> appears to say that signaling extensions (for control/selection
> of methods) are in scope while specific details are not, but the
> other seems to say otherwise.
>
> I suspect that the word "control" is where I am getting hung-up.
> If it is intended that the specifics of each method are out of
> scope in this document - once a method is selected - then it is
> hard to see how "control [...] of the three signaling mechanisms"
> is in scope but "specific protocol extensions required to signal
> each LSP type" is not.
>
> Perhaps "control and" should be removed?

I think the issue is with the associativity of "and".

We mean:
   This document describes the RSVP-TE signaling extensions
   required to control which of the three signaling mechanisms
   is used and to select which of the three signaling mechanisms
   is used.

I agree that "control and" is superfluous.

> ----------------------------------------------------------------------
> In section 3, under bullet 4a - on page 6 - you use "SHOULD" in
> saying whether or not a PathErr message is to be generated.  Why
> - or under what circumstances - would it be appropriate not to do
> so?

Hmmm. RFC 3209 includes a variety of reasons to generate a PathErr carrying 
the "Routing Problem" error. Some of these are related to path computation 
failure (including next hop selection), and the referenced text relates to 
path computation failure. We do not feel that we can redefine (or justify!) 
the behavior described in RFC 3209. In addition, the referenced I-D 
([INTER-DOMAIN-PD-PATH-COMP]) also describes this behavior, and needs to be 
fixed to include why this is a "MAY".

My opinion is that at a domain boundary we should make the same 
considerations as in RFC 4208 sections 3.1 and 3.2 (note the error in 3.2 
with lower case "should"). The reasoning here, I believe, is that a Path 
message could be interpreted as a network probe, and a PathErr provides 
information about the network capabilities and policies. Therefore, it may 
be an administrative/security policy to not send a PathErr under certain 
routing failure conditions.

We should clarify this six (count them!) ways:
- reference to 3209
- reference to [INTER-DOMAIN-PD-PATH-COMP]
- explanation of when an application MAY decide to not send PathErr
- text in the security section
- explanation in [INTER-DOMAIN-PD-PATH-COMP] of when an
  application MAY decide to not send PathErr
- text in the security section of [INTER-DOMAIN-PD-PATH-COMP]

> ----------------------------------------------------------------------
> Similarly, on page 7, you use the word SHOULD with respect to ERO
> processing in a Path message.  When (or under what circumstances)
> would it make sense not to do as specified?  What alternatives are
> there?
>
> This same observation then applies to several (all?) of the actual
> procedures - although several of them do provide some degree of
> amplification as to when (or why) a specific step might not apply.

I think that the numbered paragraphs need to be updated as:
1. first SHOULD -> MUST; second SHOULD remains with added explanation
2. SHOULD -> MUST
3. SHOULD remains with added explanation
6. SHOULD remains with added explanation

The changes of SHOULD to MUST need consultation with my co-authors and with 
the WG.

> Perhaps it would be best to clarify in the steps themselves and -
> rather than say "SHOULD" in the introductory statement, it might
> be enough to simply say that the following procedures apply (thus
> avoiding the issue of using either SHOULD or MUST).

I agree. The meta-paragraph (before the numbered points) doesn't make sense 
since "SHOULD do a MUST" is meaningless. We should drop the RFC 2119 language 
from that text.

> ----------------------------------------------------------------------
> After "However:" in section 3.2, I would suggest some form of
> modification or qualification of the first bullet similar to:
>
> - A domain border node MUST NOT passively suppress propagation
>  of a PathErr message.
>
> Clearly, if the device applies a successful crankback approach,
> it does not make sense for it to propagate the PathErr anyway.

OK. This was obviously not sufficiently clear.
It was meant to imply that the PathErr must
- either be propagated
- or dropped because some other action has been applied
In other words, a PathErr must not simply be dropped according to policy, 
whim, or misadventure.
(Actually, misadventure can occur because of the nature of IP :-)
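
Put as a sketch, the intended rule for a domain border node is roughly the 
following. This is only an illustration: handle_path_err and the reroute 
callback are hypothetical names, and "reroute" stands in for whatever other 
action (crankback, for example) the node may apply.

```python
def handle_path_err(path_err, reroute=None):
    """Decide what a domain border node does with a received PathErr.

    reroute: optional callable (hypothetical) that attempts some other
             action, such as crankback, and returns True on success.
    """
    if reroute is not None and reroute(path_err):
        # Some other action was applied, so the PathErr need not go further.
        return "absorbed"
    # No other action succeeded: the PathErr is propagated, never just dropped.
    return "propagate"
```

The function has no branch that discards the PathErr without either acting on 
it or passing it on, which is what "passively suppress" was meant to rule out.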

We can try to expand the text to make "passively suppress" less opaque.

> ----------------------------------------------------------------------
> In the third bullet of section 7, page 15, "do not need to be
> considered" does not follow from "are likely to be upgraded".
>
> True - if they are upgraded - then they are unlikely to need
> to be considered in backward compatibility discussion.  But
> "likely to be" is NOT the same as "are".
>
> The point I think you are trying to make is that upgrading a
> border LSR to include this capability is an essential enabling
> requirement for useful operation of this capability - and that
> makes it unnecessary to consider such upgraded border LSRs in
> backward compatibility.

Yes, that is exactly the point.
Of course, it is possible that someone will try to exercise the function 
without upgrading the domain border LSRs, but then it wouldn't work, and is 
not expected to work.

We will try to polish the text.

======================================================================
> NITs:

[SNIP]

> In section 3.1, as currently formatted, the "factors including:"
> seems to be "Farrel, Ayyangar and Vasseur" - an artifact of the
> way that your document only has a single empty text-line between
> the last line of content-text and the page footer.  This is only
> a problem when reading the electronic version and - most likely -
> a (humorous) temporary problem that will fall out in subsequent
> versions.  If not, it might be a good idea to force the line with
> a colon ending it to not orphan the next line of content-text.
>
> In the version I read, this occurs at the bottom of page 6.

How do you know this is not how we intended it to be read?

[SNIP]

Very many thanks.
Adrian
