Re: Routing Directorate comments on draft-ietf-ccamp-automesh-01

JP Vasseur <jvasseur@cisco.com> Fri, 22 September 2006 13:38 UTC

In-Reply-To: <0f3f01c6dd9d$e30e7fa0$0a23fea9@your029b8cecfe>
References: <0ae101c6ce93$8e36fac0$89849ed9@your029b8cecfe> <A2912454-4C2A-439E-8053-C247B6FDA987@cisco.com> <0f3f01c6dd9d$e30e7fa0$0a23fea9@your029b8cecfe>
Message-Id: <F39C2571-30DE-4F4E-A4A5-7A09BF2C1826@cisco.com>
Cc: ccamp@ops.ietf.org, Ross Callon <rcallon@juniper.net>, rtg-dir@cisco.com
From: JP Vasseur <jvasseur@cisco.com>
Subject: Re: Routing Directorate comments on draft-ietf-ccamp-automesh-01
Date: Fri, 22 Sep 2006 09:22:57 -0400
To: Adrian Farrel <adrian@olddog.co.uk>

Hi Adrian,

On Sep 21, 2006, at 12:48 PM, Adrian Farrel wrote:

> Hi JP,
>
> Thanks for addressing the comments. I have forwarded these to the  
> Routing Directorate and copied them on this email to let them  
> respond if they want.

OK

> But here are my comments:
>

in line,

>>> 1) The Tail-end name field facilitates LSP identification. Is this
>>> a new form of LSP identification?
>>> If it is not new, then there should be a reference to RFC3209 and a
>>> statement of which RFC3209 fields are mapped to this IGP field.
>>> If it is new then there is a significant concern that a new
>>> identification is being introduced when it is not needed.
>>
>> As indicated in the document the string refers to a "Tail-end" name,
>> not a TE LSP name: thus it does not replace the session name of the
>> SESSION_ATTRIBUTE object defined in RFC3209.
>
> Hmmm, yes it is not an LSP name, but recall that the LSP is  
> identified by a combination of Session and Sender Template, and  
> that the Session includes the destination IP address. In Section  
> 3.2 I see:
>   - A Tail-end name: string used to ease the TE-LSP naming.
> and in Section 4.1:
>   - A Tail-end name: a variable length field used to facilitate the TE
>   LSP identification.

Ah, I see your point now. Just bad wording; it should have read:

A Tail-end name: name of the Tail-end LSR.

The idea is not to use that field for LSP identification per se but
to ease management, troubleshooting, ...
For example, an implementation could form the session name based on
the strings in these fields.

+ see below

>
> These definitions seem to imply that the tail-end name is used as  
> an identifier for the LSP. The question that will be asked is: How  
> does this identification of an LSP differ from the conventional  
> identification of the LSP?  Given that you also have:
>   - A Tail-end address: an IPv4 or IPv6 IP address to be used as a
>   tail-end TE LSP address by other LSRs belonging to the same mesh-
>   group
> it appears that the tail-name is superfluous information.
>

Well, having the name definitely helps with management,
troubleshooting, ...

> So, perhaps the name is present for diagnostic purposes? Perhaps it  
> is there to ease OAM? But it does not seem to play any role in the  
> protocol procedures as it is not explicitly mentioned later in the  
> I-D (e.g. Section 5).
>

OK, let me try to clarify by adding the following text:

The aim of the Tail-end name field is to provide a way to quickly
identify the tail-end LSR originating the TE-MESH-GROUP; it could be
used for various purposes such as troubleshooting, management and
so on. It does not interfere by any means with the TE LSP attributes
used to identify a TE LSP.

Does that clarify ?

> How would a node behave if it received a mesh group advertisement  
> that indicated a tail-end address that did not appear to match its  
> record of the tail-end name?
>
>>> 2) The document mentions that the number of mesh groups is limited
>>> but potentially (depending on encoding) provides for binary
>>> encoding for 2^32-1 groups (although this might be constrained by
>>> OSPF's limit of a TLV size to 2^16 bytes).
>>> The document (and the authors) state that scaling of these
>>> extensions is not an issue because only a small number of mesh
>>> groups are likely to be in existence in a network, and any one
>>> router is unlikely to participate in more than a very few.
>>> There are two concerns:
>>> a) Whenever we say that something in the Internet is limited,
>>> history usually proves us wrong.
>>
>> And that's undoubtedly good news :-)
>>
>>> Indeed, there is already a
>>> proposal (draft-leroux-mpls-p2mp-te-autoleaf-01.txt) that uses a
>>> similar mechanism for a problem that would have far more groups.
>>
>> Two comments:
>> - Mesh groups are used to set up TE LSP meshes. If we consider, let's
>> say, 10 meshes comprising 100 routers each, that gives us 99,000 TE
>> LSPs. One can easily see that the number of meshes is unlikely to
>> explode in the foreseeable future. If it turns out to be the case,
>> we'll have other scalability issues to fix before any potential issue
>> with the IGP.
>
> What about 100 meshes comprising 10 routers each?

Note that would still be very reasonable ;-)

> I make that only 9,000 TE LSPs.
>
> So clearly the scaling of MPLS-TE is not directly related to the  
> scaling of automesh.
>

Well you see my point ... since these extensions are used to set up  
TE LSPs, in the vast majority of the realistic cases you'll end up  
with scalability concerns with RSVP before seeing any IGP scalability  
issue.
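
As a quick check of the numbers quoted above, here is a minimal sketch.
It assumes each mesh is a full mesh of unidirectional TE LSPs, so n
routers need n * (n - 1) LSPs, and that the meshes are independent and
equally sized. The function names are purely illustrative.

```python
def full_mesh_lsps(routers_per_mesh: int) -> int:
    """Unidirectional TE LSPs needed to fully mesh n routers: n * (n - 1)."""
    return routers_per_mesh * (routers_per_mesh - 1)


def total_lsps(meshes: int, routers_per_mesh: int) -> int:
    """Total TE LSPs across several equally sized, independent meshes."""
    return meshes * full_mesh_lsps(routers_per_mesh)


print(total_lsps(10, 100))  # 10 meshes of 100 routers each -> 99000
print(total_lsps(100, 10))  # 100 meshes of 10 routers each -> 9000
```

The LSP count grows with the square of the mesh size but only linearly
with the number of meshes, which is why the two scenarios differ by a
factor of about ten.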

> What this comes down to is your statement about how automesh will  
> be used. I think we can all accept that this is the problem space  
> that you intend to deploy in, and that is great. But the original  
> point from the Routing Directorate was that there is nothing in the  
> I-D that imposes this restriction. So how can we say that the  
> protocol extensions will scale?

And that is true of pretty much every protocol: one could always
come up with a scenario where bad usage of the protocol or a broken
implementation may be a concern. Anyway, let me try to propose some
text to close on this:

OLD:

It is expected that the number of mesh-groups be very limited (to at  
most 10 or so).
Moreover, TE mesh-group membership should not change frequently: each  
time an LSR joins or leaves a new TE mesh-group.

NEW:
The aim of the IGP extensions proposed in this document is to ease
the provisioning of TE meshes, the number of which is generally very
limited (10 or so at most) and should stay of this order of
magnitude, at least for the foreseeable future. Furthermore, such TE
meshes are not expected to change frequently, and thus the TE
mesh-group membership is likely to be very stable (it changes only
when an LSR joins or leaves a TE mesh-group, which is not a frequent
event). An implementation SHOULD support mechanisms to control the
frequency at which an LSR joins/leaves a particular TE mesh-group.

Does that address your concern ?
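
To illustrate the kind of mechanism the last sentence of the NEW text
has in mind, here is a minimal sketch of a per-group hold-down timer.
This is only one possible approach, not something the draft specifies:
the class name, the min_interval parameter and its value are all
hypothetical.

```python
class MembershipDamper:
    """Hold-down timer limiting how often a mesh-group join/leave is advertised.

    min_interval is a hypothetical, operator-chosen value in seconds; the
    text discussed in this thread does not specify one.
    """

    def __init__(self, min_interval: float) -> None:
        self.min_interval = min_interval
        self._last_change = {}  # mesh-group number -> time of last accepted change

    def allow_change(self, group: int, now: float) -> bool:
        """Return True if a join/leave for `group` may be advertised at time `now`."""
        last = self._last_change.get(group)
        if last is not None and now - last < self.min_interval:
            return False  # too soon: suppress (or defer) this membership change
        self._last_change[group] = now
        return True


damper = MembershipDamper(min_interval=30.0)
print(damper.allow_change(group=7, now=0.0))   # first change for group 7
print(damper.allow_change(group=7, now=10.0))  # inside the hold-down window
print(damper.allow_change(group=7, now=45.0))  # after the window has elapsed
```

The timer is kept per mesh-group so that a flapping group does not
delay a legitimate change in another group.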

>
>> - More importantly, the dynamics of joining a TE mesh are such that
>> IGP updates are used to advertise TE mesh group membership changes
>> (join or prune), which are indeed expected to be very infrequent.
>
> Again, the concern raised is that the problem space you intend to  
> deploy in is, indeed, limited in this way. All good. But how can we  
> say whether the protocol extensions will be used differently in the  
> future? What controls are there over constructing a mesh where  
> joins and prunes are frequent?
>
>>> b) The I-D does not itself impose any reasonable limits on the
>>> number of groups with the potential for a single router (by
>>> misconfiguration, design, or malice) advertising a very large
>>> number of groups.
>>> Thus, it appears that the scaling concerns are not properly
>>> addressed in this I-D.
>>
>> Not sure I see the point here. If, indeed, a large number of TE MESH
>> GROUPs were advertised, this would not impact the other LSRs since
>> they would not create any new TE LSPs trying to join the new
>> TE-MESH-GROUP. In terms of the amount of flooded information, this
>> should not be a concern either (handled by routing). We clarified
>> this in the security section.
>
> The impact on the other LSRs is exactly the flooding question. Covering
> that in the security section is fine for the misconfiguration and  
> malice cases.
>
>>> 3) The document mentions that "The TE-MESH-GROUP TLV is OPTIONAL
>>> and must at most appear once in a OSPF Router Information LSA or
>>> ISIS Router Capability TLV." but for addition/removal it mentions
>>> "conversely, if the LSR leaves a mesh-group the corresponding entry
>>> will be removed from the TE-MESH-GROUP TLV."
>>> What are these "entries" referring to - that there is a top-level
>>> TE-MESH-GROUP TLV with multiple sub-TLVs (but the document mentions
>>> "No sub-TLV is currently defined for the TE-mesh-group TLV") ?
>>>
>>> AF>> My comment on this is that the definition of the TLVs seems
>>> AF>> unclear.
>>> AF>> From figure 2, it appears that some additional information
>>> AF>> can be present in the TLV after the fields listed, and (reading
>>> AF>> between the lines) it would appear that this additional
>>> AF>> information is a series of repeats of the set of fields to
>>> AF>> define multiple mesh groups.
>>> AF>> This could usefully be clarified considerably.
>>
>>
>> You're absolutely right. The figures have been modified:
>>
>> (example shown below):
>
> [SNIP]
> Looks good to me.
>
>>> AF>> But it is now unclear to me whether a single router can be a
>>> AF>> member of IPv4 and IPv6 mesh groups. It would seem that
>>> AF>> these cannot be mixed within a single TLV, and multiple
>>> AF>> TLVs (one IPv4 and one IPv6) are prohibited.
>>
>> OK the text requires some clarification. What is prohibited is to
>> have two IPv4 sub-TLV or two IPv6 sub-TLV but one of each is
>> permitted. New proposed text to clarify:
>>
>> The TE-MESH-GROUP TLV is OPTIONAL and at most one IPv4 instance and
>> one IPv6 instance MUST appear in a OSPF Router Information LSA or
>> ISIS Router Capability TLV. If the OSPF TE-MESH-GROUP TLV (IPv4 or
>> IPv6) occurs more than once within the OSPF Router Information LSA,
>> only the first instance is processed, subsequent TLV(s) will be
>> silently ignored. Similarly, If the ISIS TE-MESH-GROUP sub-TLV (IPv4
>> or IPv6) occurs more than once within the ISIS Router capability TLV,
>> only the first instance is processed, subsequent TLV(s) will be
>> silently ignored.
>
> OK. That's fine.
> I think you want to make a couple of changes:
> - "at most one instance MUST appear" is ambiguous since it will
>  be confused with "an instance MUST appear". I suggest you
>  reword as "MUST NOT include more than one of each of"
> - "If the OSPF TE-MESH-GROUP TLV (IPv4 or IPv6) occurs
>  more than once" should really be phrased as "If either the
>  IPv4 or IPv6 OSPF TE-MESH-GROUP TLV occurs more
>  than once".  Ditto for the IS-IS sub-TLV.
> - Two instances of "will be silently ignored" should read "SHOULD
>  be silently ignored"

Fixed, thanks !
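
For what it is worth, the receive-side rule agreed above (process the
first IPv4 and the first IPv6 instance, silently ignore any repeats)
can be sketched as follows. The type codes are hypothetical
placeholders, not assigned values, and TLVs are modelled as simple
(type, value) pairs rather than parsed from an LSA.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical type codes for illustration only; real values are IANA-assigned.
TE_MESH_GROUP_IPV4 = 1001
TE_MESH_GROUP_IPV6 = 1002
MESH_TYPES = {TE_MESH_GROUP_IPV4, TE_MESH_GROUP_IPV6}

Tlv = Tuple[int, bytes]


def select_mesh_group_tlvs(tlvs: Iterable[Tlv]) -> Tuple[Dict[int, bytes], List[Tlv]]:
    """Keep the first TLV of each TE-MESH-GROUP type; collect later duplicates.

    Returns (accepted, ignored): `accepted` maps a type code to the value that
    will be processed, `ignored` lists duplicates that are dropped silently.
    """
    accepted: Dict[int, bytes] = {}
    ignored: List[Tlv] = []
    for tlv_type, value in tlvs:
        if tlv_type not in MESH_TYPES:
            continue  # other TLVs are outside the scope of this sketch
        if tlv_type in accepted:
            ignored.append((tlv_type, value))  # duplicate: silently ignored
        else:
            accepted[tlv_type] = value
    return accepted, ignored


received = [
    (TE_MESH_GROUP_IPV4, b"v4-first"),
    (TE_MESH_GROUP_IPV6, b"v6-first"),
    (TE_MESH_GROUP_IPV4, b"v4-duplicate"),
]
accepted, ignored = select_mesh_group_tlvs(received)
print(accepted)  # one IPv4 and one IPv6 instance are kept
print(ignored)   # the second IPv4 instance is dropped
```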

>
>>> 4) Small terminology issue in section 5.1 it says: "Note that both
>>> operations can be performed in the context of a single refresh."
>>> This is not a refresh. It is a trigger/update. A better term for
>>> OSPF would be "LSA origination".
>>
>> OK fixed (I used the term "Update"), thanks.
>
> OK
>
>>> 5) Please state the applicability to OSPF v2 and/or v3. Note that
>>> the Router_Cap document covers both v2 and v3.
>>
>> Indeed, Thanks for the comments.  The OSPFv3 aspects have been
>> incorporated. Here is the new text:
>
> [SNIP]
> OK
>
>>> 6) The term "fairly static" at the end of section 5.1 is
>>> meaningless without some relative context.
>>> Presumably this relates to the number of times an LSR joins or leaves
>>> a mesh group over time.
>>> Is it intended to be relative to the IGP refresh period?
>>> Please clarify in an objective rather than a subjective way.
>>
>>
>> Right, this requires clarification. Here is the new text: Moreover,
>> TE mesh-group membership should not change frequently: each time an
>> LSR joins or leaves a new TE mesh-group.
>
> I could live with this, personally. We'll see whether we get any  
> more comments.
> I think the nub will be:
> 1. whether your "should not" can be "SHOULD NOT"
> 2. what does "frequently" mean?
> 3. what is there in this I-D to say that an LSR does not join/leave a
>   TE mesh-group very often?
>

Hopefully I clarified with the text above.

>> I guess that this is sufficiently explicit: it is a well-known fact
>> that LSRs are infrequently added or removed to a TE mesh.
>
> :-) Very well known. In fact, my mother was commenting on it to me  
> only the other day ;-)
>

ah so she should talk to my kids then ... we can work this out ;-)))

> Consider the case where PE membership of an automesh is dependent  
> on whether there are C-nodes subscribed to some service.
>
> Perhaps this well known fact could be noted in the Introduction to  
> this I-D which is AFAIK the only IETF document on the subject of  
> automesh.

OK, see the proposed text above and let me know what you think. I do
think that this is sufficient.

>
>>> 7) The security section (section 8) is inadequate and will
>>> undoubtedly be rejected by the security ADs. At the very least, the
>>> I-D needs a paragraph (i.e. more than one or two lines) explaining
>>> why there are no new security considerations. But what would be the
>>> impact of adding false mesh groups to a TLV? Is there anything
>>> (dangerous) that can be learned about the network by inspecting
>>> mesh group TLVs?
>>
>> The following section has been added:
>
> [SNIP]
> OK. Let's run with that and see how much we get beaten up by the  
> Security experts.

OK, thanks.

cheers,

JP.

>
> Cheers,
> Adrian