Re: [OSPF] TTZ, my 2c

Huaimo Chen <huaimo.chen@huawei.com> Thu, 18 July 2013 19:28 UTC

From: Huaimo Chen <huaimo.chen@huawei.com>
To: "A. Przygienda" <prz@mail.zeta2.ch>, "ospf@ietf.org" <ospf@ietf.org>
Date: Thu, 18 Jul 2013 19:28:08 +0000
Message-ID: <5316A0AB3C851246A7CA5758973207D4451E41CD@dfweml509-mbx.china.huawei.com>
References: <51DC4844.7050106@zeta2.ch> <5316A0AB3C851246A7CA5758973207D4451E2F4C@dfweml509-mbx.china.huawei.com> <51DC8395.5060703@zeta2.ch>
In-Reply-To: <51DC8395.5060703@zeta2.ch>
Subject: Re: [OSPF] TTZ, my 2c

Hi Tony,

    Thanks for your comments!
    My responses are inline below.

Best Regards,
Huaimo
-----Original Message-----
From: ospf-bounces@ietf.org [mailto:ospf-bounces@ietf.org] On Behalf Of A. Przygienda
Sent: Tuesday, July 09, 2013 5:42 PM
To: ospf@ietf.org
Subject: Re: [OSPF] TTZ, my 2c

On 09.07.2013 15:53, Huaimo Chen wrote:
> Hi Tony,
>
>      Thanks for your comments!
>      See my responses inline below.
>
> Best Regards,
> Huaimo
> -----Original Message-----
> From: ospf-bounces@ietf.org [mailto:ospf-bounces@ietf.org] On Behalf 
> Of A. Przygienda
> Sent: Tuesday, July 09, 2013 1:29 PM
> To: 'OSPF List'
> Subject: [OSPF] TTZ, my 2c
>
> Somehow the list killed my email. Another attempt. Kind of agree with 
> Hannes I guess
>
>
> So, looking at other stuff zipped over the TTZ drafts and it's an 
> intriguing idea but IMHO suffers from several heavy defects. Quick 
> incomplete list
>
>
> draft-chen-ospf-ttz-app-03.txt
>
> 1. Section 5.1 is bogus. A reasonable scalability comparison  should 
> not be between 1600 Rs flat OSPF and
>       10 TTZs but between  10 areas & 10 TTZs (which will come by my 
> gut feeling to about the same or rather
>       small difference)
>
> [Huaimo] The reason for comparing a network having one area with one having 10 TTZs is that the latter retains some attributes of a single area. A network having 10 areas does not have these attributes.
Please be more specific here. An area will summarize 1/10 of the routers into, ideally, one prefix, which is much smaller than a full mesh of TTZ routers (in the extreme case). I think a fair comparison should address all of this.
[Huaimo2] More specifically,
1) The network having 10 TTZs is still in one area. Every node still has a high-level view of the whole network topology and can see through the TTZs, that is, it sees the network topology beyond them. In a network with multiple areas, every node (except an ABR) has only the topology of its own area and cannot see the network topology beyond that area.
2) Transforming a network with one area into one with 10 TTZs is smooth and easy.
Transforming a network with one area into a network with 10 areas is very complex and may cause service interruptions. Dividing one area into multiple areas involves significant changes to the network architecture. Originally the network has only one area, which is the backbone area. This original backbone area is split into a new backbone area and a number of non-backbone areas. In general, each non-backbone area is connected to the new backbone area through the area border routers between them. There is no direct connection between any two non-backbone areas. Thus, some connections need to be added and some need to be removed. Each area border router summarizes the topology of its attached non-backbone area for transmission on the backbone area, and hence to all other area border routers. During the split of the network from one area into multiple areas, some routes will change.
There are no significant changes to the network architecture when an OSPF TTZ is applied directly to a group of routers and links in the network. We do not need to add any new connection to the network or remove any existing one. During the deployment of TTZs in the network, the existing routes remain stable. In addition, a group of routers in an area and the links connecting them can be made to work as a TTZ automatically within a route convergence time.
3) An MPLS TE LSP from a source to a destination can be set up in the same way as it is set up in one area.

> 2. Section 5.2.2.2 is probably misleading. The example completely 
> skirts the issue how a tunnel  can be
>      computed not only through the TTZ border routers but through 
> internals of TTZ. I assume that happens
>       by the 'virtual links' and that leads to good amount of problems 
> by itself since there is the need to setup
>       labels within the TTZ, issues with link- and node-disjointness 
> in case of protections and similar nit-picks.
>       All those issues are probably similar to the new 'segment-routing'
> drafts.
>
> [Huaimo] Can we focus on LSP setup first?  For LSP setup, do you see any issue in this section? For LSP protections, we can propose some solutions later.
Yes, an example of LSP setup including label distribution (without /32s ??? in TTZ )  would be nice.
[Huaimo2] The following is a more detailed example of setting up an MPLS TE LSP crossing a TTZ.
                   TTZ 600
                   \
                    \ ^~^~^~^~^~^~^~^~^~^~^~^~
      Source    51   (                        )
    ===[R15]========(==[R61]------------[R63]==)======[R29]===
        ||         (   |    \          /    |   )       ||
        ||         (   |     \        /     |   )       ||
        ||         (   |      \11    /      |   )       ||
        ||         (   |    ___\    /       |   )       ||
        ||         (   |   /   [R71]        |   )       ||
        ||         (   | [R73] /    \       |   )       ||
        ||         (   |      /      \      |   )       ||
        ||         (   |     /        \17   |   )       ||
        ||         (   |    /          \    |   )  71   ||
    ===[R17]========(==[R65]------------[R67]==)======[R31]===
         \\          (//                    \\)       //Destination
          ||         //v~v~v~v~v~v~v~v~v~v~v~\\      ||
          ||        //                        \\     ||
          ||       //                          \\    ||
           \\     //                            \\  //
       ======[R23]==============================[R25]=====
             //                                     \\
            //                                       \\

                       LSP from R15 to R31

On a source node, we can configure a TE LSP from the source to a destination crossing TTZs in the same way as we configure it without any TTZs.  This is because the source node is not aware of any TTZs.

   For example, on node R15 in the figure above, to set up a TE LSP from
   R15 to R31, we just configure the TE LSP by giving its source R15, its
   destination R31, and any constraints needed, such as bandwidth.

   The source node computes the path to the destination based on the
   configuration of the TE LSP.  It sees only a full-mesh connection
   of edge nodes for every TTZ.  Thus the path is computed in the
   same way as it is without any TTZ.  After the path is computed,
   the source node automatically starts to signal the LSP along the
   path, again in the same way as it does without any TTZ.

   For example, node R15 in the figure above computes the path to the
   destination R31.  It sees the full-mesh connection of the four TTZ
   edge nodes R61, R63, R65 and R67 in its topology.  It computes the
   path in the same way as before and may get the path: R15 - R61 -
   R67 - R31.  It then signals the TE LSP along this path by sending
   an RSVP-TE PATH message to R61.

   When R61, which is an edge node of a TTZ, receives the PATH message,
   it computes the path segment to the other edge node R67 (suppose
   that the path segment is R61 - R71 - R67) and continues to signal
   the TE LSP to R67 along the computed path segment.  It sends a PATH
   message to R71, which sends a PATH message to R67, which sends a PATH
   message to R31.
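
   The two-level path computation above can be sketched roughly as
   follows.  This is only an illustration, not text from the draft:
   the link metrics, the helper names (shortest_path,
   expand_ttz_segments) and the exact internal links of TTZ 600 are
   assumptions chosen to reproduce the example path.  For brevity the
   sketch expands the TTZ segment in one place, whereas in the
   procedure described above R61 does this when the PATH message
   arrives.

```python
from heapq import heappop, heappush


def shortest_path(graph, src, dst):
    """Plain Dijkstra over a dict-of-dicts graph; returns a node list or None."""
    queue = [(0, src, [src])]
    seen = set()
    while queue:
        cost, node, path = heappop(queue)
        if node == dst:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nbr, metric in graph.get(node, {}).items():
            if nbr not in seen:
                heappush(queue, (cost + metric, nbr, path + [nbr]))
    return None


# Topology as seen by routers outside the TTZ: the four edge nodes
# appear as a full mesh.  All metrics are assumed for illustration.
OUTSIDE = {
    "R15": {"R61": 1, "R17": 1},
    "R17": {"R15": 1, "R65": 1},
    "R61": {"R15": 1, "R63": 2, "R65": 2, "R67": 2},
    "R63": {"R61": 2, "R65": 2, "R67": 2, "R29": 1},
    "R65": {"R17": 1, "R61": 2, "R63": 2, "R67": 2},
    "R67": {"R61": 2, "R63": 2, "R65": 2, "R31": 1},
    "R29": {"R63": 1, "R31": 1},
    "R31": {"R67": 1, "R29": 1},
}

# Real topology inside the TTZ, known only to TTZ routers (assumed links).
INSIDE = {
    "R61": {"R63": 2, "R65": 2, "R71": 1},
    "R63": {"R61": 2, "R67": 2, "R71": 1},
    "R65": {"R61": 2, "R67": 2, "R71": 1},
    "R67": {"R63": 2, "R65": 2, "R71": 1},
    "R71": {"R61": 1, "R63": 1, "R65": 1, "R67": 1, "R73": 1},
    "R73": {"R71": 1},
}

TTZ_EDGES = {"R61", "R63", "R65", "R67"}


def expand_ttz_segments(path, edge_nodes, inside):
    """Replace each edge-to-edge hop with the real path through the TTZ."""
    full = [path[0]]
    for prev, nxt in zip(path, path[1:]):
        if prev in edge_nodes and nxt in edge_nodes:
            segment = shortest_path(inside, prev, nxt)
            if segment is None:
                raise ValueError(f"no internal path from {prev} to {nxt}")
            full.extend(segment[1:])
        else:
            full.append(nxt)
    return full


loose_path = shortest_path(OUTSIDE, "R15", "R31")
full_path = expand_ttz_segments(loose_path, TTZ_EDGES, INSIDE)
print(" - ".join(loose_path))
print(" - ".join(full_path))
```

   With these assumed metrics the source obtains R15 - R61 - R67 - R31
   and the expanded path is R15 - R61 - R71 - R67 - R31, matching the
   example.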

   When R31 receives the PATH message from R67, it allocates a label
   (e.g., 71), reserves the bandwidth as needed, and sends a RESV
   message with the label (71) to R67.  It sets the forwarding entry for
   the TE LSP using label 71 as inbound label.

   When R67 receives the RESV message from R31, it allocates a label
   (e.g., 17), and sends a RESV message with the label (17) to R71.  It
   also sets the cross connect for the TE LSP using labels 17 and 71 as
   inbound label and outbound label respectively.

   When R71 receives the RESV message with the label (17) from R67, it
   allocates a label (e.g., 11), and sends a RESV message with the label
   (11) to R61.  It sets the cross connect for the TE LSP using labels
   11 and 17 as inbound label and outbound label respectively.

   When R61 receives the RESV message with the label (11) from R71, it
   allocates a label (e.g., 51), and sends a RESV message with the label
   (51) to R15.  It sets the cross connect for the TE LSP using labels
   51 and 11 as inbound label and outbound label respectively.

   When R15 receives the RESV message with the label (51) from R61, it
   sets the forwarding entry for the TE LSP using label 51 as outbound
   label.  At this point, the setup of the TE LSP from R15 to R31 is
   complete.
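
   The label allocation described in the RESV steps above can be
   checked with a small simulation.  This is a minimal sketch under
   stated assumptions: the function name resv_walk and the data
   layout are hypothetical, and the labels are simply the example
   values used above (71, 17, 11, 51), not values that RSVP-TE
   mandates.

```python
def resv_walk(path, allocated):
    """Walk the RESV message upstream from the egress to the ingress.

    path      -- node list from ingress to egress
    allocated -- label each non-ingress node allocates and advertises
                 upstream in its RESV message
    Returns {node: (inbound_label, outbound_label)}.
    """
    entries = {}
    out_label = None  # the egress has no outbound label
    for node in reversed(path):
        # The ingress allocates no label; it only uses the one it receives.
        in_label = None if node == path[0] else allocated[node]
        entries[node] = (in_label, out_label)
        # The label advertised upstream becomes the upstream node's out label.
        out_label = in_label
    return entries


path = ["R15", "R61", "R71", "R67", "R31"]
allocated = {"R31": 71, "R67": 17, "R71": 11, "R61": 51}

entries = resv_walk(path, allocated)
for node in path:
    in_label, out_label = entries[node]
    print(f"{node}: in={in_label} out={out_label}")
```

   The resulting table has R15 with outbound label 51, cross connects
   51/11 on R61, 11/17 on R71 and 17/71 on R67, and R31 with inbound
   label 71, as in the steps above.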

>
> 3. 5.2.2.3 basically assumes that there is no reachability needed 
> within the TTZ (i.e. prefixes). Yes, then ABR
>       summaries are not needed but how do you manage something like 
> the routers in TTZ (loopback addresses of the routers) ?
>       It's one thing to intentionally hide-the-transits and another to 
> not have a possibility to reach the routers.
>
> [Huaimo] loopback address distribution will be addressed.
ok
>
> 4. I see how section 5.5 (POP) or 5.7 will work. That may be a possible 
> application and simpler than NSSA. However,
>       customer prefixes must bubble up somehow unless they are 
> configured only on a single TTZ pointing towards
>       core ?
>
> [Huaimo] This will be considered.
ok
>
>
> draft-chen-ospf-ttz-05.txt
>
>
> In general, all the claims to 'simpler than area' seem to me based on 
> the fact that
>
> a) all the things that an area needs are ignored here . no indication 
> what happens on partitioning of TTZs(should be simple)
>
> [Huaimo] This will be addressed.
ok
>
> . no indication what happens when TTZs find themselves in 2 different 
> areas ? Do the TTZ borders start to act like border routers ? Saying 
> 'they MUST not' is IMHO not good enough for deployable protocol, mode 
> of failure must be described & debugging procedure. Best would be e.g. 
> if TTZ borders check for Area ID on router LSAs (which would need to 
> be carried in a new opaque or something like this
> ?) and make sure TTZ is in a single
> area.
>
> [Huaimo] Initially, we limit a TTZ to one area. You are right. Some checks need to be done for this.
ok
>
> . I assume that routing integrity (hop-by-hop) is guaranteed, by 
> external routers computing 'routes using the "virtual TTZ links"'. 
> This IMHO will be a major scalability problem (and routing loops) once 
> the TTZs grow since the number of virtual links grows about N^2/2 of 
> edges. This will quickly lead to TONS of virtual links which will need 
> to be distributed & lead to transit routing loops since those 'virtual 
> links' are topology summaries and will not reflect the 'real 
> topology', i.e to make sure you don't have routing loops the TTZ 
> border routers can only use the tunnels when they have been 
> redistributed to everyone i.e. after everyone has the same topology 
> (how you do that?).
> Otherwise they must use the 'previous' topology before the 
> summarization into virtual links. Yes, those are transient loops but 
> having 10 TTZ borders will generate already
> 50 tunnel summaries and they take time to distribute.
>
> [Huaimo] The number of edge nodes of a TTZ should be small.
_should_ be small is normally not sufficient for a working protocol in the wild. I fail to see how you want to guarantee that if someone simply connects a router in the area to a TTZ router in the same area. They will form an adjacency by default, right? So you don't even have something like area-id protection against forming adjacencies on misconfigurations (as in an accidental 'tons-of-TTZ-routers' misconfig).
[Huaimo2] Some checks will be added for this.
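
For reference, the growth in virtual links discussed above is easy to tabulate. The exact count for an undirected full mesh of N edge nodes is N(N-1)/2, which the N^2/2 approximation rounds up to about 50 for 10 edge nodes. The following is only an illustrative sketch; the function name full_mesh_links is hypothetical, and the count would double if each direction were advertised as a separate link.

```python
def full_mesh_links(edge_nodes: int) -> int:
    """Number of undirected virtual links in a full mesh of TTZ edge nodes."""
    return edge_nodes * (edge_nodes - 1) // 2


# 4 is the number of edge nodes in the TTZ 600 example; the other sizes
# are arbitrary values chosen to show the quadratic growth.
for n in (4, 10, 20, 50):
    print(f"{n:3d} edge nodes -> {full_mesh_links(n):5d} virtual links")
```

A TTZ with four edge nodes, as in the example, needs 6 virtual links, while 10 edge nodes need 45 and 50 edge nodes need 1225.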

>
> . no indication how internal reachability within TTZ is guaranteed & 
> summarized and how it is guaranteed that e.g.
> prefixes are not being reimported as type 5 or 7 into the TTZ again 
> through backdoor mechanisms (e.g. TTZ borders in two different areas) 
> in case TTZ ends up summarizing.
>
> [Huaimo] Can you give more details about this?
Well, if you start to slosh 160 /32s out of the area, pretty soon you will be tempted to summarize as well (as in a standard OSPF area). Then you run into type 5/7 problems (as OSPF area summaries do) and you'll have to re-invent or re-use all those mechanisms again.
[Huaimo2] It seems that this should be handled by ABRs since a TTZ is in one area. 

>
> b) I don't understand 8.3. What is "a TTZ is virtualized as a group of 
> edge routers of the TTZ connected". Section 7 implies that TTZ edges 
> are advertised as full mesh of 'virtualized links'. Contradiction ?
>
> [Huaimo] A full mesh of connections among the edges is one type of edge connectivity. The sentence above indicates that a TTZ may use any such type of connectivity, including a full mesh.
Then this will become nothing but the 'abstract node representation' in PNNI that you're re-inventing here (hub-spokes, weighted centers of graphs, tangential shortcuts & all the other animals of the zoo). This was a famous rathole which I suggest revisiting, despite ATM not being the hottest technology (anymore ;-))
>
> So, my gut feeling is after everything has been solved this will be 
> more complex or equal to areas in terms of configuration, behavior, 
> constraints for a questionable gain of having a full-mesh-TTZ construct 
> versus a prefix-summary ABR construct.
>
> [Huaimo] Configuring a TTZ will be simpler than configuring an area. We are working on it.
> The behavior should also be simpler.
As I said, and this was the original gist of my mail, after you're done with all the problems, I would be highly surprised if this proves to be the case. But that's just my personal rusty 2c

Sorry to be tough but tough love is best love for internet drafts often '-)
[Huaimo2] Questions and discussions help to improve the draft a lot. 

-- tony


