Re: [mpls] [PWE3] WG Last Call for draft-ietf-pwe3-endpoint-fast-protection-01 - RFC4447

Alexander Vainshtein <Alexander.Vainshtein@ecitele.com> Thu, 07 August 2014 13:52 UTC

From: Alexander Vainshtein <Alexander.Vainshtein@ecitele.com>
To: "erosen@cisco.com" <erosen@cisco.com>
Date: Thu, 07 Aug 2014 13:52:26 +0000
Message-ID: <3be1532ff6644212b98eed535fccce9b@AM3PR03MB612.eurprd03.prod.outlook.com>
References: Your message of Tue, 05 Aug 2014 13:59:49 -0000. <9696d0db139d46ffaad7be11340215e8@AM3PR03MB612.eurprd03.prod.outlook.com> <16167.1407340459@erosen-lnx>
In-Reply-To: <16167.1407340459@erosen-lnx>
Archived-At: http://mailarchive.ietf.org/arch/msg/mpls/rsPYlN9Gy5dhsN92-Rl2KN04xO4
Cc: "mpls@ietf.org" <mpls@ietf.org>, pwe3 <pwe3@ietf.org>, "pwe3-chairs@tools.ietf.org" <pwe3-chairs@tools.ietf.org>
Subject: Re: [mpls] [PWE3] WG Last Call for draft-ietf-pwe3-endpoint-fast-protection-01 - RFC4447

Eric,
Many thanks for a very detailed and clear response. From my point of view, it is far more readable than the original text in the draft. It has really helped me to understand the authors' intention and some new potential pitfalls with this draft.

I agree with most of your comments about fundamental differences between protection against the failure of the Primary PE and protection against the "egress AC failure".  

I have some LDP-related questions that stem directly from your responses (I have added Loa to this thread). You have explained (and the draft also says that!) that the "context identifier" IP addresses must be known to both the primary PE and the Protector, with LDP assigning labels to them and distributing the corresponding Label Mappings.

1. I assume that this means that both the Primary PE and the Protector "own" this IP address, and, as a consequence, include it in the LDP Address messages they send to their respective adjacencies. Is this correct? 
2. If the answer to the previous question is "Yes", what is supposed to happen if the Primary PE and the Protector have a common adjacent P router?
    (This would be the case shown in Figure 5 in the draft if P3 were physically connected both to PE2 and PE4).
     Such a router would see the same IP address in the LDP Address messages received from two different adjacencies.
     Can LDP handle such a situation? Is it covered by RFC 5036?
3. If the answer to my first question is "No", what would cause LDP in the Primary PE and the Protector to allocate a terminated label (i.e., 
     a label with action "Pop" in the ILM) to the FEC represented by the context identifier IP address? And how would the PLR understand that it is the penultimate hop towards the Primary PE?
     
I also have some doubts regarding the ability to tweak the metrics in the network in such a way that, as you've explained:
- As long as the Primary PE is Up, the PLR sees it as the best Next Hop for the "context identifier" IP address
- Once the Primary PE is Down, the PLR sees the Protector as the best Next Hop for the "context identifier" IP address.

These doubts are based on the following concerns:

1. Potential impact of ECMP. 
    ===================
    Suppose that ECMP is employed in the network, and that, as a consequence, there are multiple penultimate Next Hops 
    on multiple equal-cost LSPs between PE1 and the Primary PE2. Is it possible, just by playing with the metrics, to guarantee that:
    - As long as the Primary PE2 is Up, each potential PLR would see it as the best Next Hop towards the "context identifier"
    - Once PE2 is Down, each potential PLR (which would only know that its own link to the Primary PE has failed) would treat the Protector as its best Next Hop
     towards the "context identifier"?

2. Potential impact of changes in the network topology. 
    ========================================
    Suppose that the metrics have been initially tweaked as required. Then some links between the
    PLR and the Protector fail, so that the cost of the available path between them increases. Now, when the Primary PE goes Down, the PLR only sees that
    its link to the Primary PE goes down, but its best remaining path towards the "context identifier" could now still lead to the Primary PE rather than to the Protector.
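
To illustrate the second concern, here is a minimal Python sketch. It assumes that the PLR picks its next hop by running a shortest-path computation on its local view of the topology, with only its own link to PE2 removed. The routers P1 and P2, all link costs, and the function names are hypothetical; "A" stands for the "context identifier" address advertised by both PE2 and PE3.

```python
import heapq

def shortest_path(links, src, dst):
    """Dijkstra over undirected weighted links; returns (cost, path) or None."""
    adj = {}
    for (a, b), cost in links.items():
        adj.setdefault(a, {})[b] = cost
        adj.setdefault(b, {})[a] = cost
    heap = [(0, [src])]
    done = set()
    while heap:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node == dst:
            return cost, path
        if node in done:
            continue
        done.add(node)
        for nbr, w in adj.get(node, {}).items():
            if nbr not in done:
                heapq.heappush(heap, (cost + w, path + [nbr]))
    return None

def without_link(links, a, b):
    """Copy of the topology with the a-b link removed."""
    return {k: v for k, v in links.items() if set(k) != {a, b}}

# Hypothetical topology; PE2 advertises A with a lower metric than PE3.
links = {
    ("PLR", "PE2"): 1, ("PE2", "A"): 1,
    ("PLR", "PE3"): 1, ("PE3", "A"): 3,
    ("PLR", "P1"): 5, ("P1", "PE2"): 1,
    ("PLR", "P2"): 5, ("P2", "PE3"): 5,
}

# PE2 is Up: the PLR reaches A via PE2.
normal = shortest_path(links, "PLR", "A")

# PE2 fails; the PLR only knows that its own link to PE2 is gone.
after_pe2_failure = shortest_path(
    without_link(links, "PLR", "PE2"), "PLR", "A")

# Same failure, but the direct PLR-PE3 link had already failed earlier.
degraded = without_link(links, "PLR", "PE3")
after_topology_change = shortest_path(
    without_link(degraded, "PLR", "PE2"), "PLR", "A")

print(normal)
print(after_pe2_failure)
print(after_topology_change)
```

With these made-up costs, the first two results go via PE2 (cost 2) and then via PE3 (cost 4), as intended. After the earlier PLR-PE3 link failure, however, the PLR's best remaining path is PLR, P1, PE2 (cost 7), i.e., it still points at the failed Primary PE. The same check could be repeated for every potential PLR in the ECMP case.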

Due to these concerns, my personal bottom line is that operating the proposed mechanism over an MPLS network with LSPs instantiated using LDP in DU mode is sometimes possible but always tricky.  

Regards, and, again, lots of thanks,
       Sasha 

Email: Alexander.Vainshtein@ecitele.com
Mobile: 054-9266302

> -----Original Message-----
> From: Eric Rosen [mailto:erosen@cisco.com]
> Sent: Wednesday, August 06, 2014 6:54 PM
> To: Alexander Vainshtein
> Cc: stbryant@cisco.com; mpls@ietf.org; pwe3; pwe3-chairs@tools.ietf.org
> Subject: Re: [mpls] [PWE3] WG Last Call for draft-ietf-pwe3-endpoint-fast-
> protection-01 - RFC4447
> 
> I've looked at this draft, and it seems to me that it does work in LDP-based
> networks (even with DU mode), and that it does not violate any architectural
> principles.  However, I have made certain assumptions that aren't explicitly
> stated in the draft.  (I am assuming that the draft is intended to be patterned
> after the "anycast BGP" scheme that I've heard about, but that is not
> written up anywhere as far as I know.)  Let me say how I think it is intended
> to work, and the authors can correct me if I'm wrong.
> 
> For protection against the failure of the "primary" egress PE:
> 
> - A "primary egress PE" partitions its set of PWs into n sets, where each
>   set is associated with a given "protector".  The "protector" is the backup
>   egress PE for those PWs.
> 
> - Each such set is associated with an IP address.  This IP address must be
>   known both to the primary PE and to the protector.
> 
>   * One can think of this address as an "anycast" address that, to routing,
>     denotes both the primary PE and the protector.  At the primary PE and
>     the protector, the address denotes a particular set of PWs.
> 
>   * These anycast addresses are distributed by routing, and good old
>     fashioned MPLS labels get assigned to them by LDP.  The primary PE MAY
>     bind implicit null to that address, but the protector MUST NOT bind
>     implicit null to that address.
> 
>   * Routing is set up so that (a) if the primary PE is up, traffic to the
>     anycast address always goes to the primary PE, but (b) if the primary PE
>     is down, traffic to the anycast address goes to the protector instead.
>     This can be achieved by, e.g., messing with the metrics.
> 
> - For a given PW, the primary PE informs both the protector and the ingress
>   PE of (a) the label that it (the primary PE) has assigned to that PW, and
>   (b) the "anycast address" that corresponds to that PW.
> 
>   This requires some new signaling (beyond that specified in RFC 4447), but
>   the requisite signaling is specified in the draft.
> 
> - When sending a packet on the given PW, the ingress PE pushes on the label
>   assigned to the PW by the egress PE.  Then the ingress PE pushes on
>   whatever LDP-assigned label it needs to use to get the packet to the
>   anycast address that is associated with the PW.
> 
> - When the protector learns that the egress PE has assigned a given label
>   and anycast address to the PW, it stores that label in a context-specific
>   label space associated with that anycast address.  Whenever the protector
>   receives a packet whose top label is the label that the protector has
>   bound to the given anycast address, the protector looks up the next label
>   in the corresponding label space.
> 
> Let's look at an example:
> 
> CE1---PE1----PLR----PE2----CE2
>                |             |
>                |-----PE3-----|
> 
> where PE3 is the protector for a PW connecting CE1 to CE2 (via PE1 and PE2
> respectively).   Call this pseudowire PW1.
> 
> Thus there is some IP address, call it A, that PE2 and PE3 both advertise to
> routing.  The metrics are set up so that as long as PE2 is up, it appears to be a
> better path to A than does PE3.
> 
> So PE2 assigns label L1 to PW1, and uses T-LDP to inform both PE1 and PE3 of
> this mapping.  To PE1, these are just outgoing labels.  To PE3, however, they
> are incoming labels in a context-specific label space.
> 
> PE2 and PE3 both assign labels to A (let's call these labels L2 and L3
> respectively), and distribute the corresponding label bindings to PLR.  PLR
> also assigns a label to A, and distributes that label binding (L4) to PE1.
> 
> When PE1 has a packet to send on PW1, it pushes L1, then pushes L4, then
> sends the packet to PLR.  PLR swaps L4 with L2 and sends the packet to PE2.
> (Note that L2 could be implicit NULL.)  PE2 sees label L1, and sends the packet
> to CE2.
> 
> Now if PE2 goes down, things happen a bit differently.  When PLR gets a
> packet with L4 on top, it swaps L4 with L3, and sends the packet to PE3.  (In
> this case, of course, L3 cannot be implicit NULL.)
> 
> PE3 gets the packet, sees L3 on top, looks it up, and then says "L3 is a context
> label, so I have to pop it and then look up the following label in the L3-specific
> label table."  This context-specific label table contains L1, which has been
> bound to the PW that leads to CE2.  So PE3 pops the label and sends the
> packet to CE2.
> 
> I think that's basically it, and the necessary mechanisms do seem to be
> described in the draft.  The draft doesn't actually use the notion of "anycast
> address" explicitly, but I think the intention is that the "context labels" are
> definitely bound to anycast addresses.  I hope the authors will correct me if
> I'm wrong.
> 
> Although the above example shows a PLR that is a neighbor of both PE2 and
> PE3, this isn't really necessary; one could easily construct an example where
> PE2 and PE3 have no neighbors in common.
> 
> The example above doesn't rely on LDP DoD at all, and doesn't make use of
> any RSVP-TE backup tunnels.
> 
> However, when protecting against the failure of the egress AC, packets
> would go first to PE2 then to PE3.  This scenario is a bit different, in that:
> 
> - I think this scenario may require an RSVP-TE backup tunnel (on which PHP
>   is not used) from PE2 to PE3.
> 
> - PE2 would have to signal PE3 to associate the given PW with the given
>   backup tunnel.
> 
> - The label that PE3 assigns to that backup tunnel would identify the
>   context in which the PW label is looked up.
> 
> This is perhaps more detail than is actually in the draft ;-) but if it is an
> accurate description of the authors' intentions, the scheme seems to work.
> 
> I don't think this modifies the MPLS architecture at all.  It is true that PE3
> needs to maintain a context-specific label table that contains labels assigned
> by PE2.  While one may say that that is a "coordinated label space", it is no
> more than what is required by the nature of "upstream-assigned labels".
> I.e., if you can't support this level of coordination, you just can't support
> upstream-assigned labels.
> 
> Also, I don't think this draft is an "update" to RFC 4447.  You don't need to
> read this draft to implement RFC 4447, nor does it modify the procedures of
> RFC 4447.  It just introduces some new optional signaling and procedures.
> I don't think RFC 4447's requirement to use the "platform label space"
> should be interpreted as prohibiting the use of upstream-assigned labels.
> The intention of RFC 4447 was to prohibit the use of the interface-specific
> label space for PW labels.  This prohibition was necessary because the ingress
> PE doesn't know which of the egress PE's network-facing interfaces will
> actually receive a given packet.  This draft doesn't have that problem, as it
> provides clear procedures for determining the context in which the
> upstream-assigned labels are looked up.
> 
>
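
As a reading aid, the label operations in the quoted example (PE1, PLR, PE2, PE3 and labels L1 to L4) can be sketched in Python as follows. This is only an illustration under the assumptions of that example: the function names, the table names and the string "PW1 -> CE2" are made up, and labels are modeled as plain strings rather than real label values.

```python
# L1 = PW label assigned by PE2; L2 / L3 = labels that PE2 / PE3 bound to
# the anycast address A; L4 = the label the PLR bound to A.

def pe1_send():
    """Ingress PE: push the PW label, then the transport label towards A."""
    return ["L4", "L1"]              # top of stack first

def plr_forward(stack, pe2_up):
    """PLR: swap L4 for the label of whichever egress it currently uses."""
    assert stack[0] == "L4"
    if pe2_up:
        return "PE2", ["L2"] + stack[1:]
    return "PE3", ["L3"] + stack[1:]

# PE2 looks up PW labels in its own platform label space.
PE2_PLATFORM = {"L1": "PW1 -> CE2"}

# PE3 keeps one context-specific table per context label; its entries are
# the labels that PE2 assigned, learned via the signaling in the draft.
PE3_CONTEXT = {"L3": {"L1": "PW1 -> CE2"}}

def pe2_receive(stack):
    """Primary PE: pop its label for A, look up the PW label as usual."""
    assert stack[0] == "L2"          # absent if L2 were implicit NULL
    return PE2_PLATFORM[stack[1]]

def pe3_receive(stack):
    """Protector: the top label selects the table for the next label."""
    table = PE3_CONTEXT[stack[0]]    # pop context label, pick its table
    return table[stack[1]]           # look up PE2's PW label in that table

def deliver(pe2_up):
    """Follow one packet from PE1 to the egress PE that handles it."""
    node, stack = plr_forward(pe1_send(), pe2_up)
    action = pe2_receive(stack) if node == "PE2" else pe3_receive(stack)
    return node, action

print(deliver(True))
print(deliver(False))
```

In both cases the packet ends up on the PW towards CE2; the only difference is which egress PE handles it and in which label space L1 is looked up.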