Re: [mpls] I-D Action: draft-ietf-mpls-egress-protection-framework-02.txt

James Bensley <jwbensley@gmail.com> Mon, 23 July 2018 20:53 UTC

In-Reply-To: <153203662994.10669.14809491534972596358@ietfa.amsl.com>
References: <153203662994.10669.14809491534972596358@ietfa.amsl.com>
From: James Bensley <jwbensley@gmail.com>
Date: Mon, 23 Jul 2018 21:52:51 +0100
Message-ID: <CAAWx_pXvb_kmZosSYG3rLJzUcMToYH08kA-eahTFntV3xA-yrw@mail.gmail.com>
To: mpls@ietf.org
Cc: i-d-announce@ietf.org
Content-Type: text/plain; charset="UTF-8"
Archived-At: <https://mailarchive.ietf.org/arch/msg/mpls/jCGk2b9o51pMN1I2vvoVuI9V3Gw>
Subject: Re: [mpls] I-D Action: draft-ietf-mpls-egress-protection-framework-02.txt

Hi All,

I hope this is the correct place to ask these questions. I'm new to
the IETF WG processes, so please guide me if you can. I've read the
draft and have some feedback/queries.

[QUERY1]
In section 1. Introduction:
"Local repair refers to the scenario where the router upstream to an
anticipated failure (aka. PLR, i.e. point of local repair)
pre-establishes a bypass tunnel to the router downstream of the
failure (aka. MP, i.e. merge point), pre-installs the forwarding state
of the bypass tunnel in the data plane, and uses a rapid mechanism
(e.g. link layer OAM, BFD, and others) to locally detect the failure
in the data plane. When the failure occurs, the PLR reroutes traffic
through the bypass tunnel to the MP, allowing the traffic to continue
to flow to the tunnel's egress router."
...
"This framework requires that the destination (a CE or site) of a
service MUST be dual-homed or have dual paths to an MPLS network, via
two MPLS edge routers."

The first text quoted above seems generic enough that it can apply to
either an egress node failure or an egress link failure. With egress
node failure, the requirement for a merge point and a fast failure
detection mechanism equates to a backup egress node and/or protector
node (depending on whether centralized or co-located mode is being
used) plus OAM, BFD, LoS, etc. In the case of an egress link failure,
though, as per the second text quoted above, the CE must be dual-homed
to the network, and the PLR is then the egress node. The first text
quoted above indicates that the PLR must have a rapid failure
detection mechanism, but it does not stipulate that the CE must also
detect the failure as quickly as the egress node.

It does say under section 6. Egress link protection:
"However, protection for ingress link failure SHOULD be provided by a
separate mechanism, and hence is out of the scope of this document."

However, there are two problems that I can see:

1. If the CE hasn't detected that the PE-CE link is down as fast as the
PE, traffic from the MPLS network towards the CE will be rerouted via
the backup path (the backup PE-CE link via the backup egress node), but
the return traffic from the CE to the MPLS network will still be sent
over the failed PE-CE link towards the egress node rather than the
backup egress node. That traffic will be black-holed until the CE
realises its primary link is down.

2. If the CE uses uRPF, it should drop the traffic arriving over the
PE-CE link from the backup egress node until it detects that the PE-CE
link to the egress node is down.

Both of these scenarios assume that the PE-CE links are
active/passive (or primary/secondary, depending on your terminology):
only one link is in use, and the other is purely a backup for when the
first is down. However, active/passive links to dual-homed CEs are
extremely common, and the fast reroute mechanism offered by this draft
can be completely undermined by the two problems I have outlined
above.
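
To put rough numbers on problem 1, here is a minimal sketch. The
function name and the detection times are hypothetical and chosen only
for illustration; none of them come from the draft.

```python
def outage_ms(pe_detect_ms: int, ce_detect_ms: int) -> dict:
    """Per-direction outage after a PE-CE link failure (illustrative model).

    Assumes each direction is restored as soon as the sending side
    detects the failure and switches to its backup path.
    """
    return {
        # Network-to-CE traffic is repaired when the PE (the PLR) detects the failure.
        "to_ce": pe_detect_ms,
        # CE-to-network traffic is black-holed until the CE detects the failure.
        "from_ce": ce_detect_ms,
        # Time during which only one direction has been repaired.
        "one_way_only": max(0, ce_detect_ms - pe_detect_ms),
    }


# Hypothetical values: the PE detects in 50 ms, the CE takes 3 seconds.
slow_ce = outage_ms(50, 3000)
# Hypothetical values: both ends detect in 50 ms.
fast_ce = outage_ms(50, 50)
print(slow_ce)
print(fast_ce)
```

Under these assumptions the end-to-end outage is set by the slower of
the two detection times, regardless of how fast the PE reacts.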

To prevent the draft being undermined like this:

- Opt1. I think it is worth at least mentioning the issues that can
arise if the CE doesn't detect the failure of the PE-CE link as
quickly as the PE.
- Opt2. Alternatively, I think it is worth mentioning something like
"the CE SHOULD detect the PE-CE link down as quickly as the PE" - a
statement which is fairly vague but prevents the entire
feature/technology in the draft being undermined when it is correctly
implemented.
- Opt3. Another alternative is to state that "the PE-CE link SHOULD
run a fast failure detection mechanism (the exact choice is outside
the scope of this document)", recommending that the implementer use a
fast failure detection mechanism between the PE and CE. As far as I
can see this is not clearly called out anywhere; fast detection is
only discussed within the MPLS core, between PE or P nodes.
- Opt4. Something else?

Further to this, should the draft make a statement about the use of
all-active or active/passive links? In the case of all-active links
(e.g. ECMP) the uRPF issue no longer exists.
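
The uRPF point can be sketched as a simplified strict-mode check,
assuming the CE accepts a packet only if it arrives on an interface
that the CE itself would use to reach the packet's source. The
interface names and the helper below are hypothetical.

```python
def strict_urpf_accepts(arrival_if: str, next_hop_ifs_to_source: set) -> bool:
    """Simplified strict-mode uRPF: accept only if the packet arrived on an
    interface the router would itself use to reach the packet's source."""
    return arrival_if in next_hop_ifs_to_source


# Active/passive: until the CE detects the failure, its route back to the
# source still points only at the primary (failed) link.
active_passive = strict_urpf_accepts("to_backup_pe", {"to_egress_pe"})

# All-active (e.g. ECMP): both uplinks are valid next hops towards the source.
all_active = strict_urpf_accepts("to_backup_pe", {"to_egress_pe", "to_backup_pe"})

print(active_passive)  # False: rerouted traffic is dropped
print(all_active)      # True: rerouted traffic is accepted
```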


[QUERY2]
From section 8.2. Egress link protection:
"When PE3 detects a failure of the egress link, it will invoke the
above bypass nexthop to reroute VPN service packets."

Is that a typo? Should it be PE2?


[QUERY3]
From section 8. Example: Layer-3 VPN egress protection:
"The nexthops of these routes MUST be based on PE3's connectivity with
site 2, even if this connectivity is not the best path in PE3's VRF
due to metrics (e.g. MED, local preference, etc.), and MUST NOT use
any path traversing PE2."

PE3 is never going to advertise its (presumably) less preferred path
to site 2's prefixes towards PE2, because it is receiving more
preferred paths from PE2. The draft doesn't clearly say whether this
behaviour will be overridden ONLY for MPLS VPNs that have this egress
protection mechanism explicitly configured, or whether it relies on
the PE already having something like "BGP Advertise Best-External" (in
Cisco speak), or what I think is called "Provider Edge Link
Protection" (in Junos speak), enabled. Should this be clearly stated
in the draft?
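
The advertisement behaviour in question can be sketched with a
deliberately simplified model: best path chosen on local preference
only, and a single prefix. The class, function names and preference
values are hypothetical and do not describe any vendor's
implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Path:
    via: str         # neighbour the path was learned from
    learned: str     # "ebgp" (from the CE) or "ibgp" (from another PE)
    local_pref: int  # higher wins in this simplified model


def best(paths: list) -> Optional[Path]:
    """Pick the path with the highest local preference, if any."""
    return max(paths, key=lambda p: p.local_pref, default=None)


def advertised_to_ibgp(paths: list, best_external: bool = False) -> Optional[Path]:
    """Path a PE advertises to its iBGP peers for one prefix."""
    overall = best(paths)
    # Default behaviour: only the overall best path is advertised, and an
    # iBGP-learned best path is not re-advertised to other iBGP peers.
    if overall is not None and overall.learned == "ebgp":
        return overall
    # With a best-external style feature, the best eBGP-learned path is
    # advertised even though it is not the overall best.
    if best_external:
        return best([p for p in paths if p.learned == "ebgp"])
    return None


# PE3's view of a site 2 prefix (hypothetical preference values).
pe3_paths = [Path("PE2", "ibgp", 200), Path("CE2", "ebgp", 100)]

default_adv = advertised_to_ibgp(pe3_paths)
best_ext_adv = advertised_to_ibgp(pe3_paths, best_external=True)
print(default_adv)   # None: PE2 never learns PE3's path to site 2
print(best_ext_adv)  # PE3's own path via CE2
```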


Kind regards,
James.