Re: [bess] Shepherd's review of draft-ietf-bess-nsh-bgp-control-plane-06

<stephane.litkowski@orange.com> Thu, 06 June 2019 11:45 UTC

From: stephane.litkowski@orange.com
To: "adrian@olddog.co.uk" <adrian@olddog.co.uk>, "draft-ietf-bess-nsh-bgp-control-plane@ietf.org" <draft-ietf-bess-nsh-bgp-control-plane@ietf.org>
CC: "bess@ietf.org" <bess@ietf.org>
Date: Thu, 06 Jun 2019 11:45:22 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/bess/jk7RjU0vhktU5p1GGVAKy27BFbM>

Hi Adrian,

I'm not comfortable with timing out control-plane information.
How do you handle the situation where a Hop TLV contains an unknown SFIR-RD for a valid reason (the SF is down for a while), so you time out the SFPR, and the SF is eventually restored and comes back online? In that case you would have to readvertise the SFPR from the source. I think overload can be managed as usual, by limiting the number of routes that a device may import (per VRF context or globally). If there is unnecessary control-plane information, housekeeping is the role of the operator, not of the protocol or the implementation.
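
To make the import-limit alternative concrete, here is a minimal sketch in Python. It assumes a simple counter per VRF plus a global counter; the class name `ImportLimiter`, its parameters, and the VRF names are hypothetical and are not taken from the draft or from any implementation.

```python
from collections import defaultdict


class ImportLimiter:
    """Caps imported routes per VRF and globally (hypothetical helper)."""

    def __init__(self, per_vrf_limit, global_limit):
        self.per_vrf_limit = per_vrf_limit
        self.global_limit = global_limit
        self.per_vrf_count = defaultdict(int)
        self.total = 0

    def try_import(self, vrf):
        """Count the route and return True only if both limits allow it."""
        if self.total >= self.global_limit:
            return False
        if self.per_vrf_count[vrf] >= self.per_vrf_limit:
            return False
        self.per_vrf_count[vrf] += 1
        self.total += 1
        return True

    def withdraw(self, vrf):
        """Release one slot when a previously imported route is withdrawn."""
        if self.per_vrf_count[vrf] > 0:
            self.per_vrf_count[vrf] -= 1
            self.total -= 1


limiter = ImportLimiter(per_vrf_limit=2, global_limit=3)
vrf_a_results = [limiter.try_import("vrf-a") for _ in range(3)]
vrf_b_results = [limiter.try_import("vrf-b") for _ in range(2)]
```

With these limits, the third import into "vrf-a" is refused by the per-VRF cap and the second import into "vrf-b" is refused by the global cap, so stored state stays bounded without any time-out.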

For the flowspec part, I'm fine, but we need agreement from the IDR folks, who are the experts on this topic.

Brgds,

Stephane


-----Original Message-----
From: Adrian Farrel [mailto:adrian@olddog.co.uk] 
Sent: Thursday, May 30, 2019 15:45
To: LITKOWSKI Stephane OBS/OINIS; draft-ietf-bess-nsh-bgp-control-plane@ietf.org
Cc: bess@ietf.org
Subject: RE: Shepherd's review of draft-ietf-bess-nsh-bgp-control-plane-06

Hi Stephane,

Thanks again for the thoroughness of your review and the time it has taken to herd the necessary cats.

* BGP ERROR HANDLING:
>>>> I don’t see the “error handling” behavior associated with this attribute
>>>> (discard, treat-as-withdraw…)
>>>
>>> I think the errors are covered by section 6 of RFC 4271, but we need to
>>> point to it.
>>>
>>> [SLI] You have added " Malformed SFP attributes, or those that in error in
>>> some way, MUST be handled as described in Section 6 of [RFC4271]"
>>> This is not enough, as RFC7606 allows for more "graceful" handling of
>>> errors, and it's up to each new attribute to define its own behavior in
>>> terms of error processing. RFC7606 has some guidelines.
>>
>> This one will take a little more time to work up some text.
>> We'll get back to you.
>>
>> [SLI2] This is an important thing to address, and the IESG or Routing Directorate
>> may catch that as well.
>
> OK, a chunk more text. Does this work for you?
>
>   Section 6 of [RFC4271] describes the handling of malformed BGP
>   attributes, or those that are in error in some way.  [RFC7606]
>   revises BGP error handling specifically for the UPDATE message,
>   provides guidelines for the authors of documents defining new
>   attributes, and revises the error handling procedures for a number of
>   existing attributes.  This document introduces the SFP attribute and
>   so defines error handling as follows:
>
>   o  When parsing a message, an unknown Attribute Type code or a length
>      that suggests that the attribute is longer than the remaining
>      message is treated as a malformed message and the "treat-as-
>      withdraw" approach used as per [RFC7606].
>
>   o  When parsing a message that contains an SFP attribute, the
>      following cases constitute errors:
>
>      1.  Optional bit is set to 0 in SFP attribute.
>
>      2.  Transitive bit is set to 0 in SFP attribute.
>
>      3.  Unknown TLV type field found in SFP attribute.
>
>      4.  TLV length that suggests the TLV extends beyond the end of the
>          SFP attribute.
>
>      5.  Association TLV contains an unknown SFPR-RD.
>
> [SLI] It's a bit odd to find this here, as the Association TLV hasn't been introduced yet.
> Wouldn't it be better to add a dedicated Error Handling section (e.g. 3.2.3) after all the encodings?

I think we introduced it in the text at the top of the page (i.e. a few paragraphs earlier). It reads OK to me and it is better to group together the format handling issues in one place and with the text that describes the presence rules.

>      6.  No Hop TLV found in the SFP attribute.
>
>      7.  No SFT TLV found in a Hop TLV.
>
> [SLI] That's strange, as section 3.2.1.3 says that the SFT TLV MAY be included, so optional...

Ah, this should read "No sub-TLV found in a Hop TLV".
Per 3.2.1.2 "At least one sub-TLV MUST be present."

>      8.  Unknown SFIR-RD found in a Hop TLV.
>
>   o  The errors listed above are treated as follows:
>
>      1., 2., 6., 7.:  The attribute MUST be treated as malformed and
>         the "treat-as-withdraw" approach used as per [RFC7606].
>
>      3.:  Unknown TLVs SHOULD be ignored, and message processing SHOULD
>         continue.
>
>      4.:  Treated as a malformed message and the "treat-as-withdraw"
>         approach used as per [RFC7606]
>
>      5., 8.:  The absence of an RD with which to correlate is nothing
>         more than a soft error.  The receiver SHOULD store the
>         information from the SFP attribute until a corresponding
>         advertisement is received.  An implementation MAY time out such
>         stored SFP attributes to avoid becoming overloaded.
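
The treatment rules quoted above can be sketched as a lookup table in Python. This is only an illustration: the error numbers follow the quoted list (with item 7 read as "no sub-TLV found in a Hop TLV", per the correction earlier in this message), while the names `Disposition`, `SFP_ERROR_DISPOSITION` and `overall_disposition`, and the rule that the most severe disposition wins when several errors are found, are assumptions of the sketch rather than text from the draft.

```python
from enum import Enum


class Disposition(Enum):
    TREAT_AS_WITHDRAW = "treat-as-withdraw"
    IGNORE_TLV = "ignore unknown TLV and continue"
    SOFT_ERROR = "store until a matching advertisement arrives"


# Keys are the error numbers from the quoted list; comments paraphrase them.
SFP_ERROR_DISPOSITION = {
    1: Disposition.TREAT_AS_WITHDRAW,  # Optional bit set to 0
    2: Disposition.TREAT_AS_WITHDRAW,  # Transitive bit set to 0
    3: Disposition.IGNORE_TLV,         # unknown TLV type
    4: Disposition.TREAT_AS_WITHDRAW,  # TLV length overruns the attribute
    5: Disposition.SOFT_ERROR,         # Association TLV has unknown SFPR-RD
    6: Disposition.TREAT_AS_WITHDRAW,  # no Hop TLV in the SFP attribute
    7: Disposition.TREAT_AS_WITHDRAW,  # no sub-TLV in a Hop TLV
    8: Disposition.SOFT_ERROR,         # unknown SFIR-RD in a Hop TLV
}


def overall_disposition(errors):
    """Return the most severe disposition for the errors found, or None.

    Assumption of this sketch: treat-as-withdraw outranks a soft error,
    which outranks ignoring an unknown TLV.
    """
    found = {SFP_ERROR_DISPOSITION[number] for number in errors}
    for candidate in (Disposition.TREAT_AS_WITHDRAW,
                      Disposition.SOFT_ERROR,
                      Disposition.IGNORE_TLV):
        if candidate in found:
            return candidate
    return None
```

For example, an attribute with only an unknown TLV type (error 3) is processed normally apart from that TLV, while the same attribute with a missing Hop TLV (error 6) as well would be handled as treat-as-withdraw.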
>
> [SLI] That's not really an error: there may be many transient situations
> where some routes haven't been learned yet, leading to this situation.
> I don't think that we need to time out (even optionally), as timing out may
> create inconsistencies in the control plane. What you could suggest instead
> is raising an alarm to let the user know that something is wrong.

Timing this out is a housekeeping thing. Without it there can be a slow leak of router resources. It might not be large, but over time (and with a buggy implementation somewhere else in the network) it could add up.

But:
1. We should give guidance on the time-out. A largish time-out of the order of 30 minutes would be fine.
2. Yes, there should be a user notification when this happens.

So...

      5., 8.:  The absence of an RD with which to correlate is nothing
         more than a soft error.  The receiver SHOULD store the
         information from the SFP attribute until a corresponding
         advertisement is received.  An implementation MAY time out such
         stored SFP attributes to avoid becoming overloaded.  The time-out
         value should be configurable and measured in minutes; a default
         value of 30 minutes is suggested.  Whenever an implementation
         removes a stored SFP attribute, it SHOULD generate a notification
         to the network operator.
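
As a rough illustration of the behavior described in this text, the following Python sketch stores SFP attributes that wait on an unknown RD, expires them after a configurable number of minutes (30 by default), and notifies the operator on removal. The class `PendingSfpStore`, its method names, the injected clock, and the example RD and attribute values are all hypothetical; the draft does not prescribe any particular data structure.

```python
import time

DEFAULT_TIMEOUT_MINUTES = 30  # default suggested in the proposed text


class PendingSfpStore:
    """Holds SFP attributes that reference an RD not yet advertised."""

    def __init__(self, timeout_minutes=DEFAULT_TIMEOUT_MINUTES,
                 notify=print, clock=time.monotonic):
        self._timeout = timeout_minutes * 60
        self._notify = notify   # operator notification hook (assumed)
        self._clock = clock     # injectable so the sketch can be tested
        self._pending = {}      # missing RD -> [(stored_at, sfp_attribute)]

    def store(self, missing_rd, sfp_attribute):
        entry = (self._clock(), sfp_attribute)
        self._pending.setdefault(missing_rd, []).append(entry)

    def resolve(self, rd):
        """Return the attributes that were waiting for this RD."""
        return [attr for _, attr in self._pending.pop(rd, [])]

    def expire(self):
        """Drop entries older than the time-out; return how many were dropped."""
        now = self._clock()
        removed = 0
        for rd in list(self._pending):
            keep = []
            for stored_at, attr in self._pending[rd]:
                if now - stored_at >= self._timeout:
                    self._notify(f"removed stored SFP attribute {attr!r} "
                                 f"waiting for RD {rd}")
                    removed += 1
                else:
                    keep.append((stored_at, attr))
            if keep:
                self._pending[rd] = keep
            else:
                del self._pending[rd]
        return removed


# Usage with a fake clock (seconds) instead of real time.
now = [0.0]
notes = []
store = PendingSfpStore(notify=notes.append, clock=lambda: now[0])
store.store("192.0.2.1:100", "sfp-1")
now[0] = 29 * 60
expired_early = store.expire()
now[0] = 31 * 60
expired_late = store.expire()
late_match = store.resolve("192.0.2.1:100")
```

In this run nothing is removed at 29 minutes, the entry is removed with one notification at 31 minutes, and a matching advertisement that arrives afterwards finds nothing waiting. That last step is the case where the SFPR would have to be readvertised.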

> * FLOWSPEC traffic steering:
>>>> NEW COMMENT:
>>>> Section 5:
>>>> "Note that each FlowSpec update MUST be tagged with the route target
>>>>  of the overlay or VPN network for which it is intended."
>>> [SLI] You should be more clear that VPN-IPv4 and VPN-IPv6 Flowspec
>>> families must be used, it's not just a matter of RTs.
>>
>> A couple of the authors have discussed this a bit and we are puzzled.
>>
>> RFC 5575 section 8 discusses the applicability of Flowspecs to VPNs.
>> https://www.iana.org/assignments/flow-spec/flow-spec.xhtml#flow-spec-2 
>> does not list any VPN Flowspecs.
>> draft-ietf-pce-pcep-flowspec makes observations about VPN identification
>> and applicability to Flowspecs.
>> draft-ietf-idr-flowspec-l2vpn has a redefinition of SAFI 134 to apply
>> Flowspecs to an L2VPN environment.
>>
>> [SLI2] I see various cases:
>> - Traffic comes from an IPv4 VPN and should be steered onto an SFC: in this
>>   case RFC5575 Section 8 (AFI/SAFI 1/134) must be used, and an RT is attached.
>> - Traffic comes from the global routing table and should be steered onto an
>>   SFC: in this case the base RFC5575 using AFI/SAFI 1/133 must be used,
>>   and there is no RT attached. The trick here is that the action must
>>   identify both the SFC you want the traffic steered onto and the VPN in
>>   which to look up that SFC (like a redirect RT + SFC steering). If there is
>>   no VPN redirection, the SFC is considered to be available in the global
>>   routing table.
>> - Traffic comes from an L2VPN: this is similar to the L3VPN case.
>> - The same considerations apply to IPv6 and VPNv6.
>
> OK, I think we need to separate two things.
> Section 5 is concerned with how to select among possible next hops on an
> SFP. That is, the packet is already classified and assigned to an SFP, but
> some load-balancing choices have to be made. So I don't think Section 5 is
> the place for this discussion.
>
> However, Section 7.4 is about classifying traffic onto SFPs. Specifically, it is
> about how to indicate, using BGP, which traffic flows should be assigned
> to which SFP at the Classifier component of the SFC system. This section
> seems to be relevant to the question you are asking.
>
> But this second section seems to have it all covered by modelling exactly
> the flowspec function used in BGP flow specification and adding an
> extended community for SFC.
>
> [SLI] The comment was for section 7.4 (not section 5, sorry for the bad pointer).
> Only the last sentence is raising a concern on my side. It makes me think that
> it only applies to FlowSpec VPN (1/134) while it could be used in many use 
> cases. As you are just adding a new action extended community, maybe you
> can just remove the last sentence. Adding RTs or not will depend on the
> context the action will be used in. We may need a review from the IDR guys
> on this section. 

There are two separate things:
a) Place traffic from a VPN onto a specific SFC (use 1/134)
b) Place traffic onto an SFC that flows through a specific overlay or VPN 
   (use RT)
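
For the first of these, the choice of address family discussed earlier in the thread can be sketched in Python as follows. The sketch covers only the IP cases: SAFI 133 for traffic from the global routing table and SAFI 134 for traffic from a VPN, with AFI 1 for IPv4 and AFI 2 for IPv6. The function name `flowspec_family` is hypothetical, and the L2VPN case is deliberately left out.

```python
AFI_BY_IP_VERSION = {4: 1, 6: 2}  # address family numbers for IPv4 and IPv6
SAFI_FLOWSPEC = 133               # base Flow Specification
SAFI_FLOWSPEC_VPN = 134           # Flow Specification for VPNs


def flowspec_family(ip_version, from_vpn):
    """Return the (AFI, SAFI) pair for traffic of the given kind."""
    afi = AFI_BY_IP_VERSION[ip_version]
    safi = SAFI_FLOWSPEC_VPN if from_vpn else SAFI_FLOWSPEC
    return afi, safi


ipv4_vpn = flowspec_family(4, from_vpn=True)
ipv4_global = flowspec_family(4, from_vpn=False)
ipv6_vpn = flowspec_family(6, from_vpn=True)
```

The route target that identifies the overlay in which the SFC lives is a separate matter and is not modelled here.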

We can do...
OLD
   Note that each FlowSpec update MUST be tagged with the route target
   of the overlay or VPN network for which it is intended to put the
   indicated SPI into context.
NEW
   One of the filters that the Flow Spec may describe is the VPN to 
   which the traffic belongs.  Additionally, note that to put the
   indicated SPI into context when multiple SFC overlays are present in
   one network, each FlowSpec update MUST be tagged with the route 
   target of the overlay or VPN network for which it is intended.
END


You also had a separate email exchange with me about FlowSpec actions. You flagged this up with the IDR chairs and it was noted that rfc5575bis says:
   Some traffic action communities may interfere with each other.
   Section 7.6 of this specification provides general considerations on
   such traffic action interference.  Any additional definition of a
   traffic actions specified by additional standards documents or vendor
   documents MUST specify if the traffic action interacts with an
   existing traffic actions, and provide error handling per [RFC7606].
John Scudder sent a note to the IDR list highlighting section 7.4 and asking for any input, but there was no comment raised.

However, we have returned to the relevant text and discussed it with the co-author who requested what was in earlier versions. We are now all in agreement that the text should be:

OLD
   [RFC5575] defines a set of BGP routes that can be used to identify
   the packets in a given flow using fields in the header of each
   packet, and a set of actions, encoded as extended communities, that
   can be used to disposition those packets.  This document enables the
   use of RFC 5575 mechanisms by SFC Classifiers by defining a new
   action extended community called "Flow Spec for SFC classifiers"
   identified by the value TBD4.  Note that other action extended
   communities may also be present.
NEW
    [RFC5575]  and [I-D.ietf-idr-rfc5575bis] define a set of BGP routes
    that can be used to identify the packets in a given flow using fields
    in the header of each packet, and a set of actions, encoded as
    extended communities, that can be used to disposition those
    packets.  This document enables the use of these mechanisms by
    SFC Classifiers by defining a new action extended community called
    "Flow Spec for SFC Classifiers" identified by the value TBD4.  Note
    that other action extended communities MUST NOT be present at
    the same time: the inclusion of the "Flow Spec for SFC Classifiers"
    action extended community along with any other action MUST be 
    treated as an error which SHOULD result in the Flow Specification
    UPDATE message being handled as Treat-as-withdraw according to
    [RFC7606] Section 2.
END
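
The rule in the NEW text can be sketched as a simple check in Python. The action names are placeholders chosen for the sketch (the code point for the new extended community is still TBD4), and the function `check_flowspec_actions` is hypothetical; it only shows the intended outcome, not how an implementation would parse extended communities.

```python
# Placeholder label for the new action; its real code point is TBD4.
SFC_CLASSIFIER_ACTION = "flow-spec-for-sfc-classifiers"


def check_flowspec_actions(actions):
    """Return the handling for a FlowSpec UPDATE given its action names.

    The SFC classifier action must not be combined with any other
    action; if it is, the UPDATE is handled as treat-as-withdraw.
    """
    distinct = set(actions)
    if SFC_CLASSIFIER_ACTION in distinct and len(distinct) > 1:
        return "treat-as-withdraw"
    return "accept"


sfc_only = check_flowspec_actions([SFC_CLASSIFIER_ACTION])
sfc_plus_rate = check_flowspec_actions([SFC_CLASSIFIER_ACTION, "traffic-rate"])
no_sfc = check_flowspec_actions(["traffic-rate", "traffic-marking"])
```

Only the mixed case is rejected; other combinations of actions are outside the scope of this rule and are accepted by the sketch unchanged.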

I am sending that specific change to IDR as well.

The -11 version will be posted SOON.

Best,
Adrian






