Re: [bess] Shepherd's review of draft-ietf-bess-nsh-bgp-control-plane-06

"Adrian Farrel" <adrian@olddog.co.uk> Thu, 30 May 2019 14:26 UTC

Reply-To: adrian@olddog.co.uk
From: Adrian Farrel <adrian@olddog.co.uk>
To: stephane.litkowski@orange.com, draft-ietf-bess-nsh-bgp-control-plane@ietf.org
Cc: bess@ietf.org
References: <6687_1551262912_5C7664C0_6687_242_18_9E32478DFA9976438E7A22F69B08FF924C199D40@OPEXCAUBMA3.corporate.adroot.infra.ftgroup> <090901d4d063$75aa6cf0$60ff46d0$@olddog.co.uk> <30790_1551796864_5C7E8A80_30790_14_1_9E32478DFA9976438E7A22F69B08FF924C19B882@OPEXCAUBMA3.corporate.adroot.infra.ftgroup> <036f01d4d42e$0ec01340$2c4039c0$@olddog.co.uk> <2409_1551949080_5C80DD18_2409_404_24_9E32478DFA9976438E7A22F69B08FF924C1A4AF2@OPEXCAUBMA3.corporate.adroot.infra.ftgroup> <070e01d4fac4$8478a4a0$8d69ede0$@olddog.co.uk> <6812_1556525855_5CC6B31F_6812_378_8_9E32478DFA9976438E7A22F69B08FF924C1F0D2D@OPEXCAUBMA3.corporate.adroot.infra.ftgroup>
In-Reply-To: <6812_1556525855_5CC6B31F_6812_378_8_9E32478DFA9976438E7A22F69B08FF924C1F0D2D@OPEXCAUBMA3.corporate.adroot.infra.ftgroup>
Date: Thu, 30 May 2019 14:45:26 +0100
Organization: Old Dog Consulting
Message-ID: <038b01d516ed$f0a0fc00$d1e2f400$@olddog.co.uk>
Archived-At: <https://mailarchive.ietf.org/arch/msg/bess/tKeTWqguK1kxI3Vy0ycQWnE1dy4>

Hi Stephane,

Thanks again for the thoroughness of your review and the time it has taken to herd the necessary cats.

* BGP ERROR HANDLING:
>>>> I don’t see the “error handling” behavior associated with this attribute
>>>> (discard, treat-as-withdraw…)
>>>
>>> I think the errors are covered by section 6 of RFC 4271, but we need to
>>> point to it.
>>>
>>> [SLI] You have added "Malformed SFP attributes, or those that are in
>>> error in some way, MUST be handled as described in Section 6 of [RFC4271]"
>>> This is not enough, as RFC7606 allows for a more "graceful" processing of
>>> errors and it's up to each new attribute to define its own behavior in
>>> terms of error processing. RFC7606 has some guidelines.
>>
>> This one will take a little more time to work up some text.
>> We'll get back to you.
>>
>> [SLI2] This is an important thing to address, and the IESG or Routing Directorate
>> may catch that as well.
>
> OK, a chunk more text. Does this work for you?
>
>   Section 6 of [RFC4271] describes the handling of malformed BGP
>   attributes, or those that are in error in some way.  [RFC7606]
>   revises BGP error handling specifically for the UPDATE message,
>   provides guidelines for the authors of documents defining new
>   attributes, and revises the error handling procedures for a number of
>   existing attributes.  This document introduces the SFP attribute and
>   so defines error handling as follows:
>
>   o  When parsing a message, an unknown Attribute Type code or a length
>      that suggests that the attribute is longer than the remaining
>      message is treated as a malformed message and the "treat-as-
>      withdraw" approach used as per [RFC7606].
>
>   o  When parsing a message that contains an SFP attribute, the
>      following cases constitute errors:
>
>      1.  Optional bit is set to 0 in SFP attribute.
>
>      2.  Transitive bit is set to 0 in SFP attribute.
>
>      3.  Unknown TLV type field found in SFP attribute.
>
>      4.  TLV length that suggests the TLV extends beyond the end of the
>          SFP attribute.
>
>      5.  Association TLV contains an unknown SFPR-RD.
>
> [SLI] That's a bit weird to find this here as the Association TLV hasn't been introduced.
> Wouldn't it be better to add a dedicated Error Handling section (e.g. 3.2.3) after all the encodings?

I think we introduced it in the text at the top of the page (i.e. a few paragraphs earlier). It reads OK to me and it is better to group together the format handling issues in one place and with the text that describes the presence rules.

>      6.  No Hop TLV found in the SFP attribute.
>
>      7.  No SFT TLV found in a Hop TLV.
>
> [SLI] That's strange, as section 3.2.1.3 says that the SFT TLV MAY be included, so optional...

Ah, this should read "No sub-TLV found in a Hop TLV".
Per 3.2.1.2 "At least one sub-TLV MUST be present."

>      8.  Unknown SFIR-RD found in a Hop TLV.
>
>   o  The errors listed above are treated as follows:
>
>      1., 2., 6., 7.:  The attribute MUST be treated as malformed and
>         the "treat-as-withdraw" approach used as per [RFC7606].
>
>      3.:  Unknown TLVs SHOULD be ignored, and message processing SHOULD
>         continue.
>
>      4.:  Treated as a malformed message and the "treat-as-withdraw"
>         approach used as per [RFC7606].
>
>      5., 8.:  The absence of an RD with which to correlate is nothing
>         more than a soft error.  The receiver SHOULD store the
>         information from the SFP attribute until a corresponding
>         advertisement is received.  An implementation MAY time-out such
>         stored SFP attributes to avoid becoming over-loaded.
>
> [SLI] That's not really an error: there may be a lot of transient situations
> where some routes haven't been learned yet, leading to such a situation.
> I don't think that we need to time-out (even optionally), as timing out may
> create inconsistencies in the control plane. What you could suggest is
> alarming to let the user know that something wrong is happening.

Timing this out is a housekeeping measure. Without it, router resources can leak. The leak might not be large, but over time (and with a buggy implementation somewhere else in the network) it could add up.

But:
1. We should give guidance on the time-out. A largish time-out of the order of 30 minutes would be fine.
2. Yes, there should be a user notification when this happens.

So...

      5., 8.:  The absence of an RD with which to correlate is nothing
         more than a soft error.  The receiver SHOULD store the
         information from the SFP attribute until a corresponding
         advertisement is received.  An implementation MAY time-out such
         stored SFP attributes to avoid becoming over-loaded.  The time-out
         value should be configurable and measured in minutes; a default
         value of 30 minutes is suggested.  Whenever an implementation
         removes a stored SFP attribute, it SHOULD generate a notification
         to the network operator.
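
For anyone wanting to see how the pieces fit together, here is a minimal Python sketch of the handling proposed above. It is only an illustration under stated assumptions, not text for the draft and not taken from any implementation: the names Action, ERROR_ACTIONS and PendingStore are hypothetical, time is passed in explicitly as seconds, and the store keeps at most one pending SFP attribute per RD for simplicity.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Callable


class Action(Enum):
    TREAT_AS_WITHDRAW = auto()
    IGNORE_TLV = auto()
    STORE_PENDING = auto()


# Error case number (as listed in the proposed text) -> proposed handling.
ERROR_ACTIONS = {
    1: Action.TREAT_AS_WITHDRAW,  # Optional bit set to 0
    2: Action.TREAT_AS_WITHDRAW,  # Transitive bit set to 0
    3: Action.IGNORE_TLV,         # unknown TLV type: ignore, keep processing
    4: Action.TREAT_AS_WITHDRAW,  # TLV length runs past the attribute
    5: Action.STORE_PENDING,      # unknown SFPR-RD in Association TLV
    6: Action.TREAT_AS_WITHDRAW,  # no Hop TLV in the SFP attribute
    7: Action.TREAT_AS_WITHDRAW,  # no sub-TLV in a Hop TLV
    8: Action.STORE_PENDING,      # unknown SFIR-RD in a Hop TLV
}

DEFAULT_TIMEOUT_SECONDS = 30 * 60  # suggested default of 30 minutes


@dataclass
class PendingStore:
    """Holds SFP attributes that reference an RD not yet advertised."""

    timeout: float = DEFAULT_TIMEOUT_SECONDS   # configurable
    notify: Callable[[str], None] = print      # operator notification hook
    _entries: dict = field(default_factory=dict)  # rd -> (attribute, stored_at)

    def store(self, rd: str, attribute: bytes, now: float) -> None:
        self._entries[rd] = (attribute, now)

    def resolve(self, rd: str):
        """Call when the advertisement for rd arrives; returns the stored
        attribute, or None if nothing was waiting for that RD."""
        entry = self._entries.pop(rd, None)
        return entry[0] if entry else None

    def expire(self, now: float) -> list:
        """Remove entries older than the time-out and notify the operator
        once per removed entry. Returns the list of expired RDs."""
        expired = [rd for rd, (_, stored_at) in self._entries.items()
                   if now - stored_at >= self.timeout]
        for rd in expired:
            del self._entries[rd]
            self.notify(f"stored SFP attribute for RD {rd} timed out")
        return expired
```

Since the time-out is a MAY, an implementation that never calls expire() would still be consistent with the proposed text; the notification is tied only to the removal of a stored attribute.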

> * FLOWSPEC traffic steering:
>>>> NEW COMMENT:
>>>> Section 5:
>>>> "Note that each FlowSpec update MUST be tagged with the route target
>>>>  of the overlay or VPN network for which it is intended."
>>> [SLI] You should be more clear that VPN-IPv4 and VPN-IPv6 Flowspec
>>> families must be used, it's not just a matter of RTs.
>>
>> A couple of the authors have discussed this a bit and we are puzzled.
>>
>> RFC 5575 section 8 discusses the applicability of Flowspecs to VPNs.
>> https://www.iana.org/assignments/flow-spec/flow-spec.xhtml#flow-spec-2 
>> does not list any VPN Flowspecs.
>> draft-ietf-pce-pcep-flowspec makes observations about VPN identification
>> and applicability to Flowspecs.
>> draft-ietf-idr-flowspec-l2vpn has a redefinition of SAFI 134 to apply
>> Flowspecs to an L2VPN environment.
>>
>> [SLI2] I see various cases:
>> - traffic is coming from an IPVPNv4 and should be steered on an SFC, in such
>>   a case RFC5575 Section 8 (AFI/SAFI 1/134) must be used, an RT is attached.
>> - traffic is coming from the global routing table and should be steered on an
>>   SFC, in such a case  the base RFC5575 using AFI/SAFI 1/133 must be used
>>   and there is no RT attached. The trick here is that you need to set in the
>>   action: - the SFC you want the traffic to be steered on as well as the VPN
>>   to look the SFC for (Like a redirect RT + SFC steering). If there is no VPN
>>   redirection, the SFC is considered to be available in the global routing table.
>> - traffic is coming from an L2VPN, this is similar to the L3VPN case.
>> - the same considerations apply to IPv6 and VPNv6.
>
> OK, I think we need to separate two things.
> Section 5 is concerned with how to select among possible next hops on an
> SFP. That is, the packet is already classified and assigned to an SFP, but
> some load-balancing choices have to be made. So I don't think Section 5 is
> the place for this discussion.
>
> However, Section 7.4 is about classifying traffic onto SFPs. Specifically, it is
> about how to indicate, using BGP, which traffic flows should be assigned
> to which SFP at the Classifier component of the SFC system. This section
> seems to be relevant to the question you are asking.
>
> But this second section seems to have it all covered by modelling exactly
> the flowspec function used in BGP flow specification and adding an
> extended community for SFC.
>
> [SLI] The comment was for section 7.4 (not section 5, sorry for the bad pointer).
> Only the last sentence is raising a concern on my side. It makes me think that
> it only applies to FlowSpec VPN (1/134) while it could be used in many use 
> cases. As you are just adding a new action extended community, maybe you
> can just remove the last sentence. Adding RTs or not will depend on the
> context the action will be used in. We may need a review from the IDR guys
> on this section. 

There are two separate things:
a) Place traffic from a VPN onto a specific SFC (use 1/134)
b) Place traffic onto an SFC that flows through a specific overlay or VPN 
   (use RT)

We can do...
OLD
   Note that each FlowSpec update MUST be tagged with the route target
   of the overlay or VPN network for which it is intended to put the
   indicated SPI into context.
NEW
   One of the filters that the Flow Spec may describe is the VPN to 
   which the traffic belongs.  Additionally, note that to put the
   indicated SPI into context when multiple SFC overlays are present in
   one network, each FlowSpec update MUST be tagged with the route 
   target of the overlay or VPN network for which it is intended.
END


You also had a separate email exchange with me about FlowSpec actions. You flagged this up with the IDR chairs and it was noted that rfc5575bis says:
   Some traffic action communities may interfere with each other.
   Section 7.6 of this specification provides general considerations on
   such traffic action interference.  Any additional definition of a
   traffic actions specified by additional standards documents or vendor
   documents MUST specify if the traffic action interacts with an
   existing traffic actions, and provide error handling per [RFC7606].
John Scudder sent a note to the IDR list highlighting section 7.4 and asking for any input, but there was no comment raised.

However, we have returned to the relevant text and discussed it with the co-author who requested what was in earlier versions. We are now all in agreement that the text should be:

OLD
   [RFC5575] defines a set of BGP routes that can be used to identify
   the packets in a given flow using fields in the header of each
   packet, and a set of actions, encoded as extended communities, that
   can be used to disposition those packets.  This document enables the
   use of RFC 5575 mechanisms by SFC Classifiers by defining a new
   action extended community called "Flow Spec for SFC classifiers"
   identified by the value TBD4.  Note that other action extended
   communities may also be present.
NEW
    [RFC5575] and [I-D.ietf-idr-rfc5575bis] define a set of BGP routes
    that can be used to identify the packets in a given flow using fields
    in the header of each packet, and a set of actions, encoded as
    extended communities, that can be used to disposition those
    packets.  This document enables the use of these mechanisms by
    SFC Classifiers by defining a new action extended community called
    "Flow Spec for SFC Classifiers" identified by the value TBD4.  Note
    that other action extended communities MUST NOT be present at
    the same time: the inclusion of the "Flow Spec for SFC Classifiers"
    action extended community along with any other action MUST be 
    treated as an error which SHOULD result in the Flow Specification
    UPDATE message being handled as Treat-as-withdraw according to
    [RFC7606] Section 2.
END
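
As a rough illustration of the NEW text, the check a receiver would make could look like the Python sketch below. This is an assumption-laden sketch rather than anything from the draft: actions are modelled as plain strings, SFC_CLASSIFIER_ACTION is a hypothetical stand-in for the extended community identified by TBD4, and the function name is made up.

```python
# Hypothetical stand-in for the "Flow Spec for SFC Classifiers" action
# extended community (value TBD4 in the draft).
SFC_CLASSIFIER_ACTION = "flow-spec-for-sfc-classifiers"


def check_flowspec_actions(actions):
    """Return "treat-as-withdraw" if the SFC classifier action is combined
    with any other action extended community, otherwise "accept"."""
    distinct = set(actions)
    if SFC_CLASSIFIER_ACTION in distinct and len(distinct) > 1:
        # The combination MUST be treated as an error; treat-as-withdraw
        # per RFC 7606 is the SHOULD-level outcome in the proposed text.
        return "treat-as-withdraw"
    return "accept"
```

For example, check_flowspec_actions([SFC_CLASSIFIER_ACTION, "traffic-rate"]) yields "treat-as-withdraw", while either action on its own is accepted.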

I am sending that specific change to IDR as well.

The -11 version will be posted SOON.

Best,
Adrian