Re:[nvo3] BFD over VXLAN: Trapping BFD Control packet at VTEP

<xiao.min2@zte.com.cn> Thu, 26 September 2019 06:22 UTC

Date: Thu, 26 Sep 2019 14:20:59 +0800
Message-ID: <201909261420594906017@zte.com.cn>
In-Reply-To: <424bb1af-1ae9-5d60-c07c-3e53917821ae@joelhalpern.com>
References: 201909251039413767352@zte.com.cn, 424bb1af-1ae9-5d60-c07c-3e53917821ae@joelhalpern.com
Mime-Version: 1.0
From: xiao.min2@zte.com.cn
To: jmh@joelhalpern.com
Cc: rtg-bfd@ietf.org, nvo3@ietf.org, tsridhar@vmware.com, bfd-chairs@ietf.org
Subject: Re:[nvo3] BFD over VXLAN: Trapping BFD Control packet at VTEP
Archived-At: <https://mailarchive.ietf.org/arch/msg/rtg-bfd/nnmfXbDgGpnsh_GRQFg66ssBPbQ>

Hi Joel,

Thanks for your comments.

I don't think a VNI can own an IP address or a MAC address. If the BFD messages are originated and terminated at the VNI, what addresses would the BFD messages use?


As for the VAP, RFC 8014 defines it as follows:

"On the NVE side, a VAP is a logical network port (virtual or physical) into a specific virtual network."

I interpret this definition to mean that a VAP can own an IP address and a MAC address, so I tend to believe the BFD messages are originated and terminated at the VAP.

P.S. I trimmed this mail to meet the size limit. My last mail was too big, which resulted in a warning from rtg-bfd-owner@ietf.org.






Best Regards,

Xiao Min



Original Message

From: Joel M. Halpern <jmh@joelhalpern.com>
To: Xiao Min (肖敏10093570);
Cc: rtg-bfd@ietf.org <rtg-bfd@ietf.org>; nvo3@ietf.org <nvo3@ietf.org>; tsridhar@vmware.com <tsridhar@vmware.com>; bfd-chairs@ietf.org <bfd-chairs@ietf.org>;
Date: 2019-09-25 11:00
Subject: Re: [nvo3] BFD over VXLAN: Trapping BFD Control packet at VTEP


As far as I can tell, the current document we have in front of us is 
explicit that the messages are originated and terminated at the VNI.  If 
you want some other behavior, then we need a document that describes 
that behavior.

Yours,
Joel

On 9/24/2019 10:39 PM, xiao.min2@zte.com.cn wrote:
> Hi Santosh,
> 
> 
> With regard to the question of whether we should allow multiple BFD 
> sessions for the same VNI, IMHO we should allow it. My 
> explanation follows.
> 
> Below is a figure derived from figure 2 of RFC8014 (An Architecture for 
> Data-Center Network Virtualization over Layer 3 (NVO3)).
> 
>                      |         Data Center Network (IP)        |
>                      |                                         |
>                      +-----------------------------------------+
>                           |                           |
>                           |       Tunnel Overlay      |
>              +------------+---------+       +---------+------------+
>              | +----------+-------+ |       | +-------+----------+ |
>              | |  Overlay Module  | |       | |  Overlay Module  | |
>              | +---------+--------+ |       | +---------+--------+ |
>              |           |          |       |           |          |
>       NVE1   |           |          |       |           |          | NVE2
>              |  +--------+-------+  |       |  +--------+-------+  |
>              |  |VNI1 VNI2  VNI1 |  |       |  | VNI1 VNI2 VNI1 |  |
>              |  +-+-----+----+---+  |       |  +-+-----+-----+--+  |
>              |VAP1| VAP2|    | VAP3 |       |VAP1| VAP2|     | VAP3|
>              +----+-----+----+------+       +----+-----+-----+-----+
>                   |     |    |                   |     |     |
>                   |     |    |                   |     |     |
>                   |     |    |                   |     |     |
>            -------+-----+----+-------------------+-----+-----+-------
>                   |     |    |     Tenant        |     |     |
>              TSI1 | TSI2|    | TSI3          TSI1| TSI2|     |TSI3
>                  +---+ +---+ +---+             +---+ +---+   +---+
>                  |TS1| |TS2| |TS3|             |TS4| |TS5|   |TS6|
>                  +---+ +---+ +---+             +---+ +---+   +---+
> 
> To my understanding, the BFD sessions between NVE1 and NVE2 are actually 
> initiated and terminated at the VAPs of the NVEs.
> 
> Suppose the network operator wants to set up one BFD session between VAP1 of 
> NVE1 and VAP1 of NVE2, and at the same time another BFD session between VAP3 
> of NVE1 and VAP3 of NVE2. Although the two BFD sessions are for the same 
> VNI1, I believe that is reasonable, and that's why I think we should allow it.
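
To make the scenario above concrete, here is a minimal Python sketch of a session table that permits two BFD sessions on the same VNI as long as they run between different VAP pairs. The key layout, the VAP labels, and the session names are hypothetical illustrations only; neither RFC 8014 nor the draft defines them.

```python
from typing import Dict, NamedTuple


class SessionKey(NamedTuple):
    """Hypothetical key: one BFD session per (VNI, local VAP, remote VAP)."""
    vni: int
    local_vap: str
    remote_vap: str


sessions: Dict[SessionKey, str] = {}


def add_session(vni: int, local_vap: str, remote_vap: str, name: str) -> None:
    """Register a session; reject a duplicate for the same VAP pair and VNI."""
    key = SessionKey(vni, local_vap, remote_vap)
    if key in sessions:
        raise ValueError(f"session already exists for {key}")
    sessions[key] = name


# Two sessions on VNI 1, mirroring the VAP1-VAP1 and VAP3-VAP3 example.
add_session(1, "NVE1/VAP1", "NVE2/VAP1", "bfd-a")
add_session(1, "NVE1/VAP3", "NVE2/VAP3", "bfd-b")

sessions_on_vni1 = [name for key, name in sessions.items() if key.vni == 1]
```

If all Tenant Systems on one Virtual Network had to share a single VAP, the two keys would collapse into one and the second `add_session` call would be rejected.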
> 
> 
> Of course, in RFC8014 it also says:
> 
> "Note that two different Tenant Systems (and TSIs) attached to a common NVE can share a VAP (e.g., TS1 and TS2 in Figure 2) so long as they connect to the same Virtual Network."
> 
> Some people may argue that all Tenant Systems connecting to the same 
> Virtual Network MUST share one VAP. If that's true, then VAP1 and VAP3 
> should merge into one VAP and my explanation doesn't work. I am copying the 
> NVO3 WG to involve more experts, and I hope for your clarifications and comments.
> 
> 
> Best Regards,
> 
> Xiao Min
> 
> Original Message
> *From:* Santosh P K <santosh.pallagatti@gmail.com>
> *To:* Greg Mirsky <gregimirsky@gmail.com>;
> *Cc:* draft-ietf-bfd-vxlan@ietf.org 
> <draft-ietf-bfd-vxlan@ietf.org>; Dinesh Dutt <didutt@gmail.com>; rtg-bfd 
> WG <rtg-bfd@ietf.org>; Joel M. Halpern <jmh@joelhalpern.com>; T. Sridhar 
> <tsridhar@vmware.com>; bfd-chairs@ietf.org <bfd-chairs@ietf.org>;
> *Date:* 2019-09-23 05:39
> *Subject:* *Re: BFD over VXLAN: Trapping BFD Control packet at VTEP*
> Greg,
>      Please see inline reply tagged [SPK]. I have added text requested.
> 
> Thanks
> Santosh P K
> 
> On Fri, Aug 16, 2019 at 4:59 AM Greg Mirsky <gregimirsky@gmail.com 
> <mailto:gregimirsky@gmail.com>> wrote:
> 
>     Hi Santosh,
>     thank you for your comments. Please find my notes in-lined and
>     tagged GIM>>.
> 
>     Regards,
>     Greg
> 
>     On Tue, Aug 13, 2019 at 10:24 PM Santosh P K
>     <santosh.pallagatti@gmail.com <mailto:santosh.pallagatti@gmail.com>>
>     wrote:
> 
>         Greg,
>             Thanks for the updated version of the document. Here are a few
>         comments on the new draft.
> 
>         Section 4:
>         Destination MAC: This MUST NOT be of one of tenant's MAC
>                   addresses.  The MAC address MAY be configured, or it MAY be
>                   learned via a control plane protocol.  The details of how the
>                   MAC address is obtained are outside the scope of this document.
> 
>         I think we may need to give background on why we are saying the MAC
>         address MUST NOT be one of the tenant's MAC addresses. As discussed in
>         this thread, one of the tenants could have borrowed
>         the VTEP MAC address, and if we have to use BFD then we
>         need to avoid that conflict to ensure BFD packets get observed
>         in the VTEP itself. Should we add a section before 4 to set that
>         context so that the above text makes more sense?
> 
>     GIM>> Certainly. Please share the text you'd like to add. 
> 
> [SPK]  Proposed text for why we should not use the VTEP MAC address as 
> a tenant MAC address.
> 
> "In some scenarios the tenant MAC is borrowed from the VTEP MAC address. VXLAN 
> BFD MUST terminate the BFD session at the VTEP and MUST NOT forward BFD packets 
> to tenants. To terminate VXLAN BFD packets at the VTEP, the deployment MUST 
> ensure that no tenant VM borrows the VTEP MAC address."
> 
> 
> 
>             IP header:
>                   Destination IP: IP address MUST NOT be of one of tenant's IP
>                   addresses.  IP address MAY be selected from the range 127/8 for
>                   IPv4, for IPv6 - from the range 0:0:0:0:0:FFFF:7F00:0/104.
> 
>                   TTL: MUST be set to 1 to ensure that the BFD packet is not
>                   routed within the L3 underlay network.
> 
> 
>         I think we had added some text to address Sridhar's comments on
>         why the TTL MUST be 1 and the dest IP address MUST be in the 127/8
>         range. I see that text is missing now.
> 
>     GIM>> My apologies for failing to include the text from another
>     discussion thread. I believe the following would be complete:
>                TTL or Hop Limit: MUST be set to 1 to ensure that the BFD
>               packet is not routed within the Layer 3 underlay network.
>               This addresses the scenario when the inner IP destination
>               address is of VXLAN gateway and there is a router in underlay which
>               removes the VXLAN header, then it is possible to route the
>               packet as VXLAN  gateway address is routable address.
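
As a rough illustration of the inner-header rules quoted above, the following Python sketch checks a received BFD packet's inner destination address and TTL/Hop Limit. It assumes a deployment that chooses the inner destination address from the loopback ranges (the draft text only says MAY), and the function name `inner_header_ok` is hypothetical.

```python
import ipaddress

# Ranges named in the quoted draft text.
IPV4_LOOPBACK = ipaddress.ip_network("127.0.0.0/8")
IPV6_MAPPED_LOOPBACK = ipaddress.ip_network("::ffff:127.0.0.0/104")


def inner_header_ok(dst_ip: str, ttl: int) -> bool:
    """Return True if the inner header matches the assumed policy:
    TTL/Hop Limit of 1 and a destination in the loopback range."""
    addr = ipaddress.ip_address(dst_ip)
    net = IPV4_LOOPBACK if addr.version == 4 else IPV6_MAPPED_LOOPBACK
    return ttl == 1 and addr in net
```

With these checks, a packet addressed to a routable VXLAN gateway address, or one carrying a larger TTL, would be rejected instead of being treated as a BFD Control packet for the VTEP.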
> 
> [SPK] This text looks good.
> 
> 
>         Section 5.1:
>         For such packets, the BFD session MUST be identified
>             using the following three-tuples of fields of the inner
>         header: the
>             source IP, the destination IP, and the source UDP port
>         number present
>             in the IP header carried by the payload of the packet in VXLAN
>             encapsulation.  If BFD packet is received with non-zero Your
>         Discriminator, then BFD session MUST be demultiplexed only with Your
>             Discriminator as the key.
> 
>         With just the 3-tuple we will not be able to demux the packet. We need
>         to consider the VNI as well if we have multiple BFD sessions between
>         the same pair of VTEPs.
> 
>     GIM>> This is one of the comments from Carlos we need to address. Your
>     comment has helped me to form the question:
> 
>         What is the goal of running multiple BFD sessions between the pair
>         of VTEPs?
> 
> [SPK] The goal of multiple BFD sessions is to check the liveness 
> of the VXLAN tunnel. There has already been a good amount of debate on 
> whether we really need that. As per RFC 5884 we are running BFD per LSP, 
> and we might hit scale issues there too. I think it is up to the operator to 
> decide how they want to use multiple BFD sessions per VXLAN tunnel. It 
> is possible that a BFD session with a special VNI is run at an aggressive 
> interval, whereas multiple BFD sessions for different VNIs MAY run at a 
> sedate interval. For that matter, they could be running in demand mode as 
> well (running a P/F sequence only when there is no data flowing for that 
> VNI). If, as a WG, we think running multiple BFD sessions makes sense, then we 
> might need to add appropriate text.
> 
>     If the goal is to monitor per VNI, then the following text should
>     describe the demultiplexing of the initial BFD Control packet:
>         Demultiplexing of IP BFD packet has been defined in Section 3 of
>         [RFC5881].  Since multiple BFD sessions may be running between two
>         VTEPs, there needs to be a mechanism for demultiplexing received BFD
>         packets to the proper session.  For demultiplexing packets with Your
>         Discriminator equal to 0, a BFD session MUST be identified using the
>         logical link over which the BFD Control packet is received.  In the
>         case of VXLAN, the VNI number identifies that logical link.  If BFD
>         packet is received with non-zero Your Discriminator, then BFD session
>         MUST be demultiplexed only with Your Discriminator as the key.
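
The demultiplexing rule in the proposed text can be sketched in Python as follows. This is only an illustration under the assumption of at most one BFD session per VNI. The two lookup tables, the function name `demux_bfd`, and the session labels are hypothetical and not taken from the draft.

```python
from typing import Dict, Optional


def demux_bfd(
    your_discriminator: int,
    vni: int,
    by_discriminator: Dict[int, str],
    by_vni: Dict[int, str],
) -> Optional[str]:
    """Select the BFD session for a received BFD Control packet."""
    if your_discriminator != 0:
        # Non-zero Your Discriminator: it is the only demultiplexing key.
        return by_discriminator.get(your_discriminator)
    # Your Discriminator == 0 (initial packet): fall back to the VNI,
    # which identifies the logical link the packet arrived on.
    return by_vni.get(vni)


# Hypothetical tables for a VTEP with sessions on VNI 100 and VNI 200.
by_discriminator = {0x1001: "session-vni-100"}
by_vni = {100: "session-vni-100", 200: "session-vni-200"}
```

Note that the VNI lookup can return only one session, so allowing several sessions on the same VNI would require an additional key for packets with Your Discriminator equal to 0.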
> 
> [SPK]  I think this text for multiple BFD sessions between the same pair of 
> VTEPs for multiple VNIs makes sense only if, as a WG, we think that could be 
> a use case.
> 
>     Would there be a need to run multiple BFD sessions with the same VNI
>     number?
> 
> 
> [SPK] IMHO we should not allow multiple BFD sessions for the same VNI.
> 
> 
> 
> 
>         Thanks
>         Santosh P K

_______________________________________________
nvo3 mailing list
nvo3@ietf.org
https://www.ietf.org/mailman/listinfo/nvo3