Re: [nvo3] BFD over VXLAN: Trapping BFD Control packet at VTEP

Dinesh Dutt <didutt@gmail.com> Tue, 22 October 2019 20:18 UTC

Date: Wed, 23 Oct 2019 01:17:50 +0500
From: Dinesh Dutt <didutt@gmail.com>
Subject: Re: [nvo3] BFD over VXLAN: Trapping BFD Control packet at VTEP
To: Joel Halpern Direct <jmh.direct@joelhalpern.com>
Cc: Greg Mirsky <gregimirsky@gmail.com>, Anoop Ghanwani <anoop@alumni.duke.edu>, Jeffrey Haas <jhaas@pfrc.org>, Santosh P K <santosh.pallagatti@gmail.com>, NVO3 <nvo3@ietf.org>, draft-ietf-bfd-vxlan@ietf.org, rtg-bfd WG <rtg-bfd@ietf.org>, "T. Sridhar" <tsridhar@vmware.com>, xiao.min2@zte.com.cn
Message-Id: <1571775471.10436.0@smtp.gmail.com>
In-Reply-To: <ba234410-ba08-d9a5-0399-edd3901a60a6@joelhalpern.com>
References: <CACi9rdu8PKsLW_Pq4ww5DEwLL8Bs6Hq1Je_jmAjES4LKBuE8MQ@mail.gmail.com> <201909251039413767352@zte.com.cn> <CACi9rdv-760M8WgZ1mOOOa=yoJqQFP=vdc3xJKLe7wCR18NSvA@mail.gmail.com> <20191021210752.GA8916@pfrc.org> <0e99a541-b2ca-85d4-4a8f-1165cf7ac01e@joelhalpern.com> <CA+-tSzziDc+Tk8AYfOr5-Xn6oO_uqW2C1dRA9LLOBBVmzVhWEQ@mail.gmail.com> <CA+RyBmVcBgeoGc2z5Gv0grv8OY34tyw+T-T-W2vn1O3AxCSQ9Q@mail.gmail.com> <0b45df12-a7c5-3b5c-db59-5a57c8dfd1b7@joelhalpern.com> <CA+RyBmV9Ynk6fZy6qkvkOz3Pm2AmK7ESy8KoEpqyxP1nvNka0w@mail.gmail.com> <14ec7c38-5a5b-83dd-b4f4-71a29494ebdc@joelhalpern.com> <CA+RyBmWDRrjTR3OAnYsush8+4ORnGdKUqp46bg5MXaPa3zCgZA@mail.gmail.com> <ba234410-ba08-d9a5-0399-edd3901a60a6@joelhalpern.com>
X-Mailer: geary/0.12.4
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="=-xJzZpfWt/2dbFpwpkoOO"
Archived-At: <https://mailarchive.ietf.org/arch/msg/rtg-bfd/vJjv6nO_GLziT-hDDrkjo1Wh9RI>

Joel,

I'm a tad frustrated that we're rehashing this discussion all over 
again. I specifically answered all the questions that were raised at 
that time. Let me try one last time.

1. BFD for VTEP is only useful for testing VXLAN plumbing, not the 
underlay itself.
2. So, the question is what do we use for VNI and the inner header?
3. The inner header is an IP packet because it is BFD. The IP address 
and the corresponding MAC address used MUST be one that is owned by the 
VTEP in the VNI that is used in the packet

This is sufficient to come up with an implementation that only ever 
tests one VNI or multiple VNIs between the same pair of VTEPs. It is 
up to the users to decide what VNI, inner MAC and IP to use. The only 
restriction is that the VTEP must own those addresses to (i) prevent 
the packet from leaking to tenants and (ii) allow the tenants 
themselves to be running BFD.
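
To make point 3 concrete, here is a minimal sketch of the ownership check, assuming a hypothetical per-VNI table of addresses that the local VTEP owns. The names `VtepAddress`, `owned` and `is_valid_bfd_inner` are illustrative only and do not come from the draft; the addresses are documentation values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VtepAddress:
    """An inner MAC/IP pair that the VTEP itself owns in a given VNI."""
    mac: str
    ip: str


# Hypothetical table: VNI -> addresses owned by the local VTEP in that VNI.
owned = {
    1: {VtepAddress("00:00:5e:00:53:01", "192.0.2.1")},
    100: {VtepAddress("00:00:5e:00:53:64", "198.51.100.1")},
}


def is_valid_bfd_inner(vni: int, inner_mac: str, inner_ip: str) -> bool:
    """Return True only if the inner MAC and IP are owned by the VTEP in this VNI."""
    return VtepAddress(inner_mac.lower(), inner_ip) in owned.get(vni, set())
```

A BFD packet whose inner addresses fail this check would be a candidate for forwarding to a tenant rather than being trapped at the VTEP, which is exactly what the restriction is meant to avoid.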

If implementations want to use VNI 1 as the recommended default VNI, 
that is fine. But if implementations want to pick more than one VNI 
because they have a need to do so (I've seen operators do this because 
of their specific use case), then they can, as long as they satisfy 
point 3. The draft is done. Why does there need to be any more 
discussion? The draft does need to spell out that using more than one 
VNI has scaling issues that the user needs to be aware of, and it does.
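
As a rough illustration of the scaling concern, the sketch below counts BFD sessions under the assumption of a full mesh of VTEPs with the same number of monitored VNIs on every pair. Both the full-mesh assumption and the function names are mine, not the draft's.

```python
def sessions_per_vtep(num_vteps: int, vnis_per_pair: int = 1) -> int:
    """BFD sessions one VTEP terminates when it peers with every other VTEP."""
    if num_vteps < 2 or vnis_per_pair < 1:
        return 0
    return (num_vteps - 1) * vnis_per_pair


def sessions_in_fabric(num_vteps: int, vnis_per_pair: int = 1) -> int:
    """Total sessions across the mesh; each session is shared by two VTEPs."""
    return num_vteps * sessions_per_vtep(num_vteps, vnis_per_pair) // 2
```

With 100 VTEPs, a single VNI per pair gives 99 sessions per VTEP, while monitoring 50 VNIs per pair gives 4950 sessions per VTEP. The count grows linearly with the number of VNIs chosen.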

Dinesh

On Wed, Oct 23, 2019 at 1:28 AM, Joel Halpern Direct 
<jmh.direct@joelhalpern.com> wrote:
> That is input to the calculation at the VTEP.  It is NOT information 
> used by the network between the VTEPs.
> 
> As such, the VTEPs can emulate that by adjusting the source ports 
> that they use for the BFD packets.  The network does not need the VNI 
> to actually be varied to achieve this purpose.
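
A minimal sketch of that idea, assuming CRC32 as the hash (RFC 7348 does not mandate a particular function) and the 49152-65535 dynamic port range that RFC 7348 recommends. The function names and the salting scheme are hypothetical.

```python
import zlib

# Dynamic/private port range recommended by RFC 7348 for the outer UDP source port.
PORT_MIN, PORT_MAX = 49152, 65535


def vxlan_source_port(inner_headers: bytes) -> int:
    """Map a hash of the inner frame headers into the recommended port range."""
    span = PORT_MAX - PORT_MIN + 1
    return PORT_MIN + zlib.crc32(inner_headers) % span


def bfd_source_ports(inner_headers: bytes, paths: int) -> list[int]:
    """Derive one source port per probe by salting the hash input.

    The VNI stays fixed; only the outer UDP source port changes, which is
    what the underlay sees for ECMP. Distinct salts may still collide on
    the same port, so this does not guarantee distinct paths.
    """
    return [
        vxlan_source_port(inner_headers + i.to_bytes(2, "big"))
        for i in range(paths)
    ]
```

For example, `bfd_source_ports(b"inner-ethernet-header", 4)` yields four ports in the recommended range that a VTEP could use for BFD packets on a single VNI.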
> 
> Yours,
> Joel
> 
> On 10/22/2019 3:55 PM, Greg Mirsky wrote:
>> Hi Joel,
>> RFC 7348 suggests using information from the inner packet to
>> calculate the value to be used in the Source UDP port number:
>>        -  Source Port:  It is recommended that the UDP source port number
>>           be calculated using a hash of fields from the inner packet --
>>           one example being a hash of the inner Ethernet frame's headers.
>>           This is to enable a level of entropy for the ECMP/load-
>>           balancing of the VM-to-VM traffic across the VXLAN overlay.
>> From that text, I assume that the VNI may be used as input to the hashing
>> function. If BFD over VXLAN doesn't support a per-VNI BFD session,
>> then it cannot monitor multiple paths in the underlay used to balance
>> VM-to-VM traffic between the same pair of VTEPs. In my opinion,
>> this is perfectly fine if that is the WG's agreement. I'm glad we are
>> discussing this and will have a conclusion.
>> 
>> Regards,
>> Greg
>> 
>> On Tue, Oct 22, 2019 at 3:30 PM Joel M. Halpern <jmh@joelhalpern.com> wrote:
>> 
>>     As I recall, the VNI is not in the same place nor the same size as the
>>     TCP / UDP ports.  So it seems very unlikely that it would be used in
>>     ECMP.  In fact, avoiding that is why VXLAN does interesting things with
>>     the source UDP port.  Which the BFD can do.  And presumably MUST do if
>>     it was path matching.
>> 
>>     Yours,
>>     Joel
>> 
>>     On 10/22/2019 3:16 PM, Greg Mirsky wrote:
>>      > Hi Joel,
>>      > if the underlay may balance VXLAN between two VTEPs using VNI in
>>      > addition to other fields, then Option 2 has a certain value in my
>>      > opinion.
>>      >
>>      > Regards,
>>      > Greg
>>      >
>>      > On Tue, Oct 22, 2019 at 3:06 PM Joel M. Halpern <jmh@joelhalpern.com> wrote:
>>      >
>>      >     I do not understand the value of option 2.
>>      >     Which is why I asked in my initial review to move to option 1.
>>      >
>>      >     And option 2 requires stealing MAC addresses from the users, which
>>      >     seems to me to be a very bad thing that option 1 avoids.
>>      >
>>      >     Yours,
>>      >     Joel
>>      >
>>      >     On 10/22/2019 2:17 PM, Greg Mirsky wrote:
>>      >      > Hi Anoop, et al.,
>>      >      > I agree with your understanding of what is being defined in the
>>      >      > current version of the BFD over VxLAN specification. But, as I
>>      >      > understand, the WG is discussing the scope before the WGLC is
>>      >      > closed. I believe there are three options:
>>      >      >
>>      >      >  1. single BFD session between two VTEPs
>>      >      >  2. single BFD session per VNI between two VTEPs
>>      >      >  3. multiple BFD sessions per VNI between two VTEPs
>>      >      >
>>      >      > The current text reflects #2. Does the WG accept this scope? If
>>      >      > not, which option would the WG accept?
>>      >      >
>>      >      > Regards,
>>      >      > Greg
>>      >      >
>>      >      > On Tue, Oct 22, 2019 at 2:09 PM Anoop Ghanwani <anoop@alumni.duke.edu> wrote:
>>      >      >
>>      >      >     I concur with Joel's assessment with the following
>>      >     clarifications.
>>      >      >
>>      >      >     The current document is already capable of monitoring multiple
>>      >      >     VNIs between VTEPs.
>>      >      >
>>      >      >     The issue under discussion was how do we use BFD to monitor
>>      >      >     multiple VAPs that use the same VNI between a pair of VTEPs.
>>      >      >     The use case for this is not clear to me, as from my
>>      >      >     understanding, we cannot have a situation with multiple VAPs
>>      >      >     using the same VNI--there is 1:1 mapping between VAP and VNI.
>>      >      >
>>      >      >     Anoop
>>      >      >
>>      >      >     On Tue, Oct 22, 2019 at 6:06 AM Joel M. Halpern <jmh@joelhalpern.com> wrote:
>>      >      >
>>      >      >         From what I can tell, there are two separate problems.
>>      >      >         The document we have is a VTEP-VTEP monitoring document.
>>      >      >         There is no need for that document to handle the multiple
>>      >      >         VNI case.
>>      >      >         If folks want a protocol for doing BFD monitoring of things
>>      >      >         behind the VTEPs (multiple VNIs), then do that as a separate
>>      >      >         document.  The encoding will be a tenant encoding, and thus
>>      >      >         separate from what is defined in this document.
>>      >      >
>>      >      >         Yours,
>>      >      >         Joel
>>      >      >
>>      >      >         On 10/21/2019 5:07 PM, Jeffrey Haas wrote:
>>      >      >          > Santosh and others,
>>      >      >          >
>>      >      >          > On Thu, Oct 03, 2019 at 07:50:20PM +0530, Santosh P K wrote:
>>      >      >          >>     Thanks for your explanation. This helps a lot. I would
>>      >      >          >> wait for more comments from others to see if this is what we
>>      >      >          >> need in this draft to be supported; based on that we can
>>      >      >          >> provide appropriate sections in the draft.
>>      >      >          >
>>      >      >          > The threads on the list have spidered to the point where it
>>      >      >          > is challenging to follow what the current status of the draft
>>      >      >          > is, or should be.  :-)
>>      >      >          >
>>      >      >          > However, if I've followed things properly, the question below
>>      >      >          > is really the hinge point on what our encapsulation for BFD
>>      >      >          > over vxlan should look like.
>>      >      >          > Correct?
>>      >      >          >
>>      >      >          > Essentially, do we or do we not require the ability to permit
>>      >      >          > multiple BFD sessions between distinct VAPs?
>>      >      >          >
>>      >      >          > If this is so, do we have a sense as to how we should proceed?
>>      >      >          >
>>      >      >          > -- Jeff
>>      >      >          >
>>      >      >          > [context preserved below...]
>>      >      >          >
>>      >      >          >> Santosh P K
>>      >      >          >>
>>      >      >          >> On Wed, Sep 25, 2019 at 8:10 AM <xiao.min2@zte.com.cn> wrote:
>>      >      >          >>
>>      >      >          >>> Hi Santosh,
>>      >      >          >>>
>>      >      >          >>>
>>      >      >          >>> With regard to the question whether we should allow multiple
>>      >      >          >>> BFD sessions for the same VNI or not, IMHO we should allow it;
>>      >      >          >>> more explanation follows.
>>      >      >          >>>
>>      >      >          >>> Below is a figure derived from figure 2 of RFC8014 (An
>>      >      >          >>> Architecture for Data-Center Network Virtualization over
>>      >      >          >>> Layer 3 (NVO3)).
>>      >      >          >>>
>>      >      >          >>>                      |         Data Center Network (IP)        |
>>      >      >          >>>                      |                                         |
>>      >      >          >>>                      +-----------------------------------------+
>>      >      >          >>>                           |                           |
>>      >      >          >>>                           |       Tunnel Overlay      |
>>      >      >          >>>              +------------+---------+       +---------+------------+
>>      >      >          >>>              | +----------+-------+ |       | +-------+----------+ |
>>      >      >          >>>              | |  Overlay Module  | |       | |  Overlay Module  | |
>>      >      >          >>>              | +---------+--------+ |       | +---------+--------+ |
>>      >      >          >>>              |           |          |       |           |          |
>>      >      >          >>>       NVE1   |           |          |       |           |          | NVE2
>>      >      >          >>>              |  +--------+-------+  |       |  +--------+-------+  |
>>      >      >          >>>              |  |VNI1 VNI2  VNI1 |  |       |  |VNI1 VNI2  VNI1 |  |
>>      >      >          >>>              |  +-+-----+----+---+  |       |  +-+-----+-----+--+  |
>>      >      >          >>>              |VAP1| VAP2|    | VAP3 |       |VAP1| VAP2|     | VAP3|
>>      >      >          >>>              +----+-----+----+------+       +----+-----+-----+-----+
>>      >      >          >>>                   |     |    |                   |     |     |
>>      >      >          >>>                   |     |    |                   |     |     |
>>      >      >          >>>                   |     |    |                   |     |     |
>>      >      >          >>>            -------+-----+----+-------------------+-----+-----+-------
>>      >      >          >>>                   |     |    |     Tenant        |     |     |
>>      >      >          >>>              TSI1 | TSI2|    | TSI3          TSI1| TSI2|     |TSI3
>>      >      >          >>>                 +---+ +---+ +---+              +---+ +---+ +---+
>>      >      >          >>>                 |TS1| |TS2| |TS3|              |TS4| |TS5| |TS6|
>>      >      >          >>>                 +---+ +---+ +---+              +---+ +---+ +---+
>>      >      >          >>>
>>      >      >          >>> To my understanding, the BFD sessions between NVE1 and NVE2
>>      >      >          >>> are actually initiated and terminated at VAP of NVE.
>>      >      >          >>>
>>      >      >          >>> If the network operator wants to set up one BFD session
>>      >      >          >>> between VAP1 of NVE1 and VAP1 of NVE2, and at the same time
>>      >      >          >>> another BFD session between VAP3 of NVE1 and VAP3 of NVE2,
>>      >      >          >>> although the two BFD sessions are for the same VNI1, I
>>      >      >          >>> believe it's reasonable, so that's why I think we should
>>      >      >          >>> allow it
>>      >      >
>>      >      >         _______________________________________________
>>      >      >         nvo3 mailing list
>>      >      > nvo3@ietf.org
>>      >      > https://www.ietf.org/mailman/listinfo/nvo3
>>      >      >
>>      >
>>