Re: [nvo3] BFD over VXLAN: Trapping BFD Control packet at VTEP

Greg Mirsky <gregimirsky@gmail.com> Tue, 22 October 2019 19:55 UTC

From: Greg Mirsky <gregimirsky@gmail.com>
Date: Tue, 22 Oct 2019 15:55:28 -0400
Message-ID: <CA+RyBmWDRrjTR3OAnYsush8+4ORnGdKUqp46bg5MXaPa3zCgZA@mail.gmail.com>
Subject: Re: [nvo3] BFD over VXLAN: Trapping BFD Control packet at VTEP
To: "Joel M. Halpern" <jmh@joelhalpern.com>
Cc: Anoop Ghanwani <anoop@alumni.duke.edu>, Jeffrey Haas <jhaas@pfrc.org>, Santosh P K <santosh.pallagatti@gmail.com>, NVO3 <nvo3@ietf.org>, draft-ietf-bfd-vxlan@ietf.org, Dinesh Dutt <didutt@gmail.com>, rtg-bfd WG <rtg-bfd@ietf.org>, "T. Sridhar" <tsridhar@vmware.com>, xiao.min2@zte.com.cn
Content-Type: multipart/alternative; boundary="0000000000003d5cd60595852f98"
Archived-At: <https://mailarchive.ietf.org/arch/msg/rtg-bfd/ae9IPwaqKukHIdG_TrZ01IFy1nU>

Hi Joel,
RFC 7348 recommends calculating the UDP source port number from fields of
the inner packet:
      -  Source Port:  It is recommended that the UDP source port number
         be calculated using a hash of fields from the inner packet --
         one example being a hash of the inner Ethernet frame's headers.
         This is to enable a level of entropy for the ECMP/load-
         balancing of the VM-to-VM traffic across the VXLAN overlay.
From that text, I assume that the VNI may be used as an input to the hashing
function. If BFD over VXLAN does not support a per-VNI BFD session, then it
cannot monitor the multiple underlay paths used to balance VM-to-VM traffic
between the same pair of VTEPs. In my opinion, this is perfectly fine if
that is the WG's agreement. I'm glad we are discussing this and will reach a
conclusion.
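
To make the RFC 7348 recommendation quoted above concrete, here is a minimal
sketch of deriving the outer UDP source port from a hash of the inner Ethernet
header. The function name, the choice of CRC-32, and the optional vni argument
are assumptions made for illustration only; RFC 7348 does not mandate a
particular hash or input set. The 49152-65535 range is the dynamic/private
port range that RFC 7348 recommends for the source port.

```python
import struct
import zlib

# Dynamic/private port range recommended by RFC 7348 for the UDP source port.
PORT_MIN, PORT_MAX = 49152, 65535


def vxlan_source_port(inner_dst_mac, inner_src_mac, ethertype, vni=None):
    """Hypothetical helper: map inner Ethernet header fields to a source port.

    CRC-32 stands in for whatever hash a real VTEP uses (an assumption).
    Passing a VNI mixes it into the hash input, which is the possibility
    raised in this message; omitting it hashes the inner header only.
    """
    key = inner_dst_mac + inner_src_mac + struct.pack("!H", ethertype)
    if vni is not None:
        key += vni.to_bytes(3, "big")  # the VNI is a 24-bit field
    return PORT_MIN + zlib.crc32(key) % (PORT_MAX - PORT_MIN + 1)


dst = bytes.fromhex("02005e000001")
src = bytes.fromhex("02005e000002")
port_without_vni = vxlan_source_port(dst, src, 0x0800)
port_with_vni = vxlan_source_port(dst, src, 0x0800, vni=100)
```

Under these assumptions, a BFD session that is meant to follow the same
underlay path as a given flow would have to use the same source port as that
flow, whichever inputs the hash takes.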

Regards,
Greg

On Tue, Oct 22, 2019 at 3:30 PM Joel M. Halpern <jmh@joelhalpern.com> wrote:

> As I recall, the VNI is not in the same place nor the same size as the
> TCP / UDP ports.  So it seems very unlikely that it would be used in
> ECMP.  In fact, avoiding that is why VXLAN does interesting things with
> the source UDP port.  Which the BFD can do.  And presumably MUST do if
> it was path matching.
>
> Yours,
> Joel
>
> On 10/22/2019 3:16 PM, Greg Mirsky wrote:
> > Hi Joel,
> > if the underlay may balance VXLAN traffic between two VTEPs using the VNI
> > in addition to other fields, then Option 2 has a certain value in my
> > opinion.
> >
> > Regards,
> > Greg
> >
> > On Tue, Oct 22, 2019 at 3:06 PM Joel M. Halpern <jmh@joelhalpern.com> wrote:
> >
> >     I do not understand the value of option 2.
> >     Which is why I asked in my initial review to move to option 1.
> >
> >     And option 2 requires stealing MAC addresses from the users, which
> >     seems to me to be a very bad thing that option 1 avoids.
> >
> >     Yours,
> >     Joel
> >
> >     On 10/22/2019 2:17 PM, Greg Mirsky wrote:
> >      > Hi Anoop, et al.,
> >      > I agree with your understanding of what is being defined in the
> >      > current version of the BFD over VXLAN specification. But, as I
> >      > understand it, the WG is discussing the scope before the WGLC is
> >      > closed. I believe there are three options:
> >      >
> >      >  1. single BFD session between two VTEPs
> >      >  2. single BFD session per VNI between two VTEPs
> >      >  3. multiple BFD sessions per VNI between two VTEPs
> >      >
> >      > The current text reflects #2. Does the WG accept this scope? If
> >      > not, which option would the WG accept?
> >      >
> >      > Regards,
> >      > Greg
> >      >
> >      > On Tue, Oct 22, 2019 at 2:09 PM Anoop Ghanwani
> >      > <anoop@alumni.duke.edu> wrote:
> >      >
> >      >     I concur with Joel's assessment, with the following
> >      >     clarifications.
> >      >
> >      >     The current document is already capable of monitoring multiple
> >      >     VNIs between VTEPs.
> >      >
> >      >     The issue under discussion was how to use BFD to monitor
> >      >     multiple VAPs that use the same VNI between a pair of VTEPs.
> >      >     The use case for this is not clear to me because, as I
> >      >     understand it, we cannot have multiple VAPs using the same VNI:
> >      >     there is a 1:1 mapping between VAP and VNI.
> >      >
> >      >     Anoop
> >      >
> >      >     On Tue, Oct 22, 2019 at 6:06 AM Joel M. Halpern
> >      >     <jmh@joelhalpern.com> wrote:
> >      >
> >      >         From what I can tell, there are two separate problems.
> >      >         The document we have is a VTEP-VTEP monitoring document.
> >      >         There is no need for that document to handle the multiple
> >      >         VNI case. If folks want a protocol for doing BFD monitoring
> >      >         of things behind the VTEPs (multiple VNIs), then do that as
> >      >         a separate document. The encoding will be a tenant
> >      >         encoding, and thus separate from what is defined in this
> >      >         document.
> >      >
> >      >         Yours,
> >      >         Joel
> >      >
> >      >         On 10/21/2019 5:07 PM, Jeffrey Haas wrote:
> >      >          > Santosh and others,
> >      >          >
> >      >          > On Thu, Oct 03, 2019 at 07:50:20PM +0530, Santosh P K wrote:
> >      >          >>     Thanks for your explanation. This helps a lot. I would
> >      >          >> wait for more comments from others to see if this is what
> >      >          >> we need this draft to support; based on that, we can
> >      >          >> provide appropriate sections in the draft.
> >      >          >
> >      >          > The threads on the list have spidered to the point where it
> >      >          > is challenging to follow what the current status of the
> >      >          > draft is, or should be.  :-)
> >      >          >
> >      >          > However, if I've followed things properly, the question
> >      >          > below is really the hinge point on what our encapsulation
> >      >          > for BFD over vxlan should look like. Correct?
> >      >          >
> >      >          > Essentially, do we or do we not require the ability to
> >      >          > permit multiple BFD sessions between distinct VAPs?
> >      >          >
> >      >          > If this is so, do we have a sense as to how we should proceed?
> >      >          >
> >      >          > -- Jeff
> >      >          >
> >      >          > [context preserved below...]
> >      >          >
> >      >          >> Santosh P K
> >      >          >>
> >      >          >> On Wed, Sep 25, 2019 at 8:10 AM <xiao.min2@zte.com.cn> wrote:
> >      >          >>
> >      >          >>> Hi Santosh,
> >      >          >>>
> >      >          >>>
> >      >          >>> With regard to the question of whether we should allow
> >      >          >>> multiple BFD sessions for the same VNI, IMHO we should
> >      >          >>> allow it; more explanation follows.
> >      >          >>>
> >      >          >>> Below is a figure derived from Figure 2 of RFC 8014 (An
> >      >          >>> Architecture for Data-Center Network Virtualization over
> >      >          >>> Layer 3 (NVO3)).
> >      >          >>>
> >      >          >>>                      |         Data Center Network (IP)        |
> >      >          >>>                      |                                         |
> >      >          >>>                      +-----------------------------------------+
> >      >          >>>                           |                           |
> >      >          >>>                           |       Tunnel Overlay      |
> >      >          >>>              +------------+---------+       +---------+------------+
> >      >          >>>              | +----------+-------+ |       | +-------+----------+ |
> >      >          >>>              | |  Overlay Module  | |       | |  Overlay Module  | |
> >      >          >>>              | +---------+--------+ |       | +---------+--------+ |
> >      >          >>>              |           |          |       |           |          |
> >      >          >>>       NVE1   |           |          |       |           |          | NVE2
> >      >          >>>              |  +--------+-------+  |       |  +--------+-------+  |
> >      >          >>>              |  |VNI1 VNI2  VNI1 |  |       |  | VNI1 VNI2 VNI1 |  |
> >      >          >>>              |  +-+-----+----+---+  |       |  +-+-----+-----+--+  |
> >      >          >>>              |VAP1| VAP2|    | VAP3 |       |VAP1| VAP2|     | VAP3|
> >      >          >>>              +----+-----+----+------+       +----+-----+-----+-----+
> >      >          >>>                   |     |    |                   |     |     |
> >      >          >>>                   |     |    |                   |     |     |
> >      >          >>>                   |     |    |                   |     |     |
> >      >          >>>            -------+-----+----+-------------------+-----+-----+-------
> >      >          >>>                   |     |    |     Tenant        |     |     |
> >      >          >>>              TSI1 | TSI2|    | TSI3          TSI1| TSI2|     |TSI3
> >      >          >>>                  +---+ +---+ +---+             +---+ +---+   +---+
> >      >          >>>                  |TS1| |TS2| |TS3|             |TS4| |TS5|   |TS6|
> >      >          >>>                  +---+ +---+ +---+             +---+ +---+   +---+
> >      >          >>>
> >      >          >>> To my understanding, the BFD sessions between NVE1 and NVE2
> >      >          >>> are actually initiated and terminated at a VAP of the NVE.
> >      >          >>>
> >      >          >>> If the network operator wants to set up one BFD session
> >      >          >>> between VAP1 of NVE1 and VAP1 of NVE2 and, at the same
> >      >          >>> time, another BFD session between VAP3 of NVE1 and VAP3 of
> >      >          >>> NVE2, I believe that is reasonable even though the two BFD
> >      >          >>> sessions are for the same VNI1, so that's why I think we
> >      >          >>> should allow it