Re: WGLC comments on draft-ietf-bfd-vxlan

Anoop Ghanwani <anoop@alumni.duke.edu> Thu, 15 November 2018 07:00 UTC

References: <CA+-tSzxFxtVo6NbfSw4wzb--fSuN4zsSvX7R58iiYFgVF5cA6Q@mail.gmail.com> <CA+RyBmVXeCYAZhWTy-g6U_EJ7NOFQwV4twJaJ-7_LT5_wKFGFw@mail.gmail.com> <CA+-tSzxQp2x0hpAF253b9yKL1aD1J1CaGHs7T6VE8zuvg25R_Q@mail.gmail.com> <CA+RyBmXoOKS-Nq7bDfsgDZXou5-FcprEQeVkhWhAD4_1MoHqUQ@mail.gmail.com>
In-Reply-To: <CA+RyBmXoOKS-Nq7bDfsgDZXou5-FcprEQeVkhWhAD4_1MoHqUQ@mail.gmail.com>
From: Anoop Ghanwani <anoop@alumni.duke.edu>
Date: Wed, 14 Nov 2018 23:00:37 -0800
Message-ID: <CA+-tSzzgKyfXzE+=eVLz7B3u1X_HFahQ6GCFTbL+-rfjsR03uA@mail.gmail.com>
Subject: Re: WGLC comments on draft-ietf-bfd-vxlan
To: Greg Mirsky <gregimirsky@gmail.com>
Cc: rtg-bfd@ietf.org, nvo3@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/rtg-bfd/SMjL9RxdYZHZzrTcTjAF0y6B8Ks>
List-Id: "RTG Area: Bidirectional Forwarding Detection DT" <rtg-bfd.ietf.org>

Hi Greg,

Please see inline prefixed with [ag2].

Thanks,
Anoop

On Wed, Nov 14, 2018 at 9:45 AM Greg Mirsky <gregimirsky@gmail.com> wrote:

> Hi Anoop,
> thank you for the prompt response. I am glad that some of my responses
> have addressed your concerns. Please find follow-up notes in-line tagged
> GIM2>>. I've attached the diff to highlight the updates applied in the
> working version. Let me know if these changes are acceptable.
>
> Regards,
> Greg
>
> On Tue, Nov 13, 2018 at 12:30 PM Anoop Ghanwani <anoop@alumni.duke.edu>
> wrote:
>
>> Hi Greg,
>>
>> Please see inline prefixed with [ag].
>>
>> Thanks,
>> Anoop
>>
>> On Tue, Nov 13, 2018 at 11:34 AM Greg Mirsky <gregimirsky@gmail.com>
>> wrote:
>>
>>> Hi Anoop,
>>> many thanks for the thorough review and detailed comments. Please find
>>> my answers, this time for real, in-line tagged GIM>>.
>>>
>>> Regards,
>>> Greg
>>>
>>> On Thu, Nov 8, 2018 at 1:58 AM Anoop Ghanwani <anoop@alumni.duke.edu>
>>> wrote:
>>>
>>>>
>>>> Here are my comments.
>>>>
>>>> Thanks,
>>>> Anoop
>>>>
>>>> ==
>>>>
>>>> Philosophical
>>>>
>>>> Since VXLAN is not an IETF standard, should we be defining a standard
>>>> for running BFD on it?  Should we define BFD over Geneve instead which is
>>>> the official WG selection?  Is that going to be a separate document?
>>>> GIM>> IS-IS is not on the Standards Track either, but that did not
>>>> prevent the IETF from developing tens of Standards Track RFCs using
>>>> RFC 1142 as the normative reference until RFC 7142 reclassified it as
>>>> Historic. A similar path was followed with IS-IS-TE, where RFC 3784 was
>>>> published and remained in use until RFC 5305 obsoleted it four years
>>>> later. I understand that a Down Reference, i.e., using an Informational
>>>> RFC as a normative reference, is not an unusual situation.
>>>>
>>>
>> [ag] OK.  I'm not an expert on this part so unless someone else that is
>> an expert (chairs, AD?) can comment on it, I'll just let it go.
>>
>>
>>>
>>>
>>>>
>>>> Technical
>>>>
>>>> Section 1:
>>>>
>>>> This part needs to be rewritten:
>>>> >>>
>>>> The individual racks may be part of a different Layer 3 network, or
>>>> they could be in a single Layer 2 network. The VXLAN segments/overlays are
>>>> overlaid on top of Layer 3 network. A VM can communicate with another VM
>>>> only if they are on the same VXLAN segment.
>>>> >>>
>>>> It's hard to parse and, given IRB,
>>>>
>>> GIM>> Would the following text be acceptable:
>>> OLD TEXT:
>>>    VXLAN is typically deployed in data centers interconnecting
>>>    virtualized hosts, which may be spread across multiple racks.  The
>>>    individual racks may be part of a different Layer 3 network, or they
>>>    could be in a single Layer 2 network.  The VXLAN segments/overlays
>>>    are overlaid on top of Layer 3 network.
>>> NEW TEXT:
>>> VXLAN is typically deployed in data centers interconnecting virtualized
>>> hosts of a tenant. VXLAN addresses the requirements of the Layer 2 and
>>> Layer 3 data center network infrastructure in the presence of VMs in
>>> a multi-tenant environment, discussed in Section 3 of [RFC7348], by
>>> providing a Layer 2 overlay scheme on a Layer 3 network.
>>>
>>
>> [ag] This is a lot better.
>>
>>
>>>
>>>  A VM can communicate with another VM only if they are on the same
>>> VXLAN segment.
>>>>
>>>> the last sentence above is wrong.
>>>>
>>> GIM>> Section 4 in RFC 7348 states:
>>> Only VMs within the same VXLAN segment can communicate with each other.
>>>
>>
>> [ag] VMs on different segments can communicate using routing/IRB, so even
>> RFC 7348 is wrong.  Perhaps the text should be modified to say -- "In the
>> absence of a router in the overlay, a VM can communicate...".
>>
>>
>>>
>>> Section 3:
>>>> >>>
>>>>  Most deployments will have VMs with only L2 capabilities that
>>>> may not support L3.
>>>> >>>
>>>> Are you suggesting most deployments have VMs with no IP
>>>> addresses/configuration?
>>>>
>>> GIM>> Would re-word as follows:
>>> OLD TEXT:
>>>  Most deployments will have VMs with only L2 capabilities that
>>>  may not support L3.
>>> NEW TEXT:
>>> Deployments may have VMs with only L2 capabilities that do not support
>>> L3.
>>>
>>
>> [ag] I still don't understand this.  What does it mean for a VM to not
>> support L3?  No IP address, no default GW, something else?
>>
> GIM2>> A VM communicates with its VTEP which, in turn, originates the VXLAN
> tunnel. The VM is not required to have an IP address, as it is the VTEP's
> IP address that the VM's MAC is associated with. As for the gateway,
> RFC 7348 discusses a VXLAN gateway as the device that forwards traffic
> between VXLAN and non-VXLAN domains. Considering all that, would the
> following change be acceptable:
> OLD TEXT:
>  Most deployments will have VMs with only L2 capabilities that
>  may not support L3.
> NEW TEXT:
>  Most deployments will have VMs with only L2 capabilities and no IP
> address assigned.
>

[ag2] Do you have a reference for this (i.e. that most deployments have VMs
without an IP address)?  Normally I would think VMs would have an IP
address.  It's just that they are segregated into segments and, without an
intervening router, they are restricted to communicate only within their
subnet.

>
>>
>>>
>>>> >>>
>>>> Having a hierarchical OAM model helps localize faults though it
>>>> requires additional consideration.
>>>> >>>
>>>> What are the additional considerations?
>>>>
>>> GIM>> For example, coordination of BFD intervals across the OAM layers.
>>>
>>>
>>
>> [ag] Can we mention them in the draft?
>>
>>
>>>
>>>> Would be useful to add a reference to RFC 8293 in case the reader would
>>>> like to know more about service nodes.
>>>>
>>> GIM>> I have to admit that I don't see how RFC 8293, "A Framework for
>>> Multicast in Network Virtualization over Layer 3", is related to this
>>> document. Please help with an additional reference to the relevant text
>>> of the document.
>>>
>>
>> [ag] The RFC discusses the use of service nodes which is mentioned here.
>>
>>
>>>
>>>> Section 4
>>>> >>>
>>>> Separate BFD sessions can be established between the VTEPs (IP1 and
>>>> IP2) for monitoring each of the VXLAN tunnels (VNI 100 and 200).
>>>> >>>
>>>> IMO, the document should mention that this could lead to scaling issues
>>>> given that VTEPs can support well in excess of 4K VNIs.  Additionally, we
>>>> should mention that with IRB, a given VNI may not even exist on the
>>>> destination VTEP.  Finally, what is the benefit of doing this?  There may
>>>> be certain corner cases where it's useful (vs a single BFD session between
>>>> the VTEPs for all VNIs) but it would be good to explain what those are.
>>>>
>>> GIM>> Will add text in the Security Considerations section that VTEPs
>>> should have a limit on the number of BFD sessions.
>>>
>>
>> [ag] I was hoping for two things:
>> - A mention about the scalability issue right where per-VNI BFD is
>> discussed.  (Not sure why that is a security issue/consideration.)
>>
> GIM2>> I've added the following sentence in both places:
> The implementation SHOULD have a reasonable upper bound on the number of
> BFD sessions that can be created between the same pair of VTEPs.
>

[ag2] What are the criteria for determining what is reasonable?
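
To make the question concrete, here is a minimal sketch of what such an upper bound could look like in an implementation. Nothing here comes from the draft: the class name, the method names, and the default of 8 sessions per VTEP pair are all hypothetical, and choosing that number is exactly the open question.

```python
from collections import defaultdict


class SessionLimitExceeded(Exception):
    """Raised when a VTEP pair already holds the maximum number of sessions."""


class BfdSessionTable:
    """Hypothetical per-VNI BFD session table with a cap per VTEP pair."""

    def __init__(self, max_per_vtep_pair=8):
        # The default of 8 is an arbitrary illustration, not a recommendation.
        self.max_per_vtep_pair = max_per_vtep_pair
        # (local VTEP IP, remote VTEP IP) -> set of VNIs that have a session
        self._sessions = defaultdict(set)

    def create(self, local_vtep, remote_vtep, vni):
        """Create a session for this VNI; return False if it already exists."""
        vnis = self._sessions[(local_vtep, remote_vtep)]
        if vni in vnis:
            return False
        if len(vnis) >= self.max_per_vtep_pair:
            raise SessionLimitExceeded(
                f"{local_vtep} -> {remote_vtep}: limit of "
                f"{self.max_per_vtep_pair} BFD sessions reached"
            )
        vnis.add(vni)
        return True

    def count(self, local_vtep, remote_vtep):
        """Number of sessions currently held between this pair of VTEPs."""
        return len(self._sessions.get((local_vtep, remote_vtep), ()))
```

With a cap of 2, sessions for VNI 100 and 200 between IP1 and IP2 would be accepted and a third VNI would be refused. The sketch says nothing about how an operator should pick the cap.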


>> - What is the benefit of running BFD per VNI between a pair of VTEPs?
>>
> GIM2>> An alternative would be to run CFM between VMs, if there is a need
> to monitor the liveness of a particular VM. Again, this is optional.
>

[ag2] I'm not sure how running per-VNI BFD between the VTEPs allows one to
monitor the liveness of VMs.


>
>>
>>>
>>>> Sections 5.1 and 6.1
>>>>
>>>> In 5.1 we have
>>>> >>>
>>>> The inner MAC frame carrying the BFD payload has the
>>>> following format:
>>>> ... Source IP: IP address of the originating VTEP. Destination IP: IP
>>>> address of the terminating VTEP.
>>>> >>>
>>>>
>>>> In 6.1 we have
>>>> >>>
>>>> Since multiple BFD sessions may be running between two VTEPs, there
>>>> needs to be a mechanism for demultiplexing received BFD packets to the
>>>> proper session.  The procedure for demultiplexing packets with Your
>>>> Discriminator equal to 0 is different from [RFC5880
>>>> <https://tools.ietf.org/html/rfc5880>].  *For such packets, the BFD
>>>> session MUST be identified using the inner headers, i.e., the source IP
>>>> and the destination IP present in the IP header carried by the payload
>>>> of the VXLAN encapsulated packet.*
>>>> >>>
>>>> How does this work if the source IP and dest IP are the same as
>>>> specified in 5.1?
>>>>
>>> GIM>> You're right, Destination and source IP addresses likely are the
>>> same in this case. Will add that the source UDP port number, along with the
>>> pair of IP addresses, MUST be used to demux received BFD control packets.
>>> Would you agree that will be sufficient?
>>>
>>
>> [ag] Yes, I think that should work.
>>
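
A minimal sketch of the demultiplexing agreed on above, assuming the source UDP port is taken from the inner UDP header of the BFD control packet. The class and its method names are hypothetical and are not taken from the draft. Packets with a non-zero Your Discriminator are matched on the discriminator as in RFC 5880; only packets with Your Discriminator equal to 0 fall back to the (source IP, destination IP, source UDP port) tuple.

```python
class BfdDemux:
    """Hypothetical receive-side lookup for BFD sessions over VXLAN."""

    def __init__(self):
        self._by_discriminator = {}  # local discriminator -> session
        self._by_tuple = {}          # (src IP, dst IP, src UDP port) -> session

    def register(self, local_discr, inner_src_ip, inner_dst_ip, src_udp_port):
        """Record a session under both lookup keys and return it."""
        key = (inner_src_ip, inner_dst_ip, src_udp_port)
        session = {"local_discr": local_discr, "key": key}
        self._by_discriminator[local_discr] = session
        self._by_tuple[key] = session
        return session

    def lookup(self, your_discr, inner_src_ip, inner_dst_ip, src_udp_port):
        """Return the matching session, or None if there is none."""
        if your_discr != 0:
            # Normal case: the peer echoes our discriminator back to us.
            return self._by_discriminator.get(your_discr)
        # Your Discriminator == 0: the IP pair alone may be identical for
        # several sessions, so the source UDP port disambiguates them.
        return self._by_tuple.get((inner_src_ip, inner_dst_ip, src_udp_port))
```

Two sessions between the same pair of addresses then differ only in the source UDP port, which is why the IP pair alone was not sufficient.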
>>>
>>>> Editorial
>>>>
>>>
>> [ag] Agree with all comments on this section.
>>
>>>
>>>> - Terminology section should be renamed to acronyms.
>>>>
>>> GIM>> Accepted
>>>
>>>> - Document would benefit from a thorough editorial scrub, but maybe
>>>> that will happen once it gets to the RFC editor.
>>>>
>>> GIM>> Will certainly have helpful comments from ADs and RFC editor.
>>>
>>>>
>>>> Section 1
>>>> >>>
>>>> "Virtual eXtensible Local Area Network" (VXLAN) [RFC7348
>>>> <https://tools.ietf.org/html/rfc7348>]. provides an encapsulation
>>>> scheme that allows virtual machines (VMs) to communicate in a data center
>>>> network.
>>>> >>>
>>>> This is not accurate.  VXLAN allows you to implement an overlay to
>>>> decouple the address space of the attached hosts from that of the network.
>>>>
>>> GIM>> Thank you for the suggested text. Will change as follows:
>>> OLD TEXT:
>>>    "Virtual eXtensible Local Area Network" (VXLAN) [RFC7348].  provides
>>>    an encapsulation scheme that allows virtual machines (VMs) to
>>>    communicate in a data center network.
>>> NEW TEXT:
>>>    "Virtual eXtensible Local Area Network" (VXLAN) [RFC7348] provides
>>>    an encapsulation scheme that allows building an overlay network by
>>>    decoupling the address space of the attached virtual hosts from that
>>>    of the network.
>>>
>>>>
>>>> Section 7
>>>>
>>>> VTEP's -> VTEPs
>>>>
>>> GIM>> Yes, thank you.
>>>
>>