Re: WGLC comments on draft-ietf-bfd-vxlan

Anoop Ghanwani <anoop@alumni.duke.edu> Thu, 15 November 2018 07:00 UTC

MIME-Version: 1.0
References: <CA+-tSzxFxtVo6NbfSw4wzb--fSuN4zsSvX7R58iiYFgVF5cA6Q@mail.gmail.com> <CA+RyBmVXeCYAZhWTy-g6U_EJ7NOFQwV4twJaJ-7_LT5_wKFGFw@mail.gmail.com> <CA+-tSzxQp2x0hpAF253b9yKL1aD1J1CaGHs7T6VE8zuvg25R_Q@mail.gmail.com> <CA+RyBmXoOKS-Nq7bDfsgDZXou5-FcprEQeVkhWhAD4_1MoHqUQ@mail.gmail.com>
In-Reply-To: <CA+RyBmXoOKS-Nq7bDfsgDZXou5-FcprEQeVkhWhAD4_1MoHqUQ@mail.gmail.com>
From: Anoop Ghanwani <anoop@alumni.duke.edu>
Date: Wed, 14 Nov 2018 23:00:37 -0800
Message-ID: <CA+-tSzzgKyfXzE+=eVLz7B3u1X_HFahQ6GCFTbL+-rfjsR03uA@mail.gmail.com>
Subject: Re: WGLC comments on draft-ietf-bfd-vxlan
To: Greg Mirsky <gregimirsky@gmail.com>
Cc: rtg-bfd@ietf.org, nvo3@ietf.org
Content-Type: multipart/alternative; boundary="00000000000075e4cf057aae9cd0"
Archived-At: <https://mailarchive.ietf.org/arch/msg/rtg-bfd/SMjL9RxdYZHZzrTcTjAF0y6B8Ks>
Precedence: list

Hi Greg,

Please see inline prefixed with [ag2].

Thanks,
Anoop

On Wed, Nov 14, 2018 at 9:45 AM Greg Mirsky <gregimirsky@gmail.com> wrote:

> Hi Anoop,
> thank you for the prompt response. I am glad that some of my responses
> have addressed your concerns. Please find followup notes in-line tagged
> GIM2>>. I've attached the diff to highlight the updates applied in the
> working version. Let me know if these are acceptable changes.
>
> Regards,
> Greg
>
> On Tue, Nov 13, 2018 at 12:30 PM Anoop Ghanwani <anoop@alumni.duke.edu>
> wrote:
>
>> Hi Greg,
>>
>> Please see inline prefixed with [ag].
>>
>> Thanks,
>> Anoop
>>
>> On Tue, Nov 13, 2018 at 11:34 AM Greg Mirsky <gregimirsky@gmail.com>
>> wrote:
>>
>>> Hi Anoop,
>>> many thanks for the thorough review and detailed comments. Please find
>>> my answers, this time for real, in-line tagged GIM>>.
>>>
>>> Regards,
>>> Greg
>>>
>>> On Thu, Nov 8, 2018 at 1:58 AM Anoop Ghanwani <anoop@alumni.duke.edu>
>>> wrote:
>>>
>>>>
>>>> Here are my comments.
>>>>
>>>> Thanks,
>>>> Anoop
>>>>
>>>> ==
>>>>
>>>> Philosophical
>>>>
>>>> Since VXLAN is not an IETF standard, should we be defining a standard
>>>> for running BFD on it?  Should we define BFD over Geneve instead which is
>>>> the official WG selection?  Is that going to be a separate document?
>>>> GIM>> IS-IS is not on the Standards Track either, but that did not
>>>> prevent the IETF from developing tens of Standards Track RFCs using RFC 1142
>>>> as the normative reference until RFC 7142 reclassified it as Historic. A
>>>> similar path was followed with IS-IS-TE by publishing RFC 3784 until it was
>>>> obsoleted by RFC 5305 four years later. I understand that a Down Reference,
>>>> i.e., using an Informational RFC as the normative reference, is not an unusual
>>>> situation.
>>>>
>>>
>> [ag] OK.  I'm not an expert on this part so unless someone else who is
>> an expert (chairs, AD?) can comment on it, I'll just let it go.
>>
>>
>>>
>>>
>>>>
>>>> Technical
>>>>
>>>> Section 1:
>>>>
>>>> This part needs to be rewritten:
>>>> >>>
>>>> The individual racks may be part of a different Layer 3 network, or
>>>> they could be in a single Layer 2 network. The VXLAN segments/overlays are
>>>> overlaid on top of Layer 3 network. A VM can communicate with another VM
>>>> only if they are on the same VXLAN segment.
>>>> >>>
>>>> It's hard to parse and, given IRB,
>>>>
>>> GIM>> Would the following text be acceptable:
>>> OLD TEXT:
>>>    VXLAN is typically deployed in data centers interconnecting
>>>    virtualized hosts, which may be spread across multiple racks.  The
>>>    individual racks may be part of a different Layer 3 network, or they
>>>    could be in a single Layer 2 network.  The VXLAN segments/overlays
>>>    are overlaid on top of Layer 3 network.
>>> NEW TEXT:
>>> VXLAN is typically deployed in data centers interconnecting virtualized
>>> hosts of a tenant. VXLAN addresses requirements of the Layer 2 and
>>> Layer 3 data center network infrastructure in the presence of VMs in
>>> a multi-tenant environment, discussed in Section 3 of [RFC7348], by
>>> providing a Layer 2 overlay scheme on a Layer 3 network.
>>>
>>
>> [ag] This is a lot better.
>>
>>
>>>
>>>  A VM can communicate with another VM only if they are on the same
>>> VXLAN segment.
>>>>
>>>> the last sentence above is wrong.
>>>>
>>> GIM>> Section 4 in RFC 7348 states:
>>> Only VMs within the same VXLAN segment can communicate with each other.
>>>
>>
>> [ag] VMs on different segments can communicate using routing/IRB, so even
>> RFC 7348 is wrong.  Perhaps the text should be modified to say -- "In the
>> absence of a router in the overlay, a VM can communicate...".
>>
>>
>>>
>>> Section 3:
>>>> >>>
>>>>  Most deployments will have VMs with only L2 capabilities that
>>>> may not support L3.
>>>> >>>
>>>> Are you suggesting most deployments have VMs with no IP
>>>> addresses/configuration?
>>>>
>>> GIM>> Would re-word as follows:
>>> OLD TEXT:
>>>  Most deployments will have VMs with only L2 capabilities that
>>>  may not support L3.
>>> NEW TEXT:
>>> Deployments may have VMs with only L2 capabilities that do not support
>>> L3.
>>>
>>
>> [ag] I still don't understand this.  What does it mean for a VM to not
>> support L3?  No IP address, no default GW, something else?
>>
> GIM2>> A VM communicates with its VTEP which, in turn, originates the VXLAN
> tunnel. The VM is not required to have an IP address, as it is the VTEP's IP
> address that the VM's MAC is associated with. As for the gateway, RFC 7348
> discusses a VXLAN gateway as the device that forwards traffic between VXLAN
> and non-VXLAN domains. Considering all that, would the following change be
> acceptable:
> OLD TEXT:
>  Most deployments will have VMs with only L2 capabilities that
>  may not support L3.
> NEW TEXT:
>  Most deployments will have VMs with only L2 capabilities and without an
> IP address assigned.
>

[ag2] Do you have a reference for this (i.e. that most deployments have VMs
without an IP address)?  Normally I would think VMs would have an IP
address.  It's just that they are segregated into segments and, without an
intervening router, they are restricted to communicate only within their
subnet.

>
>>
>>>
>>>> >>>
>>>> Having a hierarchical OAM model helps localize faults though it
>>>> requires additional consideration.
>>>> >>>
>>>> What are the additional considerations?
>>>>
>>> GIM>> For example, coordination of BFD intervals across the OAM layers.
>>>
>>>
>>
>> [ag] Can we mention them in the draft?
>>
>>
>>>
>>>> Would be useful to add a reference to RFC 8293 in case the reader would
>>>> like to know more about service nodes.
>>>>
>>> GIM>> I have to admit that I don't see how RFC 8293, "A Framework for
>>> Multicast in Network Virtualization over Layer 3", is related to this
>>> document. Please point me to the relevant text in the document.
>>>
>>
>> [ag] The RFC discusses the use of service nodes, which are mentioned here.
>>
>>
>>>
>>>> Section 4
>>>> >>>
>>>> Separate BFD sessions can be established between the VTEPs (IP1 and
>>>> IP2) for monitoring each of the VXLAN tunnels (VNI 100 and 200).
>>>> >>>
>>>> IMO, the document should mention that this could lead to scaling issues
>>>> given that VTEPs can support well in excess of 4K VNIs.  Additionally, we
>>>> should mention that with IRB, a given VNI may not even exist on the
>>>> destination VTEP.  Finally, what is the benefit of doing this?  There may
>>>> be certain corner cases where it's useful (vs a single BFD session between
>>>> the VTEPs for all VNIs) but it would be good to explain what those are.
>>>>
>>> GIM>> Will add text in the Security Considerations section that VTEPs
>>> should have a limit on the number of BFD sessions.
>>>
>>
>> [ag] I was hoping for two things:
>> - A mention about the scalability issue right where per-VNI BFD is
>> discussed.  (Not sure why that is a security issue/consideration.)
>>
> GIM2>> I've added the following sentence in both places:
> The implementation SHOULD have a reasonable upper bound on the number of
> BFD sessions that can be created between the same pair of VTEPs.
>

[ag2] What are the criteria for determining what is reasonable?
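
To make the proposed upper bound concrete, here is a minimal sketch of how
an implementation might enforce a cap on BFD sessions between one pair of
VTEPs. It is illustrative only and not text from the draft. The class name,
the method names, and the limit of 8 are assumptions; the thread does not
say what value would be "reasonable".

```python
from collections import defaultdict

# Hypothetical cap; the thread leaves open what value is "reasonable".
MAX_SESSIONS_PER_VTEP_PAIR = 8


class SessionTable:
    """Tracks per-VNI BFD sessions for each (local VTEP, remote VTEP) pair."""

    def __init__(self, limit=MAX_SESSIONS_PER_VTEP_PAIR):
        self.limit = limit
        # (local_ip, remote_ip) -> set of VNIs that have a BFD session
        self._vnis = defaultdict(set)

    def create(self, local_ip, remote_ip, vni):
        """Return True if a session for this VNI exists or was created."""
        vnis = self._vnis[(local_ip, remote_ip)]
        if vni in vnis:
            return True
        if len(vnis) >= self.limit:
            # Upper bound reached for this VTEP pair: refuse the new session.
            return False
        vnis.add(vni)
        return True

    def count(self, local_ip, remote_ip):
        """Number of sessions currently held for this VTEP pair."""
        return len(self._vnis.get((local_ip, remote_ip), ()))
```

The cap is applied per VTEP pair rather than globally, matching the wording
"between the same pair of VTEPs" in the proposed sentence.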


>> - What is the benefit of running BFD per VNI between a pair of VTEPs?
>>
> GIM2>> An alternative would be to run CFM between VMs, if there is a need
> to monitor the liveness of a particular VM. Again, this is optional.
>

[ag2] I'm not sure how running per-VNI BFD between the VTEPs allows one to
monitor the liveness of VMs.


>
>>
>>>
>>>> Sections 5.1 and 6.1
>>>>
>>>> In 5.1 we have
>>>> >>>
>>>> The inner MAC frame carrying the BFD payload has the
>>>> following format:
>>>> ... Source IP: IP address of the originating VTEP. Destination IP: IP
>>>> address of the terminating VTEP.
>>>> >>>
>>>>
>>>> In 6.1 we have
>>>> >>>
>>>> Since multiple BFD sessions may be running between two
>>>> VTEPs, there needs to be a mechanism for demultiplexing received BFD
>>>> packets to the proper session.  The procedure for demultiplexing
>>>> packets with Your Discriminator equal to 0 is different from
>>>> [RFC5880 <https://tools.ietf.org/html/rfc5880>].
>>>> *For such packets, the BFD session MUST be identified
>>>> using the inner headers, i.e., the source IP and the destination IP
>>>> present in the IP header carried by the payload of the VXLAN
>>>> encapsulated packet.*
>>>> >>>
>>>> How does this work if the source IP and dest IP are the same as
>>>> specified in 5.1?
>>>>
>>> GIM>> You're right, the destination and source IP addresses are likely the
>>> same in this case. I will add that the source UDP port number, along with the
>>> pair of IP addresses, MUST be used to demux received BFD control packets.
>>> Would you agree that this will be sufficient?
>>>
>>
>> [ag] Yes, I think that should work.
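
For illustration, the demultiplexing rule agreed above can be sketched as
follows. This is a minimal sketch under stated assumptions, not the draft's
procedure: the names DemuxKey and BfdDemux are hypothetical, and the table
layout is one possible choice. A nonzero Your Discriminator selects the
session directly, as in RFC 5880. When it is 0, the lookup falls back to the
inner source IP, inner destination IP, and source UDP port.

```python
from typing import Dict, NamedTuple, Optional


class DemuxKey(NamedTuple):
    """Fields taken from the inner headers of the VXLAN payload."""
    inner_src_ip: str
    inner_dst_ip: str
    src_udp_port: int


class BfdDemux:
    """Maps received BFD control packets to a local session discriminator."""

    def __init__(self) -> None:
        self._by_key: Dict[DemuxKey, int] = {}
        self._by_disc: Dict[int, DemuxKey] = {}

    def register(self, key: DemuxKey, local_disc: int) -> None:
        """Record a session under both its key and its local discriminator."""
        self._by_key[key] = local_disc
        self._by_disc[local_disc] = key

    def lookup(self, your_disc: int, key: DemuxKey) -> Optional[int]:
        """Return the local discriminator of the matching session, if any."""
        if your_disc != 0:
            # Normal case: the discriminator alone identifies the session.
            return your_disc if your_disc in self._by_disc else None
        # Your Discriminator is 0: the IP pair alone is ambiguous when several
        # sessions run between the same VTEPs, so the source UDP port is
        # part of the key.
        return self._by_key.get(key)
```

With two sessions between the same pair of VTEP addresses, the keys differ
only in the source UDP port, which is why the IP pair on its own would not be
enough.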
>>
>>>
>>>> Editorial
>>>>
>>>
>> [ag] Agree with all comments on this section.
>>
>>>
>>>> - Terminology section should be renamed to acronyms.
>>>>
>>> GIM>> Accepted
>>>
>>>> - Document would benefit from a thorough editorial scrub, but maybe
>>>> that will happen once it gets to the RFC editor.
>>>>
>>> GIM>> We will certainly get helpful comments from the ADs and the RFC Editor.
>>>
>>>>
>>>> Section 1
>>>> >>>
>>>> "Virtual eXtensible Local Area Network" (VXLAN) [RFC7348
>>>> <https://tools.ietf.org/html/rfc7348>]. provides an encapsulation
>>>> scheme that allows virtual machines (VMs) to communicate in a data center
>>>> network.
>>>> >>>
>>>> This is not accurate.  VXLAN allows you to implement an overlay to
>>>> decouple the address space of the attached hosts from that of the network.
>>>>
>>> GIM>> Thank you for the suggested text. Will change as follows:
>>> OLD TEXT:
>>>    "Virtual eXtensible Local Area Network" (VXLAN) [RFC7348].  provides
>>>    an encapsulation scheme that allows virtual machines (VMs) to
>>>    communicate in a data center network.
>>> NEW TEXT:
>>>  "Virtual eXtensible Local Area Network" (VXLAN) [RFC7348].  provides
>>>    an encapsulation scheme that allows building an overlay network by
>>>   decoupling the address space of the attached virtual hosts from that
>>> of the network.
>>>
>>>>
>>>> Section 7
>>>>
>>>> VTEP's -> VTEPs
>>>>
>>> GIM>> Yes, thank you.
>>>
>>
