< draft-ietf-bfd-vxlan-03.txt | draft-ietf-bfd-vxlan-04.txt > | |||
---|---|---|---|---|
Internet Engineering Task Force S. Pallagatti, Ed. | BFD S. Pallagatti, Ed. | |||
Internet-Draft Rtbrick | Internet-Draft Rtbrick | |||
Intended status: Standards Track S. Paragiri | Intended status: Standards Track S. Paragiri | |||
Expires: April 11, 2019 Juniper Networks | Expires: May 20, 2019 Juniper Networks | |||
V. Govindan | V. Govindan | |||
M. Mudigonda | M. Mudigonda | |||
Cisco | Cisco | |||
G. Mirsky | G. Mirsky | |||
ZTE Corp. | ZTE Corp. | |||
October 8, 2018 | November 16, 2018 | |||
BFD for VXLAN | BFD for VXLAN | |||
draft-ietf-bfd-vxlan-03 | draft-ietf-bfd-vxlan-04 | |||
Abstract | Abstract | |||
This document describes the use of the Bidirectional Forwarding | This document describes the use of the Bidirectional Forwarding | |||
Detection (BFD) protocol in Virtual eXtensible Local Area Network | Detection (BFD) protocol in Virtual eXtensible Local Area Network | |||
(VXLAN) overlay networks. | (VXLAN) overlay networks. | |||
Status of This Memo | Status of This Memo | |||
This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
skipping to change at page 1, line 38 ¶ | skipping to change at page 1, line 38 ¶ | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
This Internet-Draft will expire on April 11, 2019. | This Internet-Draft will expire on May 20, 2019. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2018 IETF Trust and the persons identified as the | Copyright (c) 2018 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(https://trustee.ietf.org/license-info) in effect on the date of | (https://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
skipping to change at page 2, line 15 ¶ | skipping to change at page 2, line 15 ¶ | |||
include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
described in the Simplified BSD License. | described in the Simplified BSD License. | |||
Table of Contents | Table of Contents | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | |||
2. Conventions used in this document . . . . . . . . . . . . . . 3 | 2. Conventions used in this document . . . . . . . . . . . . . . 3 | |||
2.1. Terminology . . . . . . . . . . . . . . . . . . . . . . . 3 | 2.1. Terminology . . . . . . . . . . . . . . . . . . . . . . . 3 | |||
2.2. Requirements Language . . . . . . . . . . . . . . . . . . 3 | 2.2. Requirements Language . . . . . . . . . . . . . . . . . . 3 | |||
3. Use cases . . . . . . . . . . . . . . . . . . . . . . . . . . 3 | 3. Use cases . . . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
4. Deployment . . . . . . . . . . . . . . . . . . . . . . . . . 4 | 4. Deployment . . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
5. BFD Packet Transmission over VXLAN Tunnel . . . . . . . . . . 5 | 5. BFD Packet Transmission over VXLAN Tunnel . . . . . . . . . . 5 | |||
5.1. BFD Packet Encapsulation in VXLAN . . . . . . . . . . . . 6 | 5.1. BFD Packet Encapsulation in VXLAN . . . . . . . . . . . . 6 | |||
6. Reception of BFD packet from VXLAN Tunnel . . . . . . . . . . 7 | 6. Reception of BFD packet from VXLAN Tunnel . . . . . . . . . . 7 | |||
6.1. Demultiplexing of the BFD packet . . . . . . . . . . . . 7 | 6.1. Demultiplexing of the BFD packet . . . . . . . . . . . . 7 | |||
7. Use of reserved VNI . . . . . . . . . . . . . . . . . . . . . 8 | 7. Use of reserved VNI . . . . . . . . . . . . . . . . . . . . . 8 | |||
8. Echo BFD . . . . . . . . . . . . . . . . . . . . . . . . . . 8 | 8. Echo BFD . . . . . . . . . . . . . . . . . . . . . . . . . . 8 | |||
9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 8 | 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 8 | |||
10. Security Considerations . . . . . . . . . . . . . . . . . . . 8 | 10. Security Considerations . . . . . . . . . . . . . . . . . . . 8 | |||
11. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 8 | 11. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 9 | |||
12. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 9 | 12. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 9 | |||
13. Normative References . . . . . . . . . . . . . . . . . . . . 9 | 13. References . . . . . . . . . . . . . . . . . . . . . . . . . 9 | |||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 9 | 13.1. Normative References . . . . . . . . . . . . . . . . . . 9 | |||
13.2. Informational References . . . . . . . . . . . . . . . . 10 | ||||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 10 | ||||
1. Introduction | 1. Introduction | |||
"Virtual eXtensible Local Area Network" (VXLAN) [RFC7348] provides | "Virtual eXtensible Local Area Network" (VXLAN) [RFC7348] provides | |||
an encapsulation scheme that allows virtual machines (VMs) to | an encapsulation scheme that allows building an overlay network by | |||
communicate in a data center network. | decoupling the address space of the attached virtual hosts from that | |||
of the network. | ||||
VXLAN is typically deployed in data centers interconnecting | VXLAN is typically deployed in data centers interconnecting | |||
virtualized hosts, which may be spread across multiple racks. The | virtualized hosts of a tenant. VXLAN addresses requirements of the | |||
individual racks may be part of a different Layer 3 network, or they | Layer 2 and Layer 3 data center network infrastructure in the | |||
could be in a single Layer 2 network. The VXLAN segments/overlays | presence of VMs in a multi-tenant environment, discussed in section 3 | |||
are overlaid on top of Layer 3 network. | [RFC7348], by providing Layer 2 overlay scheme on a Layer 3 network. | |||
A VM can communicate with another VM only if they are on the same | In the absence of a router in the overlay, a VM can communicate with | |||
VXLAN segment. VMs are unaware of VXLAN tunnels as a VXLAN tunnel is | another VM only if they are on the same VXLAN segment. VMs are | |||
terminated on a VXLAN Tunnel End Point (VTEP) (hypervisor/TOR). | unaware of VXLAN tunnels as a VXLAN tunnel is terminated on a VXLAN | |||
VTEPs (hypervisor/TOR) are responsible for encapsulating and | Tunnel End Point (VTEP) (hypervisor/TOR). VTEPs (hypervisor/TOR) are | |||
decapsulating frames exchanged among VMs. | responsible for encapsulating and decapsulating frames exchanged | |||
among VMs. | ||||
Ability to monitor path continuity, i.e., perform proactive | Ability to monitor path continuity, i.e., perform proactive | |||
continuity check (CC) for these tunnels, is important. The | continuity check (CC) for these tunnels, is important. The | |||
asynchronous mode of BFD, as defined in [RFC5880], can be used to | asynchronous mode of BFD, as defined in [RFC5880], can be used to | |||
monitor a VXLAN tunnel. Use of [I-D.ietf-bfd-multipoint] is for | monitor a VXLAN tunnel. Use of [I-D.ietf-bfd-multipoint] is for | |||
future study. | future study. | |||
Also, BFD in VXLAN can be used to monitor the particular service | Also, BFD in VXLAN can be used to monitor the particular service | |||
nodes that are designated to properly handle Layer 2 broadcast, | nodes that are designated to properly handle Layer 2 broadcast, | |||
unknown unicast, and multicast (BUM) traffic. Such nodes, often | unknown unicast, and multicast (BUM) traffic. Such nodes, discussed | |||
referred to as "replicators", are usually virtual VTEPs and can be | in detail in [RFC8293] and often referred to as "replicators", are | |||
monitored by physical VTEPs to minimize BUM traffic directed to an | usually virtual VTEPs and can be monitored by physical VTEPs to | |||
unavailable replicator. | minimize BUM traffic directed to an unavailable replicator. | |||
This document describes the use of Bidirectional Forwarding Detection | This document describes the use of Bidirectional Forwarding Detection | |||
(BFD) protocol in VXLAN to enable monitoring continuity of the path | (BFD) protocol in VXLAN to enable monitoring continuity of the path | |||
between Network Virtualization Edges (NVEs) and/or availability of a | between Network Virtualization Edges (NVEs) and/or availability of a | |||
replicator service node using BFD. | replicator service node using BFD. | |||
In this document, the terms NVE and VTEP are used interchangeably. | In this document, the terms NVE and VTEP are used interchangeably. | |||
2. Conventions used in this document | 2. Conventions used in this document | |||
skipping to change at page 4, line 10 ¶ | skipping to change at page 4, line 16 ¶ | |||
The primary use case of BFD for VXLAN is for continuity check of a | The primary use case of BFD for VXLAN is for continuity check of a | |||
tunnel. By exchanging BFD control packets between VTEPs, an operator | tunnel. By exchanging BFD control packets between VTEPs, an operator | |||
exercises the VXLAN path in both the underlay and overlay thus | exercises the VXLAN path in both the underlay and overlay thus | |||
ensuring the VXLAN path availability and VTEPs reachability. BFD | ensuring the VXLAN path availability and VTEPs reachability. BFD | |||
failure detection can be used for maintenance. There are other use | failure detection can be used for maintenance. There are other use | |||
cases such as the following: | cases such as the following: | |||
Layer 2 VMs: | Layer 2 VMs: | |||
Most deployments will have VMs with only L2 capabilities that | Deployments might have VMs with only L2 capabilities and no | |||
may not support L3. BFD, being an L3 protocol, can be used as a | IP address assigned or, in other cases, VMs that are | |||
assigned an IP address but are restricted to communicating only | ||||
within their subnet. BFD, being an L3 protocol, can be used as a | ||||
tunnel CC mechanism, where BFD will start and terminate at the | tunnel CC mechanism, where BFD will start and terminate at the | |||
NVEs, e.g., VTEPs. | NVEs, e.g., VTEPs. | |||
It is possible to aggregate the CC sessions for multiple | It is possible to aggregate the CC sessions for multiple | |||
tenants by running a BFD session between the VTEPs over a VXLAN | tenants by running a BFD session between the VTEPs over a VXLAN | |||
tunnel. | tunnel. | |||
Fault localization: | Fault localization: | |||
It is also possible that VMs are L3 aware and can host a BFD | It is also possible that VMs are L3 aware and can host a BFD | |||
session. In these cases, BFD sessions can be established among | session. In these cases, BFD sessions can be established among | |||
VMs for CC. Also, BFD sessions can be created among VTEPs for | VMs for CC. Also, BFD sessions can be created among VTEPs for | |||
tunnel CC. Having a hierarchical OAM model helps localize | tunnel CC. Having a hierarchical OAM model helps localize | |||
faults though it requires additional consideration. | faults though it requires additional consideration of, for | |||
example, coordination of BFD intervals across the OAM layers. | ||||
Service node reachability: | Service node reachability: | |||
The service node is responsible for sending BUM traffic. A | The service node is responsible for sending BUM traffic. A | |||
service node tunnel may terminate at a VTEP that might not even | service node tunnel may terminate at a VTEP that might not even | |||
host a VM. A BFD session between the TOR/hypervisor and the | host a VM. A BFD session between the TOR/hypervisor and the | |||
service node can be used to monitor service node reachability. | service node can be used to monitor service node reachability. | |||
4. Deployment | 4. Deployment | |||
Figure 1 illustrates the scenario with two servers, each of them | Figure 1 illustrates the scenario with two servers, each of them | |||
hosting two VMs. The servers host VTEPs that terminate two VXLAN | hosting two VMs. The servers host VTEPs that terminate two VXLAN | |||
tunnels with VNI number 100 and 200 respectively. Separate BFD | tunnels with VNI number 100 and 200 respectively. Separate BFD | |||
sessions can be established between the VTEPs (IP1 and IP2) for | sessions can be established between the VTEPs (IP1 and IP2) for | |||
monitoring each of the VXLAN tunnels (VNI 100 and 200). No BFD | monitoring each of the VXLAN tunnels (VNI 100 and 200). The | |||
packets intended for a Hypervisor VTEP should be forwarded to a VM as | implementation SHOULD have a reasonable upper bound on the number of | |||
a VM may drop BFD packets leading to a false negative. This method | BFD sessions that can be created between the same pair of VTEPs. No | |||
is applicable whether the VTEP is a virtual or physical device. | BFD packets intended for a Hypervisor VTEP should be forwarded to a | |||
VM as a VM may drop BFD packets leading to a false negative. This | ||||
method is applicable whether the VTEP is a virtual or physical | ||||
device. | ||||
+------------+-------------+ | +------------+-------------+ | |||
| Server 1 | | | Server 1 | | |||
| | | | | | |||
| +----+----+ +----+----+ | | | +----+----+ +----+----+ | | |||
| |VM1-1 | |VM1-2 | | | | |VM1-1 | |VM1-2 | | | |||
| |VNI 100 | |VNI 200 | | | | |VNI 100 | |VNI 200 | | | |||
| | | | | | | | | | | | | | |||
| +---------+ +---------+ | | | +---------+ +---------+ | | |||
| Hypervisor VTEP (IP1) | | | Hypervisor VTEP (IP1) | | |||
skipping to change at page 8, line 7 ¶ | skipping to change at page 8, line 7 ¶ | |||
associated with the VNI. | associated with the VNI. | |||
6.1. Demultiplexing of the BFD packet | 6.1. Demultiplexing of the BFD packet | |||
Demultiplexing of IP BFD packet has been defined in Section 3 of | Demultiplexing of IP BFD packet has been defined in Section 3 of | |||
[RFC5881]. Since multiple BFD sessions may be running between two | [RFC5881]. Since multiple BFD sessions may be running between two | |||
VTEPs, there needs to be a mechanism for demultiplexing received BFD | VTEPs, there needs to be a mechanism for demultiplexing received BFD | |||
packets to the proper session. The procedure for demultiplexing | packets to the proper session. The procedure for demultiplexing | |||
packets with Your Discriminator equal to 0 is different from | packets with Your Discriminator equal to 0 is different from | |||
[RFC5880]. For such packets, the BFD session MUST be identified | [RFC5880]. For such packets, the BFD session MUST be identified | |||
using the inner headers, i.e., the source IP and the destination IP | using the inner headers, i.e., the source IP, the destination IP, and | |||
present in the IP header carried by the payload of the VXLAN | the source UDP port number present in the IP header carried by the | |||
encapsulated packet. The VNI of the packet SHOULD be used to derive | payload of the VXLAN encapsulated packet. The VNI of the packet | |||
interface-related information for demultiplexing the packet. If a BFD | SHOULD be used to derive interface-related information for | |||
packet is received with a non-zero Your Discriminator, then the BFD | demultiplexing the packet. If a BFD packet is received with a | |||
session MUST be demultiplexed only with Your Discriminator as the key. | non-zero Your Discriminator, then the BFD session MUST be | |||
demultiplexed only with Your Discriminator as the key. | ||||
7. Use of reserved VNI | 7. Use of reserved VNI | |||
A BFD session MAY be established for the reserved VNI 0. One way to | A BFD session MAY be established for the reserved VNI 0. One way to | |||
aggregate BFD sessions between VTEP's is to establish a BFD session | aggregate BFD sessions between VTEPs is to establish a BFD session | |||
with VNI 0. A VTEP MAY also use VNI 0 to establish a BFD session | with VNI 0. A VTEP MAY also use VNI 0 to establish a BFD session | |||
with a service node. | with a service node. | |||
8. Echo BFD | 8. Echo BFD | |||
Support for echo BFD is outside the scope of this document. | Support for echo BFD is outside the scope of this document. | |||
9. IANA Considerations | 9. IANA Considerations | |||
IANA has assigned TBA as a dedicated MAC address from the IANA 8-bit | IANA has assigned TBA as a dedicated MAC address from the IANA 8-bit | |||
skipping to change at page 8, line 40 ¶ | skipping to change at page 8, line 41 ¶ | |||
packets. | packets. | |||
10. Security Considerations | 10. Security Considerations | |||
The document requires setting the inner IP TTL to 1 which could be | The document requires setting the inner IP TTL to 1 which could be | |||
used as a DDoS attack vector. Thus the implementation MUST have | used as a DDoS attack vector. Thus the implementation MUST have | |||
throttling in place to control the rate of BFD control packets sent | throttling in place to control the rate of BFD control packets sent | |||
to the control plane. Throttling MAY be relaxed for BFD packets | to the control plane. Throttling MAY be relaxed for BFD packets | |||
based on port number. | based on port number. | |||
Other than inner IP TTL set to 1 this specification does not raise | The implementation SHOULD have a reasonable upper bound on the number | |||
any additional security issues beyond those of the specifications | of BFD sessions that can be created between the same pair of VTEPs. | |||
Other than setting the inner IP TTL to 1 and limiting the number of | ||||
BFD sessions between the same pair of VTEPs, this specification does | ||||
not raise any additional security issues beyond those of the specifications | ||||
referred to in the list of normative references. | referred to in the list of normative references. | |||
11. Contributors | 11. Contributors | |||
Reshad Rahman | Reshad Rahman | |||
rrahman@cisco.com | rrahman@cisco.com | |||
Cisco | Cisco | |||
12. Acknowledgments | 12. Acknowledgments | |||
Authors would like to thank Jeff Haas of Juniper Networks for his | Authors would like to thank Jeff Haas of Juniper Networks for his | |||
reviews and feedback on this material. | reviews and feedback on this material. | |||
Authors would also like to thank Nobo Akiya, Marc Binderberger, | Authors would also like to thank Nobo Akiya, Marc Binderberger, | |||
Shahram Davari and Donald E. Eastlake 3rd for the extensive reviews | Shahram Davari, Donald E. Eastlake 3rd, and Anoop Ghanwani for the | |||
and the most detailed and helpful comments. | extensive reviews and the most detailed and helpful comments. | |||
13. Normative References | 13. References | |||
13.1. Normative References | ||||
[I-D.ietf-bfd-multipoint] | [I-D.ietf-bfd-multipoint] | |||
Katz, D., Ward, D., Pallagatti, S., and G. Mirsky, "BFD for | Katz, D., Ward, D., Pallagatti, S., and G. Mirsky, "BFD for | |||
Multipoint Networks", draft-ietf-bfd-multipoint-18 (work | Multipoint Networks", draft-ietf-bfd-multipoint-18 (work | |||
in progress), June 2018. | in progress), June 2018. | |||
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
Requirement Levels", BCP 14, RFC 2119, | Requirement Levels", BCP 14, RFC 2119, | |||
DOI 10.17487/RFC2119, March 1997, | DOI 10.17487/RFC2119, March 1997, | |||
<https://www.rfc-editor.org/info/rfc2119>. | <https://www.rfc-editor.org/info/rfc2119>. | |||
skipping to change at page 9, line 46 ¶ | skipping to change at page 10, line 9 ¶ | |||
L., Sridhar, T., Bursell, M., and C. Wright, "Virtual | L., Sridhar, T., Bursell, M., and C. Wright, "Virtual | |||
eXtensible Local Area Network (VXLAN): A Framework for | eXtensible Local Area Network (VXLAN): A Framework for | |||
Overlaying Virtualized Layer 2 Networks over Layer 3 | Overlaying Virtualized Layer 2 Networks over Layer 3 | |||
Networks", RFC 7348, DOI 10.17487/RFC7348, August 2014, | Networks", RFC 7348, DOI 10.17487/RFC7348, August 2014, | |||
<https://www.rfc-editor.org/info/rfc7348>. | <https://www.rfc-editor.org/info/rfc7348>. | |||
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | |||
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | |||
May 2017, <https://www.rfc-editor.org/info/rfc8174>. | May 2017, <https://www.rfc-editor.org/info/rfc8174>. | |||
13.2. Informational References | ||||
[RFC8293] Ghanwani, A., Dunbar, L., McBride, M., Bannai, V., and R. | ||||
Krishnan, "A Framework for Multicast in Network | ||||
Virtualization over Layer 3", RFC 8293, | ||||
DOI 10.17487/RFC8293, January 2018, | ||||
<https://www.rfc-editor.org/info/rfc8293>. | ||||
Authors' Addresses | Authors' Addresses | |||
Santosh Pallagatti (editor) | Santosh Pallagatti (editor) | |||
Rtbrick | Rtbrick | |||
Email: santosh.pallagatti@gmail.com | Email: santosh.pallagatti@gmail.com | |||
Sudarsan Paragiri | Sudarsan Paragiri | |||
Juniper Networks | Juniper Networks | |||
1194 N. Mathilda Ave. | 1194 N. Mathilda Ave. | |||
Sunnyvale, California 94089-1206 | Sunnyvale, California 94089-1206 | |||
USA | USA | |||
Email: sparagiri@juniper.net | Email: sparagiri@juniper.net | |||
Vengada Prasad Govindan | Vengada Prasad Govindan | |||
Cisco | Cisco | |||
End of changes. 22 change blocks. | ||||
44 lines changed or deleted | 70 lines changed or added | |||
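
The demultiplexing rule in Section 6.1 of the -04 text can be illustrated with a small lookup table. The following is only a sketch under stated assumptions, not an implementation taken from the draft: the names `BfdSession` and `SessionTable` are hypothetical, the receiver is assumed to know the peer's source UDP port in advance, and the VNI is folded directly into the lookup key as a stand-in for the "interface-related information" the draft says SHOULD be derived from it.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Hypothetical key for packets that arrive with Your Discriminator == 0:
# (VNI, inner source IP, inner destination IP, inner source UDP port).
InitialKey = Tuple[int, str, str, int]


@dataclass
class BfdSession:
    """Illustrative session record; field names are assumptions."""
    local_discriminator: int
    vni: int
    peer_ip: str        # inner source IP of packets received from the peer
    local_ip: str       # inner destination IP of packets received from the peer
    peer_src_port: int  # inner source UDP port used by the peer


class SessionTable:
    """Maps received BFD control packets to sessions."""

    def __init__(self) -> None:
        self._by_discriminator: Dict[int, BfdSession] = {}
        self._by_initial_key: Dict[InitialKey, BfdSession] = {}

    def add(self, session: BfdSession) -> None:
        self._by_discriminator[session.local_discriminator] = session
        key = (session.vni, session.peer_ip, session.local_ip,
               session.peer_src_port)
        self._by_initial_key[key] = session

    def demultiplex(self, your_discriminator: int, vni: int,
                    inner_src_ip: str, inner_dst_ip: str,
                    inner_src_port: int) -> Optional[BfdSession]:
        # Non-zero Your Discriminator: it is the only key used.
        if your_discriminator != 0:
            return self._by_discriminator.get(your_discriminator)
        # Zero Your Discriminator: fall back to the inner headers
        # (plus the VNI, which is an assumption of this sketch).
        key = (vni, inner_src_ip, inner_dst_ip, inner_src_port)
        return self._by_initial_key.get(key)
```

Keeping two dictionaries mirrors the two branches of the rule: once the peer echoes the local discriminator back, the inner headers no longer take part in the lookup.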