< draft-ietf-bfd-vxlan-01.txt | draft-ietf-bfd-vxlan-02.txt > | |||
---|---|---|---|---|
Internet Engineering Task Force S. Pallagatti, Ed. | Internet Engineering Task Force S. Pallagatti, Ed. | |||
Internet-Draft Rtbrick | Internet-Draft Rtbrick | |||
Intended status: Standards Track S. Paragiri | Intended status: Standards Track S. Paragiri | |||
Expires: February 7, 2019 Juniper Networks | Expires: February 18, 2019 Juniper Networks | |||
V. Govindan | V. Govindan | |||
M. Mudigonda | M. Mudigonda | |||
Cisco | Cisco | |||
G. Mirsky | G. Mirsky | |||
ZTE Corp. | ZTE Corp. | |||
August 6, 2018 | August 17, 2018 | |||
BFD for VXLAN | BFD for VXLAN | |||
draft-ietf-bfd-vxlan-01 | draft-ietf-bfd-vxlan-02 | |||
Abstract | Abstract | |||
This document describes the use of Bidirectional Forwarding Detection | This document describes the use of the Bidirectional Forwarding | |||
(BFD) protocol in Virtual eXtensible Local Area Network (VXLAN) | Detection (BFD) protocol in Virtual eXtensible Local Area Network | |||
overlay network. | (VXLAN) overlay networks. | |||
Status of This Memo | Status of This Memo | |||
This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
This Internet-Draft will expire on February 7, 2019. | This Internet-Draft will expire on February 18, 2019. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2018 IETF Trust and the persons identified as the | Copyright (c) 2018 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(https://trustee.ietf.org/license-info) in effect on the date of | (https://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
skipping to change at page 2, line 32 | skipping to change at page 2, line 32 | |||
8. Echo BFD . . . . . . . . . . . . . . . . . . . . . . . . . . 8 | 8. Echo BFD . . . . . . . . . . . . . . . . . . . . . . . . . . 8 | |||
9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 8 | 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 8 | |||
10. Security Considerations . . . . . . . . . . . . . . . . . . . 8 | 10. Security Considerations . . . . . . . . . . . . . . . . . . . 8 | |||
11. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 8 | 11. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 8 | |||
12. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 9 | 12. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 9 | |||
13. Normative References . . . . . . . . . . . . . . . . . . . . 9 | 13. Normative References . . . . . . . . . . . . . . . . . . . . 9 | |||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 9 | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 9 | |||
1. Introduction | 1. Introduction | |||
"Virtual eXtensible Local Area Network (VXLAN)" has been described in | "Virtual eXtensible Local Area Network" (VXLAN) [RFC7348] provides | |||
[RFC7348]. VXLAN provides an encapsulation scheme that allows | an encapsulation scheme that allows virtual machines (VMs) to | |||
virtual machines (VMs) to communicate in a data center network. | communicate in a data center network. | |||
VXLAN is typically deployed in data centers interconnecting | VXLAN is typically deployed in data centers interconnecting | |||
virtualized hosts, which may be spread across multiple racks. The | virtualized hosts, which may be spread across multiple racks. The | |||
individual racks may be part of a different Layer 3 network, or they | individual racks may be part of a different Layer 3 network, or they | |||
could be in a single Layer 2 network. The VXLAN segments/overlay | could be in a single Layer 2 network. The VXLAN segments/overlays | |||
networks are overlaid on top of these Layer 2 or Layer 3 networks. | are overlaid on top of a Layer 3 network. | |||
A VM can communicate with another VM only if they are on the same | A VM can communicate with another VM only if they are on the same | |||
VXLAN. VMs are unaware of VXLAN tunnels as VXLAN tunnel is | VXLAN segment. VMs are unaware of VXLAN tunnels as a VXLAN tunnel is | |||
terminated on VXLAN Tunnel End Point (VTEP) (hypervisor/TOR). VTEPs | terminated on a VXLAN Tunnel End Point (VTEP) (hypervisor/TOR). | |||
(hypervisor/TOR) are responsible for encapsulating, and decapsulating | VTEPs (hypervisor/TOR) are responsible for encapsulating and | |||
frames exchanged among VMs. | decapsulating frames exchanged among VMs. | |||
Since underlay is an L3 network, ability to monitor path continuity, | Ability to monitor path continuity, i.e., perform proactive | |||
i.e., perform proactive continuity check (CC) for these tunnels is | continuity check (CC) for these tunnels, is important. The | |||
important. Asynchronous mode of BFD, as defined in [RFC5880], can be | asynchronous mode of BFD, as defined in [RFC5880], can be used to | |||
used to monitor a VXLAN tunnel. Use of [I-D.ietf-bfd-multipoint] is | monitor a VXLAN tunnel. Use of [I-D.ietf-bfd-multipoint] is for | |||
for future study. | future study. | |||
Also, BFD in VXLAN can be used to monitor the particular service | Also, BFD in VXLAN can be used to monitor the particular service | |||
nodes that are designated to properly handle Layer 2 broadcast, | nodes that are designated to properly handle Layer 2 broadcast, | |||
unknown unicast, and multicast traffic. Such nodes, often referred to as | unknown unicast, and multicast traffic. Such nodes, often referred to as | |||
"replicators", are usually virtual VTEPs can be monitored by physical | "replicators", are usually virtual VTEPs and can be monitored by | |||
VTEPs to minimize BUM traffic directed to the unavailable replicator. | physical VTEPs to minimize BUM traffic directed to the unavailable | |||
replicator. | ||||
This document describes the use of Bidirectional Forwarding Detection | This document describes the use of Bidirectional Forwarding Detection | |||
(BFD) protocol in VXLAN to enable continuity monitoring between Network | (BFD) protocol in VXLAN to enable monitoring continuity of the path | |||
Virtualization Edges (NVEs) and/or availability of a replicator | between Network Virtualization Edges (NVEs) and/or availability of a | |||
service node using BFD. | replicator service node using BFD. | |||
In this document, the terms NVE and VTEP are used interchangeably. | ||||
2. Conventions used in this document | 2. Conventions used in this document | |||
2.1. Terminology | 2.1. Terminology | |||
BFD - Bidirectional Forwarding Detection | BFD - Bidirectional Forwarding Detection | |||
CC - Continuity Check | CC - Continuity Check | |||
NVE - Network Virtualization Edge | NVE - Network Virtualization Edge | |||
skipping to change at page 3, line 48 | skipping to change at page 3, line 51 | |||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | |||
"OPTIONAL" in this document are to be interpreted as described in BCP | "OPTIONAL" in this document are to be interpreted as described in BCP | |||
14 [RFC2119] [RFC8174] when, and only when, they appear in all | 14 [RFC2119] [RFC8174] when, and only when, they appear in all | |||
capitals, as shown here. | capitals, as shown here. | |||
3. Use cases | 3. Use cases | |||
The primary use case of BFD for VXLAN is for continuity check of a | The primary use case of BFD for VXLAN is for continuity check of a | |||
tunnel. By exchanging BFD control packets between VTEPs, an operator | tunnel. By exchanging BFD control packets between VTEPs, an operator | |||
exercises the VXLAN path in both in underlay and overlay thus | exercises the VXLAN path in both the underlay and overlay thus | |||
ensuring the VXLAN path availability and VTEPs reachability. BFD | ensuring the VXLAN path availability and VTEPs reachability. BFD | |||
failure detection can be used for maintenance. There are other use | failure detection can be used for maintenance. There are other use | |||
cases such as | cases such as the following: | |||
Layer 2 VMs: | Layer 2 VMs: | |||
Most deployments will have VMs with only L2 capabilities that | Most deployments will have VMs with only L2 capabilities that | |||
may not support L3. BFD being an L3 protocol can be used as | may not support L3. BFD being an L3 protocol can be used as a | |||
tunnel CC mechanism, where BFD will start and terminate at the | tunnel CC mechanism, where BFD will start and terminate at the | |||
NVEs, e.g., VTEPs. | NVEs, e.g., VTEPs. | |||
It is possible to aggregate the CC sessions for multiple | It is possible to aggregate the CC sessions for multiple | |||
tenants by running a BFD session between the VTEPs over VXLAN | tenants by running a BFD session between the VTEPs over VXLAN | |||
tunnel. In the rest of this document, terms NVE and VTEP are | tunnel. | |||
used interchangeably. | ||||
Fault localization: | Fault localization: | |||
It is also possible that VMs are L3 aware and can host a BFD | It is also possible that VMs are L3 aware and can host a BFD | |||
session. In these cases, BFD sessions can be established among | session. In these cases, BFD sessions can be established among | |||
VMs for CC. In addition, BFD sessions can be established among | VMs for CC. Also, BFD sessions can be created among VTEPs for | |||
VTEPs for tunnel CC. Having a hierarchical OAM model helps | tunnel CC. Having a hierarchical OAM model helps localize | |||
localize faults though requires additional consideration. | faults though it requires additional consideration. | |||
Service node reachability: | Service node reachability: | |||
The service node is responsible for sending BUM traffic. In | The service node is responsible for sending BUM traffic. A | |||
case a service node tunnel terminates at VTEP, and it might not | service node tunnel may terminate at a VTEP, and that VTEP | |||
even host VM. BFD session between TOR/hypervisor and service | might not even host a VM. A BFD session between TOR/hypervisor and | |||
node can be used to monitor service node reachability. | service node can be used to monitor service node reachability. | |||
4. Deployment | 4. Deployment | |||
Figure 1 illustrates the scenario with two servers, each of them | Figure 1 illustrates the scenario with two servers, each of them | |||
hosting two VMs. The servers host VTEPs that terminate two VXLAN | hosting two VMs. The servers host VTEPs that terminate two VXLAN | |||
tunnels with VNI number 100 and 200 respectively. Separate BFD | tunnels with VNI number 100 and 200 respectively. Separate BFD | |||
sessions can be established between the VTEPs (IP1 and IP2) for | sessions can be established between the VTEPs (IP1 and IP2) for | |||
monitoring each of the VXLAN tunnels (VNI 100 and 200). No BFD | monitoring each of the VXLAN tunnels (VNI 100 and 200). No BFD | |||
packets intended to Hypervisor VTEP should be forwarded to a VM as VM | packets intended for a Hypervisor VTEP should be forwarded to a VM as | |||
may drop BFD packets leading to a false negative. This method is | a VM may drop BFD packets leading to a false negative. This method | |||
applicable whether VTEP is a virtual or physical device. | is applicable whether the VTEP is a virtual or physical device. | |||
+------------+-------------+ | +------------+-------------+ | |||
| Server 1 | | | Server 1 | | |||
| | | | | | |||
| +----+----+ +----+----+ | | | +----+----+ +----+----+ | | |||
| |VM1-1 | |VM1-2 | | | | |VM1-1 | |VM1-2 | | | |||
| |VNI 100 | |VNI 200 | | | | |VNI 100 | |VNI 200 | | | |||
| | | | | | | | | | | | | | |||
| +---------+ +---------+ | | | +---------+ +---------+ | | |||
| Hypervisor VTEP (IP1) | | | Hypervisor VTEP (IP1) | | |||
skipping to change at page 5, line 44 | skipping to change at page 5, line 44 | |||
| +---------+ +---------+ | | | +---------+ +---------+ | | |||
| Server 2 | | | Server 2 | | |||
+--------------------------+ | +--------------------------+ | |||
Figure 1: Reference VXLAN domain | Figure 1: Reference VXLAN domain | |||
5. BFD Packet Transmission over VXLAN Tunnel | 5. BFD Packet Transmission over VXLAN Tunnel | |||
BFD packets MUST be encapsulated and sent to a remote VTEP as | BFD packets MUST be encapsulated and sent to a remote VTEP as | |||
explained in Section 5.1. Implementations SHOULD ensure that the BFD | explained in Section 5.1. Implementations SHOULD ensure that the BFD | |||
packets follow the same lookup path of VXLAN packets within the | packets follow the same lookup path as VXLAN data packets within the | |||
sender system. | sender system. | |||
5.1. BFD Packet Encapsulation in VXLAN | 5.1. BFD Packet Encapsulation in VXLAN | |||
VXLAN packet format has been described in Section 5 of [RFC7348]. | BFD packets are encapsulated in VXLAN as described below. The VXLAN | |||
The Outer IP/UDP and VXLAN headers MUST be encoded by the sender as | packet format is defined in Section 5 of [RFC7348]. The Outer IP/UDP | |||
defined in [RFC7348]. | and VXLAN headers MUST be encoded by the sender as defined in | |||
[RFC7348]. | ||||
0 1 2 3 | 0 1 2 3 | |||
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | | | | | |||
~ Outer Ethernet Header ~ | ~ Outer Ethernet Header ~ | |||
| | | | | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | | | | | |||
~ Outer IPvX Header ~ | ~ Outer IPvX Header ~ | |||
skipping to change at page 6, line 49 | skipping to change at page 6, line 50 | |||
~ Inner UDP Header ~ | ~ Inner UDP Header ~ | |||
| | | | | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | | | | | |||
~ BFD Control Message ~ | ~ BFD Control Message ~ | |||
| | | | | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| FCS | | | FCS | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
Figure 2: VXLAN Encapsulaion of BFD Control Message | Figure 2: VXLAN Encapsulation of BFD Control Message | |||
The BFD packet MUST be carried inside the inner MAC frame of the | The BFD packet MUST be carried inside the inner MAC frame of the | |||
VXLAN packet. The inner MAC frame carrying the BFD payload has the | VXLAN packet. The inner MAC frame carrying the BFD payload has the | |||
following format: | following format: | |||
Ethernet Header: | Ethernet Header: | |||
Destination MAC: This MUST be a dedicated MAC (TBA) Section 9 | Destination MAC: This MUST be the dedicated MAC TBA (Section 9) | |||
or the MAC address of the destination VTEP. The details of how | or the MAC address of the destination VTEP. The details of how | |||
the MAC address of the destination VTEP is obtained are outside | the MAC address of the destination VTEP is obtained are outside | |||
the scope of this document. | the scope of this document. | |||
Source MAC: MAC address of the originating VTEP | Source MAC: MAC address of the originating VTEP | |||
IP header: | IP header: | |||
Source IP: IP address of the originating VTEP. | Source IP: IP address of the originating VTEP. | |||
skipping to change at page 7, line 45 | skipping to change at page 7, line 45 | |||
packet MUST be processed further. | packet MUST be processed further. | |||
The UDP destination port and the TTL of the inner Ethernet frame MUST | The UDP destination port and the TTL of the inner Ethernet frame MUST | |||
be validated to determine if the received packet can be processed by | be validated to determine if the received packet can be processed by | |||
BFD. A BFD packet with the inner MAC set to the VTEP or dedicated MAC address | BFD. A BFD packet with the inner MAC set to the VTEP or dedicated MAC address | |||
MUST NOT be forwarded to VMs. | MUST NOT be forwarded to VMs. | |||
To ensure BFD detects the proper configuration of VXLAN Network | To ensure BFD detects the proper configuration of VXLAN Network | |||
Identifier (VNI) in a remote VTEP, a lookup SHOULD be performed with | Identifier (VNI) in a remote VTEP, a lookup SHOULD be performed with | |||
the MAC-DA and VNI as key in the Virtual Forwarding Instance (VFI) | the MAC-DA and VNI as key in the Virtual Forwarding Instance (VFI) | |||
table of the originating/ terminating VTEP to exercise the VFI | table of the originating/terminating VTEP to exercise the VFI | |||
associated with the VNI. | associated with the VNI. | |||
6.1. Demultiplexing of the BFD packet | 6.1. Demultiplexing of the BFD packet | |||
Demultiplexing of IP BFD packets has been defined in Section 3 of | Demultiplexing of IP BFD packets has been defined in Section 3 of | |||
[RFC5881]. Since multiple BFD sessions may be running between two | [RFC5881]. Since multiple BFD sessions may be running between two | |||
VTEPs, there needs to be a mechanism for demultiplexing received BFD | VTEPs, there needs to be a mechanism for demultiplexing received BFD | |||
packets to the proper session. The procedure for demultiplexing | packets to the proper session. The procedure for demultiplexing | |||
packets with Your Discriminator equal to 0 is different from | packets with Your Discriminator equal to 0 is different from | |||
[RFC5880]. For such packets, the BFD session MUST be identified | [RFC5880]. For such packets, the BFD session MUST be identified | |||
skipping to change at page 8, line 27 | skipping to change at page 8, line 27 | |||
aggregate BFD sessions between VTEPs is to establish a BFD session | aggregate BFD sessions between VTEPs is to establish a BFD session | |||
with VNI 0. A VTEP MAY also use VNI 0 to establish a BFD session | with VNI 0. A VTEP MAY also use VNI 0 to establish a BFD session | |||
with a service node. | with a service node. | |||
8. Echo BFD | 8. Echo BFD | |||
Support for echo BFD is outside the scope of this document. | Support for echo BFD is outside the scope of this document. | |||
9. IANA Considerations | 9. IANA Considerations | |||
IANA is requested to assign a dedicated MAC address to be used as the | IANA has assigned TBA as a dedicated MAC address to be used as the | |||
Destination MAC address of the inner Ethernet which carries BFD | Destination MAC address of the inner Ethernet of VXLAN when carrying | |||
control packet in IP/UDP encapsulation. | BFD control packets. | |||
10. Security Considerations | 10. Security Considerations | |||
The document recommends setting the inner IP TTL to 1 which could | The document recommends setting the inner IP TTL to 1 which could | |||
lead to a DDoS attack. Thus the implementation MUST have throttling | lead to a DDoS attack. Thus the implementation MUST have throttling | |||
in place. Throttling MAY be relaxed for BFD packets based on port | in place. Throttling MAY be relaxed for BFD packets based on port | |||
number. | number. | |||
Other than inner IP TTL set to 1 this specification does not raise | Other than inner IP TTL set to 1 this specification does not raise | |||
any additional security issues beyond those of the specifications | any additional security issues beyond those of the specifications | |||
skipping to change at page 9, line 10 | skipping to change at page 9, line 10 | |||
Reshad Rahman | Reshad Rahman | |||
rrahman@cisco.com | rrahman@cisco.com | |||
Cisco | Cisco | |||
12. Acknowledgments | 12. Acknowledgments | |||
Authors would like to thank Jeff Haas of Juniper Networks for his | Authors would like to thank Jeff Haas of Juniper Networks for his | |||
reviews and feedback on this material. | reviews and feedback on this material. | |||
Authors would also like to thank Nobo Akiya, Marc Binderberger and | Authors would also like to thank Nobo Akiya, Marc Binderberger, | |||
Shahram Davari for the extensive review. | Shahram Davari and Donald E. Eastlake 3rd for the extensive reviews | |||
and the most detailed and helpful comments. | ||||
13. Normative References | 13. Normative References | |||
[I-D.ietf-bfd-multipoint] | [I-D.ietf-bfd-multipoint] | |||
Katz, D., Ward, D., Pallagatti, S., and G. Mirsky, "BFD for | Katz, D., Ward, D., Pallagatti, S., and G. Mirsky, "BFD for | |||
Multipoint Networks", draft-ietf-bfd-multipoint-18 (work | Multipoint Networks", draft-ietf-bfd-multipoint-18 (work | |||
in progress), June 2018. | in progress), June 2018. | |||
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
Requirement Levels", BCP 14, RFC 2119, | Requirement Levels", BCP 14, RFC 2119, | |||
End of changes. 25 change blocks. | ||||
52 lines changed or deleted | 57 lines changed or added | |||
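
The receive-side rules in the text preceding Section 6.1 (validate the inner UDP destination port and TTL, and never forward VTEP-addressed BFD packets to VMs) can be illustrated with a short Python sketch. It is not taken from either draft revision. The function names, the return labels, and the placeholder MAC are hypothetical; the dedicated MAC is still TBA (Section 9). The inner TTL value of 1 is assumed from Section 10, UDP port 3784 comes from [RFC5881], and the 8-byte VXLAN header layout from [RFC7348]. Rejecting non-BFD frames addressed to the VTEP is also an assumption, since the draft only requires that such BFD packets not be forwarded to VMs.

```python
import struct

BFD_SINGLE_HOP_PORT = 3784   # BFD control packet destination port, RFC 5881
EXPECTED_INNER_TTL = 1       # assumption: value named in Section 10 of the draft
# Hypothetical placeholder: the dedicated MAC is "TBA" in the draft (Section 9).
DEDICATED_MAC_PLACEHOLDER = bytes.fromhex("020000000001")


def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header of RFC 7348 for a 24-bit VNI."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # First word: flags byte with the I bit (0x08) set, then 24 reserved bits.
    # Second word: 24-bit VNI followed by 8 reserved bits.
    return struct.pack("!II", 0x08 << 24, vni << 8)


def classify_inner_frame(dst_mac: bytes, local_vtep_mac: bytes,
                         inner_ttl: int, udp_dport: int) -> str:
    """Decide what a receiving VTEP does with a decapsulated inner frame.

    Returns "bfd" (hand to the local BFD process), "reject" (addressed to
    the VTEP but failing validation; never sent to a VM), or "forward"
    (ordinary tenant traffic).
    """
    addressed_to_vtep = dst_mac in (local_vtep_mac, DEDICATED_MAC_PLACEHOLDER)
    if not addressed_to_vtep:
        return "forward"
    if udp_dport == BFD_SINGLE_HOP_PORT and inner_ttl == EXPECTED_INNER_TTL:
        return "bfd"
    return "reject"   # assumption: the draft does not specify this case
```

For example, `vxlan_header(0)` yields the header for the VNI 0 aggregate session mentioned in the draft, and a frame whose inner destination MAC is the local VTEP MAC, with TTL 1 and destination port 3784, is classified as `"bfd"`.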