< draft-ietf-bier-path-mtu-discovery-11.txt   draft-ietf-bier-path-mtu-discovery-12.txt >
BIER Working Group G. Mirsky BIER Working Group G. Mirsky
Internet-Draft Ericsson Internet-Draft Ericsson
Intended status: Standards Track T. Przygienda Intended status: Standards Track T. Przygienda
Expires: 7 April 2022 Juniper Networks Expires: 12 April 2022 Juniper Networks
A. Dolganow A. Dolganow
Individual contributor Individual contributor
4 October 2021 9 October 2021
Path Maximum Transmission Unit Discovery (PMTUD) for Bit Index Explicit Path Maximum Transmission Unit Discovery (PMTUD) for Bit Index Explicit
Replication (BIER) Layer Replication (BIER) Layer
draft-ietf-bier-path-mtu-discovery-11 draft-ietf-bier-path-mtu-discovery-12
Abstract Abstract
This document describes Path Maximum Transmission Unit Discovery This document describes Path Maximum Transmission Unit Discovery
(PMTUD) in Bit Indexed Explicit Replication (BIER) layer. (PMTUD) in Bit Indexed Explicit Replication (BIER) layer.
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
skipping to change at page 1, line 35 skipping to change at page 1, line 35
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on 7 April 2022. This Internet-Draft will expire on 12 April 2022.
Copyright Notice Copyright Notice
Copyright (c) 2021 IETF Trust and the persons identified as the Copyright (c) 2021 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/ Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document. license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. Code Components and restrictions with respect to this document. Code Components
extracted from this document must include Simplified BSD License text extracted from this document must include Simplified BSD License text
as described in Section 4.e of the Trust Legal Provisions and are as described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Simplified BSD License. provided without warranty as described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1. Conventions used in this document . . . . . . . . . . . . 3 1.1. Conventions used in this document . . . . . . . . . . . . 2
1.1.1. Acronyms . . . . . . . . . . . . . . . . . . . . . . 3 1.1.1. Terminology . . . . . . . . . . . . . . . . . . . . . 2
1.1.2. Requirements Language . . . . . . . . . . . . . . . . 3 1.1.2. Requirements Language . . . . . . . . . . . . . . . . 3
2. Problem Statement . . . . . . . . . . . . . . . . . . . . . . 3 2. Problem Statement . . . . . . . . . . . . . . . . . . . . . . 3
3. PMTUD Mechanism for BIER . . . . . . . . . . . . . . . . . . 4 3. PMTUD Mechanism for BIER . . . . . . . . . . . . . . . . . . 4
3.1. Data TLV for BIER Ping . . . . . . . . . . . . . . . . . 6 3.1. Data TLV for BIER Ping . . . . . . . . . . . . . . . . . 6
4. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 7 4. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 6
5. Security Considerations . . . . . . . . . . . . . . . . . . . 7 5. Security Considerations . . . . . . . . . . . . . . . . . . . 7
6. Acknowledgment . . . . . . . . . . . . . . . . . . . . . . . 7 6. Acknowledgment . . . . . . . . . . . . . . . . . . . . . . . 7
7. References . . . . . . . . . . . . . . . . . . . . . . . . . 7 7. References . . . . . . . . . . . . . . . . . . . . . . . . . 7
7.1. Normative References . . . . . . . . . . . . . . . . . . 7 7.1. Normative References . . . . . . . . . . . . . . . . . . 7
7.2. Informative References . . . . . . . . . . . . . . . . . 8 7.2. Informative References . . . . . . . . . . . . . . . . . 7
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 8 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 8
1. Introduction 1. Introduction
In packet switched networks, when a host seeks to transmit data to a In packet switched networks, when a host seeks to transmit data to a
target destination, the data is transmitted as a set of packets. In target destination, the data is transmitted as a set of packets. In
many cases, it is more efficient to use the largest size packets that many cases, it is more efficient to use the largest size packets that
are less than or equal to the least Maximum Transmission Unit (MTU) are less than or equal to the least Maximum Transmission Unit (MTU)
for any forwarding device along the routed path to the IP destination for any forwarding device along the routed path to the IP destination
for these packets. Such "least MTU" is known as Path MTU (PMTU). for these packets. Such "least MTU" is known as Path MTU (PMTU).
Fragmentation or packet drop, silent or not, may occur on hops along Fragmentation or packet drop, silent or not, may occur on hops along
the route where an MTU is smaller than the size of the datagram. To the route where an MTU is smaller than the size of the datagram. To
avoid any of the listed above behaviors, the packet source must find avoid any of the listed above behaviors, the packet source must find
the value of the least MTU, i.e., PMTU, that will be encountered the value of the least MTU, i.e., PMTU, that will be encountered
along the route that a set of packets will follow to reach the given along the route that a set of packets will follow to reach the given
set of destinations. Such MTU determination along a specific path is set of destinations. Such MTU determination along a specific path is
referred to as path MTU discovery (PMTUD). referred to as path MTU discovery (PMTUD).
[RFC8279] introduces and explains Bit Index Explicit Replication [RFC8279] introduces and explains Bit Index Explicit Replication
(BIER) architecture and how it supports the forwarding of multicast (BIER) architecture and how it supports the forwarding of multicast
data packets. A BIER domain consists of Bit-Forwarding Routers data packets. [I-D.ietf-bier-ping] introduced BIER Ping as a
(BFRs) that are uniquely identified by their respective BFR-ids. An transport-independent OAM mechanism to detect and localize failures
ingress border router (acting as a Bit Forwarding Ingress Router in the BIER data plane. This document specifies how BIER Ping can be
(BFIR)) inserts a Forwarding Bit Mask (F-BM) into a packet. Each used to perform efficient PMTUD in the BIER domain.
targeted egress node (referred to as a Bit Forwarding Egress Router
(BFER)) is represented by Bit Mask Position (BMP) in the BMS. A
transit or intermediate BIER node, referred to as BFR, forwards BIER
encapsulated packets to BFERs, identified by respective BMPs,
according to a Bit Index Forwarding Table (BIFT).
1.1. Conventions used in this document 1.1. Conventions used in this document
1.1.1. Acronyms 1.1.1. Terminology
BFR: Bit-Forwarding Router
BFER: Bit-Forwarding Egress Router
BFIR: Bit-Forwarding Ingress Router
BIER: Bit Index Explicit Replication
BIFT: Bit Index Forwarding Tree
F-BM: Forwarding Bit Mask
MTU: Maximum Transmission Unit
OAM: Operations, Administration and Maintenance
PMTUD: Path MTU Discovery This document uses terminology defined in [RFC8279]. Familiarity
with this specification and the terminology used is expected.
1.1.2. Requirements Language 1.1.2. Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in BCP "OPTIONAL" in this document are to be interpreted as described in BCP
14 [RFC2119] [RFC8174] when, and only when, they appear in all 14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here. capitals, as shown here.
2. Problem Statement 2. Problem Statement
skipping to change at page 4, line 5 skipping to change at page 3, line 30
primarily targeted to work on point-to-point, i.e. unicast paths. primarily targeted to work on point-to-point, i.e. unicast paths.
These mechanisms use packet fragmentation control by disabling These mechanisms use packet fragmentation control by disabling
fragmentation of the probe packet. As a result, a transient node fragmentation of the probe packet. As a result, a transient node
that cannot forward a probe packet that is bigger than its link MTU that cannot forward a probe packet that is bigger than its link MTU
sends to the packet source an error notification, otherwise the sends to the packet source an error notification, otherwise the
packet destination may respond with a positive acknowledgment. Thus, packet destination may respond with a positive acknowledgment. Thus,
possibly through a series of iterations, varying the size of the possibly through a series of iterations, varying the size of the
probe packet, the packet source discovers the PMTU of the particular probe packet, the packet source discovers the PMTU of the particular
path. path.
Thus applied such existing PMTUD solutions are inefficient for point- Applying such existing PMTUD solutions are inefficient for point-to-
to-multipoint paths constructed for multicast traffic. Probe packets multipoint paths constructed for multicast traffic. Probe packets
must be flooded through the whole set of multicast distribution paths must be flooded through the whole set of multicast distribution paths
over and over again until the very last egress responds with a over and over again until the very last egress responds with a
positive acknowledgment. Consider without loss of generality an positive acknowledgment. Consider the multicast network presented in
example multicast network presented in Figure 1, where MTU on all Figure 1, where MTU on all links but one (B, D) is the same. If MTU
links but one (B, D) is the same. If MTU on the link (B, D) is on the link (B, D) is smaller than the MTU on the other links, using
smaller than the MTU on the other links, using existing PMTUD existing PMTUD mechanism probes will unnecessarily flood to leaf
mechanism probes will unnecessary flood to leaf nodes E, F, and G for nodes E, F, and G for the second and consecutive times and positive
the second and consecutive times and positive responses will be responses will be generated and received by root A repeatedly.
generated and received by root A repeatedly.
----- -----
--| D | --| D |
----- / ----- ----- / -----
--| B |-- --| B |--
/ ----- \ ----- / ----- \ -----
/ --| E | / --| E |
----- / ----- ----- / -----
| A |--- ----- | A |--- -----
----- \ --| F | ----- \ --| F |
skipping to change at page 5, line 5 skipping to change at page 4, line 40
to forward towards the subset of targeted downstream BFERs, the BFR to forward towards the subset of targeted downstream BFERs, the BFR
responds with a partial (compared to the one it received in the responds with a partial (compared to the one it received in the
request) bitmask towards the originating BFIR in error notification. request) bitmask towards the originating BFIR in error notification.
That allows for retransmission of the next probe with a smaller MTU That allows for retransmission of the next probe with a smaller MTU
address only towards the failed downstream BFERs instead of all BFERs address only towards the failed downstream BFERs instead of all BFERs
addressed in the previous probe. In the scenario discussed in addressed in the previous probe. In the scenario discussed in
Section 2 the second and all following (if needed) probes will be Section 2 the second and all following (if needed) probes will be
sent only to the node D since MTU discovery of E, F, and G has been sent only to the node D since MTU discovery of E, F, and G has been
completed already by the first probe successfully. completed already by the first probe successfully.
[I-D.ietf-bier-ping] introduced BIER Ping as a transport-independent
OAM mechanism to detect and localize failures in the BIER data plane.
This document specifies how BIER Ping can be used to perform
efficient PMTUD in the BIER domain.
Consider the network displayed in Figure 1 to be a presentation of a Consider the network displayed in Figure 1 to be a presentation of a
BIER domain and all nodes to be BFRs. To discover MTU over BIER BIER domain and all nodes to be BFRs. To discover MTU over BIER
domain to BFERs D, F, E, and G BFIR A will use BIER Ping with Data domain to BFERs D, F, E, and G BFIR A will use BIER Ping with Data
TLV, defined in Section 3.1. Size of the first probe set to M_max TLV, defined in Section 3.1. Size of the first probe set to M_max
determined as minimal MTU value of BFIR's links to BIER domain. As determined as minimal MTU value of BFIR's links to BIER domain. As
has been assumed in Section 2, MTUs of all links but the link (B, D) has been assumed in Section 2, MTUs of all links but the link (B, D)
are the same. Thus BFERs E, F, and G would receive BIER Echo Request are the same. Thus BFERs E, F, and G would receive BIER Echo Request
and will send their respective replies to BFIR A. BFR B may pass the and will send their respective replies to BFIR A. BFR B may pass the
packet which is too large to forward over egress link (B, D) to the packet which is too large to forward over egress link (B, D) to the
appropriate network layer for error processing where it would be appropriate network layer for error processing where it would be
recognized as a BIER Echo Request packet. BFR B MUST send BIER Echo recognized as a BIER Echo Request packet. BFR B MUST send BIER Echo
Reply to BFIR A and MUST include Downstream Mapping TLV, defined in Reply to BFIR A and MUST include Downstream Mapping TLV, defined in
[I-D.ietf-bier-ping] setting its fields in the following fashion: [I-D.ietf-bier-ping] setting its fields in the following fashion:
* MTU SHOULD be set to the minimal MTU value among all egress BIER * MTU SHOULD be set to the minimal MTU value among all egress BIER
links, logical links between this and downstream BFRs, that could links, logical links between this and downstream BFRs, that could
be used to reach B's downstream BFERs; be used to reach B's downstream BFERs;
* Address Type MUST be set to 0 [Ed.note: we need to define 0 as * Address Type MAY be set to any value defined in Section 3.3.4
valid value for the Address Type field with the specific semantics [I-D.ietf-bier-ping].
to "Ignore" it.]
* I flag MUST be cleared; * I flag MUST be cleared to direct the responding BFR not to include
the Incoming SI-BitString TLV in the BIER Echo Response.
* Downstream Interface Address field (4 octets) MUST be zeroed and * Downstream Interface Address field MUST be zeroed.
MUST include in the Egress Bitstring sub-TLV the list of all BFERs
that cannot be reached because the attempted MTU turned out to be * List of Sub-TLVs MUST include the Egress Bitstring sub-TLV with
too small. the list of all BFERs that cannot be reached because the egress
MTU turned out to be too small.
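The field settings listed above can be sketched as follows. This is a minimal illustration, not the encoding defined in [I-D.ietf-bier-ping]: the class DownstreamMappingSketch, the function build_negative_reply, the per-link input dictionary, and the 4-octet width of the zeroed address are all assumptions made for the example. The Address Type field is omitted because the -12 text allows any defined value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DownstreamMappingSketch:
    """Hypothetical in-memory view of the fields discussed in the text."""
    mtu: int
    i_flag: bool
    downstream_interface_address: bytes
    egress_bitstring: frozenset


def build_negative_reply(probe_size, egress_links):
    """Build the reply fields for a probe that is too large to forward.

    egress_links maps a link name to (link MTU, set of downstream BFERs
    reached over that link). This input shape is an assumption.
    """
    # MTU: minimal MTU among the egress BIER links toward downstream BFERs.
    mtu = min(link_mtu for link_mtu, _ in egress_links.values())
    # Egress Bitstring: every BFER behind a link too small for the probe.
    unreached = set()
    for link_mtu, bfers in egress_links.values():
        if link_mtu < probe_size:
            unreached |= bfers
    return DownstreamMappingSketch(
        mtu=mtu,
        i_flag=False,                            # I flag cleared
        downstream_interface_address=bytes(4),   # zeroed (width assumed)
        egress_bitstring=frozenset(unreached),
    )


# BFR B from Figure 1, with example MTU values chosen for illustration.
reply = build_negative_reply(
    1500,
    {"B-D": (1400, {"D"}), "B-E": (1500, {"E"})},
)
```

In this example only D ends up in the Egress Bitstring, and the reported MTU is the smaller value configured on link (B, D).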
The BFIR will receive either of the two types of packets: The BFIR will receive either of the two types of packets:
* a positive Echo Reply from one of BFERs to which the probe has * a positive Echo Reply from one of BFERs to which the probe has
been sent. In this case, the bit corresponding to the BFER MUST been sent. In this case, the bit corresponding to the BFER MUST
be cleared from the BMS; be cleared from the bitmask string (BMS);
* a negative Echo Reply with bit string listing unreached BFERs and * a negative Echo Reply with bit string listing unreached BFERs and
recommended MTU value MTU'. The BFIR MUST add the bit string to recommended MTU value MTU'. The BFIR MUST add the bit string to
its BMS and set the size of the next probe as min(MTU, MTU') its BMS and set the size of the next probe as min(MTU, MTU')
If upon expiration of the Echo Request timer BFIR didn't receive any If a negative Echo Reply is received, the BFIR MUST wait for the
Echo Replies, then the size of the probe SHOULD be decreased. There expiration of the Echo Request before transmitting the updated Echo
are scenarios when an implementation of the PMTUD would not decrease Request. If upon expiration of the Echo Request timer BFIR didn't
the size of the probe. For example, suppose upon expiration of the receive any Echo Replies, then the size of the probe SHOULD be
Echo Request timer BFIR didn't receive any Echo Reply. In that case, decreased. There are scenarios when an implementation of the PMTUD
BFIR MAY continue to retransmit the probe using the initial size and would not decrease the size of the probe. For example, suppose upon
MAY apply probe delay retransmission procedures. The algorithm used expiration of the Echo Request timer BFIR didn't receive any Echo
to delay retransmission procedures on BFIR is outside the scope of Reply. In that case, BFIR MAY continue to retransmit the probe using
this specification. The BFIR sends probes using BMS and locally the initial size and MAY apply probe delay retransmission procedures.
defined retransmission procedures until either the bit string is The algorithm used to delay retransmission procedures on BFIR is
clear, i.e., contains no set bits, or until the BFIR retransmission outside the scope of this specification. The BFIR sends probes using
procedure terminates and PMTU discovery is declared unsuccessful. In BMS and locally defined retransmission procedures, but not more
the case of convergence of the procedure, the size of the last probe frequently than after the Echo Request timer expired, until either
indicates the PMTU size that can be used for all BFERs in the initial the bit string is clear, i.e., contains no set bits, or until the
BMS without incurring fragmentation. BFIR retransmission procedure terminates and PMTU discovery is
declared unsuccessful. In the case of convergence of the procedure,
the size of the last probe indicates the PMTU size that can be used
for all BFERs in the initial BMS without incurring fragmentation.
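The BFIR procedure described above can be sketched as a small simulation. It is a simplified model under stated assumptions, not the specified behavior: the names discover_pmtu and send_probe, the max_rounds limit, and the per-BFER path_mtu table are hypothetical. The table stands in for the network and is consulted only by the simulated replies. The Echo Request timer and retransmission delays are not modeled; each loop iteration represents one probe and the replies collected before the timer expires.

```python
def send_probe(bms, size, path_mtu):
    """Simulate the network: return positive and negative Echo Replies.

    positive: set of BFERs that received the probe.
    negative: list of (unreached BFERs, recommended MTU') pairs.
    """
    positive = {b for b in bms if path_mtu[b] >= size}
    by_mtu = {}
    for b in bms - positive:
        by_mtu.setdefault(path_mtu[b], set()).add(b)
    negative = [(bfers, mtu) for mtu, bfers in by_mtu.items()]
    return positive, negative


def discover_pmtu(targets, path_mtu, initial_size, max_rounds=16):
    """Return (PMTU or None, list of probes sent as (size, BMS))."""
    bms = set(targets)          # bits of BFERs still to be confirmed
    size = initial_size         # M_max in the text
    probes = []
    rounds = 0
    while bms and rounds < max_rounds:
        probes.append((size, frozenset(bms)))
        positive, negative = send_probe(bms, size, path_mtu)
        bms -= positive                     # clear bits of BFERs that replied
        for unreached, mtu_hint in negative:
            bms |= unreached                # add the reported bit string
            size = min(size, mtu_hint)      # next probe size: min(MTU, MTU')
        rounds += 1
    # Converged only if no bits remain set in the BMS.
    return (size if not bms else None), probes


# Scenario from Section 2: only the path to D has a smaller MTU.
# The numeric MTU values are illustrative.
path_mtu = {"D": 1400, "E": 1500, "F": 1500, "G": 1500}
pmtu, probes = discover_pmtu(path_mtu.keys(), path_mtu, 1500)
```

With these inputs the first probe is addressed to D, E, F, and G, and the second probe is addressed to D only, which matches the behavior described for the scenario in Section 2.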
Thus we conclude that in order to comply with the requirement in Thus we conclude that in order to comply with the requirement in
[I-D.ietf-bier-oam-requirements]: [I-D.ietf-bier-oam-requirements]:
* a BFR SHOULD support PMTUD; * a BFR SHOULD support PMTUD;
* a BFR MAY use defined per BIER sub-domain MTU value as initial MTU * a BFR MAY use defined per BIER sub-domain MTU value as initial MTU
value for discovery or use it as MTU for this BIER sub-domain to value for discovery or use it as MTU for this BIER sub-domain to
reach BFERs; reach BFERs;
 End of changes. 18 change blocks. 
73 lines changed or deleted 51 lines changed or added
