[Int-area] Comments for draft-ietf-intarea-tunnels-02

Vincent Roca <vincent.roca@inria.fr> Wed, 08 June 2016 14:44 UTC

From: Vincent Roca <vincent.roca@inria.fr>
Date: Wed, 08 Jun 2016 16:44:44 +0200
Message-Id: <DC49264E-9590-4B63-B1B6-E87F486114C2@inria.fr>
To: Joe Touch <touch@isi.edu>, townsley@cisco.com
Archived-At: <https://mailarchive.ietf.org/arch/msg/int-area/aDw2dV3m8rjjG_OtV0rNEhw6tg4>
Cc: int-area@ietf.org
Subject: [Int-area] Comments for draft-ietf-intarea-tunnels-02
Precedence: list

Hello everybody,

First of all, I must say that I found the draft-ietf-intarea-tunnels-02 I-D extremely useful
and wish I had found it earlier.

We exchanged a few private emails with Joe last week and I promised to send detailed
comments on the list. Here they are... Sorry if this is a bit long.

** Section 2.2. Terminology, Link definition: it is said:
"Link: a communication device that transfers messages between network devices..."
The notion of network device is undefined. I guess you mean "network nodes" defined above.

** In Ingress and Egress definitions: it is said:
"Ingress: a network node that..."
The Ingress (resp. Egress) is not a network node. Section 3.4 has a better definition
IMHO. So:

OLD:
o Ingress: a network node that receives messages, encapsulates them
according to the tunnel protocol, and transmits them into the
tunnel. Note that the ingress and source can be co-located.

NEW:
o Ingress: a tunnel entry endpoint at the network node that the tunnel
interconnects. It is typically described as a "network interface".
The Ingress receives messages, encapsulates them according to the tunnel
protocol, and transmits them into the tunnel. Note that the ingress and
source can be co-located.

(similar changes for the Egress definition)

** In Egress definition: "The egress decapsulates datagrams..."
Maybe "IP datagrams" to be coherent with other uses of this term.

** I suggest adding the min LinkMTU values as they are considered throughout this I-D.
I also noticed that "LinkMTU" is used in the I-D to denote this value, not "LMTU".

OLD:
o Link MTU (LMTU): the largest message that can transit a link. Note
that this need not be the native size of messages on the link.

NEW:
o Link MTU (LinkMTU): the largest message that can transit a link. LinkMTU is
at minimum equal to 68 octets with IPv4 [RFC791] or 1280 octets with IPv6 [RFC2460].
Note that this need not be the native size of messages on the link.

** In RMTU definition:
"receiver" is ambiguous. I understand that definitions can be generic, however
throughout this I-D we only consider the Egress Reassembly MTU so it's worth
specializing.
I also clarified the case of encapsulation header size that is deducted.

OLD:
o Reassembly MTU (RMTU): the largest message that can be reassembled
by a receiver, and is not directly related to the link or path
MTU. Sometimes also referred to as "receiver MTU".

NEW:
o Reassembly MTU: the largest message that can be reassembled
by a receiver, i.e., either the Egress or the Destination. It is not
directly related to the link or path MTU. Sometimes also referred to
as "receiver MTU".

o Egress Reassembly MTU (EgressRMTU): the largest message that can be reassembled
by the Egress. The minimum EgressRMTU value is 576 octets with IPv4 [RFC791]
and 1500 octets with IPv6 [RFC2460] minus the encapsulation header size.

** Path MTU definition: I think it's worth clarifying that we focus on the tunnel PMTU in this I-D.
I also explain that the encapsulation header size has already been subtracted from the
Tunnel PMTU.

OLD:
o Path MTU (PMTU): the largest message that can transit a path.
Typically, this is the minimum of the link MTUs of the links of
the path.

NEW:
o Path MTU: the largest message that can transit a path.
Typically, this is the minimum of the link MTUs of the links of
the path.

o Tunnel Path MTU (TunnelPMTU): the largest Tunnel Transit Packet (or fragment)
that can transit inside the tunnel's path. The encapsulation header size is already
subtracted from TunnelPMTU and this TunnelPMTU is not visible from outside the
tunnel, unlike the Tunnel MTU.

** In tunnel MTU definition:
Say explicitly that the encapsulation header size has already been subtracted from
the Tunnel MTU, since encapsulation headers are considered as "headers from lower
protocols" and therefore excluded.

OLD:
o Tunnel MTU (TMTU): the largest message that can transit a tunnel.
Typically, this is limited by the egress reassembly MTU.

NEW:
o Tunnel MTU (TunnelMTU): the largest message that can transit a tunnel.
The encapsulation header size is already subtracted from TunnelMTU.
Typically, the TunnelMTU is limited by the EgressRMTU.

** I think a figure like this one (70 characters width ;-) can be useful. At least it
helped me a lot!

Tunnel Transit
Packet (TTP) or                 Tunnel Link Packet
"tunneled pkt"  +------------+       (TLP) or      +------------+
--------------->|Network Node|   "tunnel packet"   |Network Node| --->
                |    +-------|  ________________   |-------+    |
                |    |Ingress|-|________________|->| Egress|    |
                +----+-------+                     +-------+----+
                  Tunnel MTU     Tunnel Path MTU  Egress Reassembly
                 or Link MTU                             MTU
with:
    Link MTU == Tunnel MTU == Egress RMTU
    (the maximum encapsulation header size is already deducted)

** In the doc there is a misspelling, the correct acronym is PLPMTUD (two P's),
not PLMTUD (5 occurrences).

** Section 3.1: this is a detail. It is said:
"... the tunnel serves as a link to the devices it connects (here, Ra and Rb)."
Instead of "device" that is not defined in section 2.2, I think "network nodes" would
be more appropriate.

** Section 4.1, Fragmentation:
I suggest changing the algorithms along with their introductory text as follows:

// VR: added a reminder that LinkMTU == TunnelMTU
These rules apply at the host/router where the tunnel is attached (recall that the
Link MTU is equal to the Tunnel MTU):

// VR: I've changed the test order to follow the same logic in the two algorithms
// VR: TTPsize is the official name, so I've removed TTP and sizeof(TTP)
if (TTPsize <= linkMTU) then
    send TTP into the tunnel "interface" (i.e., ingress)
else
    if (TTP can be fragmented, e.g., IPv4 DF=0) then
        // VR: as we test against linkMTU, it's better to mention linkMTU below,
        // not TunMTU even if both are equal
        split TTP into fragments of linkMTU size
        and send each fragment into the tunnel ingress
    else
        // VR: it's important to detail how the MTU field of the ICMP PTB is initialized
        drop TTP and send ICMP "too big" to TTP source that
        advertises Next-Hop MTU = linkMTU
    endif
endif
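
To make the first set of rules concrete, here is a minimal Python sketch of the
decision taken at the host/router. It is only an illustration of the pseudocode
above, not text for the I-D: the names Decision and interface_decision are
hypothetical, and the sketch only returns which branch is taken (plus the MTU
that the ICMP "too big" message would advertise); it does not build packets or
fragments.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Decision:
    """Outcome of the host/router test (hypothetical helper type)."""
    action: str                              # "send", "fragment" or "drop"
    ptb_next_hop_mtu: Optional[int] = None   # set only when action == "drop"


def interface_decision(ttp_size: int, link_mtu: int, may_fragment: bool) -> Decision:
    """Apply the host/router rules to a TTP of ttp_size bytes.

    link_mtu is the Link MTU of the tunnel "interface" (equal to the
    Tunnel MTU); may_fragment is True when the TTP can be fragmented,
    e.g., IPv4 with DF=0.
    """
    if ttp_size <= link_mtu:
        # Fits as is: hand the TTP to the tunnel ingress.
        return Decision("send")
    if may_fragment:
        # Too big but fragmentable: split into fragments of at most link_mtu.
        return Decision("fragment")
    # Too big and not fragmentable: drop and advertise the Link MTU.
    return Decision("drop", ptb_next_hop_mtu=link_mtu)


if __name__ == "__main__":
    print(interface_decision(1400, 1500, may_fragment=False))
    print(interface_decision(2000, 1500, may_fragment=True))
    print(interface_decision(2000, 1500, may_fragment=False))
```

The point of returning ptb_next_hop_mtu explicitly is the one made in the
comment on the last branch: the value placed in the ICMP PTB is the Link MTU
that was just tested against.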

// VR: added a reminder that TunnelMTU == EgressRMTU
These rules apply at the tunnel ingress of the host/router where the
tunnel is attached (recall that the Tunnel MTU is equal to the Egress
Reassembly MTU):

// VR: as said above, TTPsize, not sizeof() operator
if (TTPsize <= TunnelPathMTU) then
    encapsulate TTP as received and emit
else
    // VR: this test is sufficient and IMHO more readable
    // VR: I prefer to use TunnelMTU since we are at the ingress
    if (TTPsize <= TunnelMTU) then
        // VR: there was a mistake below that mentioned TunnelMTU (!) chunks.
        fragment TTP into chunks of size TunnelPathMTU
        // VR: there was a mistake below that mentioned TTP instead of TTP chunks
        encapsulate and emit each TTP chunk
    else
        {never happens; host/router already dropped by now}
    endif
endif

NB: I personally prefer:
    if
    else if
    else
    endif
to nested if's, but this is not a big deal.
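
For illustration, the ingress rules can be written in that flat form. This is a
minimal Python sketch under my own naming (ingress_decision and the returned
strings are hypothetical); it only reports which branch applies. Both MTU values
are assumed to have the encapsulation header size already deducted, as in the
definitions proposed earlier.

```python
def ingress_decision(ttp_size: int, tunnel_path_mtu: int, tunnel_mtu: int) -> str:
    """Apply the tunnel ingress rules to a TTP of ttp_size bytes.

    tunnel_path_mtu and tunnel_mtu are assumed to be expressed with the
    encapsulation header size already deducted, so they can be compared
    directly with the size of the TTP before encapsulation.
    """
    if ttp_size <= tunnel_path_mtu:
        # Fits in the tunnel path: encapsulate the TTP as received and emit.
        return "encapsulate"
    elif ttp_size <= tunnel_mtu:
        # Too big for the path, but the egress can reassemble it:
        # fragment into chunks of size tunnel_path_mtu, then
        # encapsulate and emit each chunk.
        return "fragment-then-encapsulate"
    else:
        # Never happens if the host/router rules ran first: such a TTP
        # was already fragmented or dropped before reaching the ingress.
        raise ValueError("TTP larger than the Tunnel MTU reached the ingress")


if __name__ == "__main__":
    print(ingress_decision(1000, tunnel_path_mtu=1240, tunnel_mtu=1460))
    print(ingress_decision(1300, tunnel_path_mtu=1240, tunnel_mtu=1460))
```

The last branch raises an error instead of silently doing nothing, which makes
the "never happens" assumption visible if it is ever violated.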

** Same section 4.1
The 3rd algorithm should be better introduced. For instance it is not clear
that this is just an example for algorithm 2, using minimum values.
Also it is not clear reading this paragraph if:
option 1: the tunnel Path MTU must be >= (1280 - 40 - TOptSz), or
option 2: the tunnel Path MTU must be >= 1280, to which one needs to
subtract (40 + TOptSz) because of encapsulation
This is option 1 if we compare with the previous algorithm, but it's far from obvious...
It comes from the fact we compare with the TTP size before encapsulation. If
we compare with the size of encapsulated TTP, it would be different.

Globally, when to count or ignore encapsulation headers in MTUs is extremely
subtle and error prone. This I-D must avoid any ambiguity of this kind.

Here is a proposal:

OLD:
For IPv4 or IPv6 over IPv6, the tunnel path MTU is a minimum of 1280
minus the encapsulation header (40 bytes) with its options (TOptSz)
and the egress reassembly MTU is 1500 minus the same amount:

NEW:
As an example let us consider IPv4 over IPv6, or IPv6 over IPv6 tunneling,
where IPv6 encapsulation adds a 40 byte fixed header plus options (i.e.,
header extensions) of total size TOptSz. From [RFC2460] it follows that the
Tunnel MTU must be at least 1280 bytes and the Egress Reassembly MTU must
be at least 1500 - (40 + TOptSz) bytes. The Tunnel Path MTU must be a minimum
of 1280 - (40 + TOptSz) bytes. Considering these minimum values, the previous
algorithm becomes:

** Still section 4.1: we cannot say that 1500 is the minimum EgressRMTU, as
we first need to remove the encapsulation header size, as detailed above.
I also tried to explain why... Not easy! Do you have a better explanation?

NEW:
When using IP directly over IP, the minimum Egress Reassembly MTU
equals (576 - encapsulation header size) bytes for IPv4 [RFC791] and (1500 -
encapsulation header size) bytes for IPv6 [RFC2460].
Note that the encapsulation header size must be deducted from the 576 (resp. 1500)
value because [RFC791] (resp. [RFC2460]) requires the destination (here the egress)
to "be able to accept a fragmented packet that, after reassembly, is as large as 576
(resp. 1500) octets". In our case the reassembled packet corresponds to the encapsulated
TTP (i.e., packets (a) and (c) of Fig. 10) that was fragmented by the Ingress.
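
As a worked example of this arithmetic for encapsulation in IPv6, here is a
small Python sketch. It follows the reading proposed above (the encapsulation
header size is deducted from the RFC 2460 minimum values); the function name
ipv6_encap_minimums and the dictionary keys are hypothetical, and t_opt_sz
stands for TOptSz.

```python
IPV6_FIXED_HEADER = 40       # bytes, fixed IPv6 header
IPV6_MIN_LINK_MTU = 1280     # bytes, minimum IPv6 link MTU [RFC2460]
IPV6_MIN_REASSEMBLY = 1500   # bytes, minimum IPv6 reassembly size [RFC2460]


def ipv6_encap_minimums(t_opt_sz: int = 0) -> dict:
    """Minimum values seen by the TTP, i.e., before encapsulation.

    t_opt_sz is the total size of the IPv6 options (extension headers)
    added by the encapsulation.
    """
    overhead = IPV6_FIXED_HEADER + t_opt_sz
    return {
        "tunnel_path_mtu": IPV6_MIN_LINK_MTU - overhead,
        "egress_rmtu": IPV6_MIN_REASSEMBLY - overhead,
    }


if __name__ == "__main__":
    print(ipv6_encap_minimums())    # {'tunnel_path_mtu': 1240, 'egress_rmtu': 1460}
    print(ipv6_encap_minimums(8))   # {'tunnel_path_mtu': 1232, 'egress_rmtu': 1452}
```

With no options, a TTP of up to 1240 bytes is encapsulated without
fragmentation, and a TTP of up to 1460 bytes can still be carried after
fragmentation by the Ingress and reassembly by the Egress.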

** Section 5.2: I fully agree with:
"Detect when the egress MTU drops below the required minimum and shut
down the tunnel if that happens - configuring the tunnel down and
issuing a hard error may be the only way to detect this anomaly, and
it's sufficiently important that the tunnel SHOULD be disabled."

However, this is not only an implementation aspect. It is also required for security
reasons and therefore it should be discussed in the Security Considerations section
as well. E.g., we observed the opposite behavior from an off-the-shelf IPsec
implementation and discussed the consequences in our (now expired)
draft-roca-ipsecme-ptb-pts-attack-00.

[WG: we already discussed this aspect in private emails with Joe last week.]

** Section 5.4.4: it is said, for the "consistent with this doc" part, that:
"Shuts the tunnel down if the tunnel path MTU isn't => 1280."
This contradicts the example of section 4.1 where TunnelPMTU == 1280-40-TOptSz
is valid...
Do you mean "Shuts the tunnel down if the Tunnel MTU, seen from the outside network,
isn't >= 1280"?

** Section 6, Security considerations
As already discussed privately with Joe, I may have other items but still need
to think it over.

** Section A.1: the following sentence is ambiguous:
OLD:
"When the encapsulated packet exceeds the MTU of the tunnel, the
packet needs to be fragmented."
What is meant here is the Tunnel Path MTU, not the "MTU of the tunnel", which can be
understood as synonymous with the Tunnel MTU.

NEW:
"When the packet (iH, iD) size exceeds the Tunnel Path MTU, the encapsulated packet
needs to be fragmented."

** Section A.1/ A.2:
Discussion in section 4.1 assumes an Outer Fragmentation scheme. The notion of
Egress Reassembly also assumes an Outer Fragmentation. In fact it seems to be
the model assumed throughout this I-D but this is not clearly stated unless I missed
something. Am I right?

It's clearly more universal as it does not make any assumption about the inner packet
(can it be fragmented on path or not, as explained in A.2).
I think the I-D should identify from the beginning the assumption made (Outer vs.
Inner fragmentation) as it will impact the discussion. Said differently, move Appendix A
into Section 3, for instance.

Cheers,

Vincent
