Re: Éric Vyncke's No Objection on draft-ietf-6man-mtu-option-13: (with COMMENT)

Gorry Fairhurst <gorry@erg.abdn.ac.uk> Tue, 05 April 2022 15:24 UTC

Message-ID: <4a49e03b-a43f-fe78-343b-9df9af4fa425@erg.abdn.ac.uk>
Date: Tue, 05 Apr 2022 16:23:09 +0100
Subject: Re: Éric Vyncke's No Objection on draft-ietf-6man-mtu-option-13: (with COMMENT)
To: Éric Vyncke <evyncke@cisco.com>, The IESG <iesg@ietf.org>
Cc: draft-ietf-6man-mtu-option@ietf.org, 6man-chairs@ietf.org, ipv6@ietf.org, otroan@employees.org, equinox@diac24.net
References: <164907873723.15991.57242493854295767@ietfa.amsl.com>
From: Gorry Fairhurst <gorry@erg.abdn.ac.uk>
In-Reply-To: <164907873723.15991.57242493854295767@ietfa.amsl.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ipv6/9up6p3dq18be--z78IzdRR1wZYQ>


See below, marked GF+BH:

-------- Forwarded Message --------
Subject: Éric Vyncke's No Objection on draft-ietf-6man-mtu-option-13: (with COMMENT)
Resent-Date: Mon, 4 Apr 2022 06:25:37 -0700 (PDT)
Resent-From: alias-bounces@ietf.org
Resent-To: bob.hinden@gmail.com, gorry@erg.abdn.ac.uk
Date: Mon, 04 Apr 2022 06:25:37 -0700
From: Éric Vyncke via Datatracker <noreply@ietf.org>
Reply-To: Éric Vyncke <evyncke@cisco.com>
To: The IESG <iesg@ietf.org>
CC: draft-ietf-6man-mtu-option@ietf.org, 6man-chairs@ietf.org, ipv6@ietf.org, otroan@employees.org, equinox@diac24.net


Éric Vyncke has entered the following ballot position for
draft-ietf-6man-mtu-option-13: No Objection


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-6man-mtu-option/



----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

Thank you for the work put into this document. I am all in favour of
extending IPv6 with extension headers, and any system to improve path MTU
detection is helpful (esp. in data centers at the beginning).

Let me also apologise to Bob, Gorry, and the 6MAN WG as I could have sent
those comments during the WGLC...

Please find below some non-blocking COMMENT points (but replies would be
appreciated even if only for my own education), and some nits.

Special thanks to Ole Trøan for the shepherd's write-up including the
justification for the intended status as experimental. You may also expect an
INT directorate review by David Lamparter later.

I hope that this helps to improve the document,

Regards,

-éric

What about multicast traffic? Can this HbH option be used? How can several
answers be combined? As the intended status is "experimental", this is not
blocking, but it is close to a DISCUSS-level point.

GF: There would be additional considerations about scaling responses and 
how to aggregate responses, with all the usual multicast design caveats 
and the question of what is best at what scale. We expect this is best 
handled as a separate short draft, should someone find a use for a 
multicast case.  We could add text to explicitly say that we do not 
specify the multicast use in the document.

======

I think it is OK to say that an experimenter could permit this for multicast
with CAVEATs. First, each endpoint will reply, and therefore a multicast
destination should randomly delay its reply to ensure that large numbers of
replies are not received in a short interval. Second, the source needs a way
to combine multiple different replies. As in other multicast methods, this
will require a decision on whether to accept and use the lowest detected
MinMTU, or whether to then trigger a control protocol action to expel
members of the group with an unacceptably low MinMTU. These tradeoffs and
the security implications are expected to depend on the design of the
multicast transport being used, and the size of the group. They are not
discussed in this document.
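
To make the two caveats concrete, here is a minimal Python sketch. It is not from the draft: the function names, the `max_delay_s` bound, and the `floor` policy threshold are assumptions for illustration only, and a real multicast transport would choose its own timers and group policy.

```python
import random
from typing import Dict, List, Optional, Tuple


def reply_delay(max_delay_s: float, rng: random.Random) -> float:
    """Pick a uniformly random delay so replies from a group spread out."""
    return rng.uniform(0.0, max_delay_s)


def combine_replies(
    reported: Dict[str, int], floor: int
) -> Tuple[Optional[int], List[str]]:
    """Combine per-receiver MinMTU reports at the source.

    Receivers reporting less than `floor` (a hypothetical policy threshold)
    are listed as candidates for removal from the group; the lowest
    remaining value is the size the group could use.
    """
    too_low = sorted(r for r, mtu in reported.items() if mtu < floor)
    acceptable = [mtu for mtu in reported.values() if mtu >= floor]
    group_mtu = min(acceptable) if acceptable else None
    return group_mtu, too_low
```

For example, `combine_replies({"a": 9000, "b": 1500, "c": 1280}, floor=1400)` selects 1500 for the group and flags receiver "c". Whether to expel "c" or simply fall back to 1280 is exactly the transport-specific decision described above.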

I think all we should say is this draft is focused on Unicast.


## Abstract

Just wondering whether "node" should be used instead of "host" in "along the
forward path between a source host to a destination host" (also in other
places in other sections). Of course, nothing prevents a router from acting
as a host. §9 uses "node" and not "host" ;-)

GF+BH: We prefer not to make this change, because RFC8200 says:

   node    a device that implements IPv6.

   router  a node that forwards IPv6 packets not explicitly
           addressed to itself. (See Note below.)

   host    any node that is not a router. (See Note below.)

Our intent is that this be sent and received by hosts. I think it’s 
clearer to continue to use host.

## Section 1.2

Suggest removing the note about RFC 2460 as 8200 has been published for
years now.

GF+BH: We think there is some value in keeping this history.

Even if mostly obvious, please expand "HBH" at first use.
GF+BH: Will do

## Section 6.1

For a router not configured for HbH-processing: why only "SHOULD ignore"?
Either exception use case(s) should be provided or a "MUST NOT" and "MUST"
(for forward) be used.

GF+BH: The SHOULD language doesn’t work perfectly here. I guess it's really
a statement of fact: "will ignore". Not sure it is worth changing.

Why does a router "SHOULD" only update and not "MUST"? This is an
experimental document and not a proposed standard one, so there is little
reason to be ultra-cautious. If "SHOULD" is kept, then when can/should a
router deviate from the update action?

GF+BH: We are happy to say MUST, i.e., if a router implements this RFC, it
needs to perform the update.
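
As a rough illustration of the update action being discussed, the following Python sketch assumes that a router lowers the minimum PMTU value carried in the option whenever its egress link MTU is smaller, and otherwise leaves it alone. The function names and the list-of-links path model are hypothetical, not taken from the draft.

```python
from functools import reduce
from typing import Iterable


def update_min_pmtu(min_pmtu: int, egress_link_mtu: int) -> int:
    """Value a router writes back: the smaller of the carried value
    and the MTU of the link the packet is about to be sent on."""
    return min(min_pmtu, egress_link_mtu)


def min_pmtu_at_destination(source_link_mtu: int,
                            egress_mtus: Iterable[int]) -> int:
    """Apply the per-router update along a path of egress link MTUs."""
    return reduce(update_min_pmtu, egress_mtus, source_link_mtu)
```

With egress MTUs of 9000, 1500 and 4000 and a source link MTU of 9000, the destination would see 1500. A router that skips the update could leave a value that is too large, which is why the stronger requirement seems reasonable.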

Is the "Discussion" part still relevant at this stage (IESG evaluation)? Or
should it be moved to the "experiment success evaluation" part?

GF+BH: Either way for us. We will remove it.

Can the router apply sampling/rate limiting on those packets?

GF+BH: It can, if used carefully. The result is loss, which might impact
usefulness, but ought not to impact correct function. Do we need to add
something?

## Section 6.2

"This cached value can be used by other flows that share the host's 
destination cache." is hard to parse and possibly incorrect (as missing the
egress interface), suggest to use "other flows to the same destination and
same egress interface"?

===

GF+BH: Probably good to clarify. Indeed, it needs to be balanced with my
ECMP text added to 8201 that warns that paths are not just identified by
address (see below).

If it was not an experimental document, I would probably have raised a
blocking DISCUSS (sorry Bob & Gorry): usually ECMP is done on the 5-tuple,
so using different layer-4 ports could end up in slightly different paths
with different MTU (section 5.2 of RFC 8201 is a little better, recommend
referring to it? It is only referred to in § 6.3.4).

GF+BH: We agree. It's important that it is used to *initialise* a probe with
the expected PMTU size, and that the option is not blindly used to set the
PMTU, because the actual PMTU can depend on the forwarding path for the
flow, which can be influenced by port information, flow label, and other
information besides the destination address.
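
One way to picture this distinction is the Python sketch below. It is an assumption-laden illustration, not text from the draft: the class and method names are hypothetical, and keying the state by 5-tuple is just one possible choice for a host that wants to respect ECMP. The point is only that the reported size seeds the next probe, while the confirmed size changes only after a probe succeeds on that flow.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# (source address, destination address, protocol, source port, destination port)
FlowKey = Tuple[str, str, int, int, int]


@dataclass
class FlowPmtu:
    confirmed: int = 1280              # size already confirmed by probing
    next_probe: Optional[int] = None   # size to try next; not yet trusted


class PerFlowPmtuCache:
    """Hypothetical cache keyed by 5-tuple rather than by destination only."""

    def __init__(self) -> None:
        self._flows: Dict[FlowKey, FlowPmtu] = {}

    def entry(self, flow: FlowKey) -> FlowPmtu:
        return self._flows.setdefault(flow, FlowPmtu())

    def note_reported_size(self, flow: FlowKey, reported: int) -> None:
        """Use the size from the option only to seed the next probe."""
        state = self.entry(flow)
        if reported > state.confirmed:
            state.next_probe = reported

    def probe_confirmed(self, flow: FlowKey, size: int) -> None:
        """Raise the confirmed size once a probe of `size` is acknowledged."""
        state = self.entry(flow)
        state.confirmed = max(state.confirmed, size)
        if state.next_probe is not None and state.next_probe <= state.confirmed:
            state.next_probe = None
```

Two flows to the same destination that differ only in source port get separate entries here, so a size learned on one is never applied to the other without its own probe.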

Please expand "PL"

GF+BH: Will fix: PL = Packetization Layer.

"When requested to send an IPv6 packet" how ? and who request such an 
action? My major concern is whether it is a per-packet or per-"connection"
request, as using an 8-byte MTU option in a data packet actually reduces the
useful MTU by 8 bytes. A forward reference to §6.3.1 would be beneficial.

GF+BH: My take is the transport has to generate the reply.
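
For the 8-byte concern, a small worked example may help. The Python sketch below takes the 8-byte figure from the comment above and the fixed 40-byte IPv6 header from RFC 8200; the function name is hypothetical.

```python
IPV6_HEADER_BYTES = 40        # fixed IPv6 header size (RFC 8200)
HBH_WITH_OPTION_BYTES = 8     # figure quoted in the review comment above


def upper_layer_bytes(pmtu: int, with_option: bool) -> int:
    """Bytes left for the upper-layer protocol in one packet of size `pmtu`."""
    overhead = IPV6_HEADER_BYTES
    if with_option:
        overhead += HBH_WITH_OPTION_BYTES
    return pmtu - overhead
```

On a 1500-byte path this leaves 1460 bytes without the option and 1452 bytes with it, which is why it matters whether the option rides on every data packet or only on occasional probes.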

Bullet #3, it is unclear what "This" means.

GF+BH: Whoops, we wanted to remove this bullet: /3. This sends a response
probe back to source/.

===
## Section 6.3

"Using a PMTU Probe" is it the HbH option described in this document ? 
If so, then propose being clear or introducing the synonym earlier in the
text.

GF+BH: Transport guy problem, mea culpa...we should have explained "a 
PMTU Probe".

We suggest a complete replacement:

PLPMTUD uses probe packets for two distinct functions:

• Probe packets are used to confirm connectivity. Such probes can be of any
size up to the PLPMTU. These probes are sent to solicit a response using the
path to the remote node. These probes can carry this HBH option, providing
the final size of the packet does not exceed the current PLPMTU. After
validating that the packet originates from the path (section 4.6.1), the
PLPMTUD method can use the reported size from the HBH option as the next
search point when it resumes the search algorithm. (This use resembles the
use of the PTB_SIZE information in section 4.6.2 of [RFC8899].)

• A second use of probe packets is to explore if a path supports a 
packet size greater than the current PLPMTU. If this probe packet is 
successfully delivered (as determined by the source host), then the 
PLPMTU is raised to
the size of the successful probe. These probe packets do not usually set 
the HBH option. See section 1.2 of [RFC8899].

Section 4.1 of [RFC8899] also describes ways that a Probe Packet can be 
constructed, depending on whether the probe packets carry application data.

• The PMTU Probe can be sent on packets that include application data, 
but needs to be robust to potential loss of the packet (i.e. with the 
possibility that retransmission might be needed if the packet is lost).

• Using a PMTU Probe on packets that do not carry application data will 
avoid the need for loss recovery if a router on the path drops packets 
that set this option. (This avoids the transport needing to retransmit a 
lost packet that includes this option.) This is the normal default 
format for both uses of probes.
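
The "next search point" idea in the first bullet can be sketched as follows. This Python fragment is only an illustration under stated assumptions: the function name is hypothetical, the reported size is assumed to have been validated already, and the midpoint fallback is one arbitrary search strategy picked for the example, since [RFC8899] does not mandate a particular one.

```python
from typing import Optional


def next_probe_size(plpmtu: int, max_plpmtu: int,
                    hbh_reported: Optional[int] = None) -> int:
    """Choose the size of the next probe that explores above the PLPMTU.

    A validated size reported by the HBH option, if larger than the
    current PLPMTU, becomes the next search point (capped at max_plpmtu).
    Otherwise fall back to stepping halfway towards max_plpmtu.
    """
    if hbh_reported is not None and hbh_reported > plpmtu:
        return min(hbh_reported, max_plpmtu)
    step = max(1, (max_plpmtu - plpmtu) // 2)
    return min(max_plpmtu, plpmtu + step)
```

Note that the PLPMTU itself is not changed here; it is raised only when the probe of the chosen size is confirmed, as in the second bullet.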

===

## Section 6.3.2

Just wondering how different this method is wrt ICMP-based PMTUD as the
5-tuple must also be present (albeit no data).

GF+BH: Not sure what to write in the ID, but the real benefit of PLPMTUD
is that it can avoid the blackholing of data that occurs when PMTUD relies
on ICMP, without regressing to using a minPMTU.

The drawback of PLPMTUD is that it can require cycles of probing using
sacrificial packets to be sure the PMTU is not going to be blackholed, and
the larger the PMTU above the default PMTU, the larger the number of RTT
cycles, and hence the longer the time to converge on safely using a larger
PMTU. The HBH method seeks to complete in 2 RTT cycles: ask for the PMTU
across the path, then confirm that this size works.

Of course, if the discovered PMTU isn't actually supported by the path,
then the HBH information did not help, and the larger probe either
generates a PTB packet (one more RTT to converge) or is dropped. This
takes at least one more cycle of probing to deduce the PMTU. It's hard
work to robustly fix broken-ness.
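
A toy model can illustrate the difference in convergence. The Python sketch below is a simplification under several assumptions that are not from the draft: one probe per RTT cycle, no loss other than oversize drops, a plain binary search standing in for whatever search a real PLPMTUD implementation uses, and hypothetical function names throughout.

```python
from typing import Tuple


def rounds_without_hint(start: int, ceiling: int, path_pmtu: int,
                        resolution: int = 8) -> Tuple[int, int]:
    """Count probe rounds for a plain binary search (one probe per round).

    Assumes start <= path_pmtu <= ceiling; a probe succeeds only if it
    is no larger than path_pmtu. Returns (rounds, size found).
    """
    low, high, rounds = start, ceiling, 0
    while high - low > resolution:
        probe = (low + high + 1) // 2
        rounds += 1
        if probe <= path_pmtu:
            low = probe
        else:
            high = probe - 1
    return rounds, low


def rounds_with_hint(start: int, ceiling: int, path_pmtu: int, hint: int,
                     resolution: int = 8) -> Tuple[int, int]:
    """Round 1 learns the hint, round 2 probes at the hinted size.

    Assumes hint > start. If the hinted size is not supported, the
    search resumes below it and costs extra rounds.
    """
    if hint <= path_pmtu:
        return 2, hint
    extra, found = rounds_without_hint(start, min(ceiling, hint - 1),
                                       path_pmtu, resolution)
    return 2 + extra, found
```

In this model a correct hint always finishes in two rounds, while the unaided search between 1280 and 9000 needs several more. A hint that the path does not actually support falls back to the slower search, matching the "did not help" case above.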

GF+BH: Does this need a change to the text?

===

Also wondering how an upper-layer protocol (possibly QUIC in user space)
could signal to the PMTU cache (possibly in the kernel) to ignore a value.
But, hey, this is all about experimenting ;-) And § 6.3.3 goes into more
detail about where this data could be stored.

GF: That's the way (D)PLPMTUD is designed in QUIC, although I suspect many
QUIC flows simply don't update the IP layer PMTU cache at all, and simply
use their own ways to store previous results and figure out how to size
their packets. Pushing a value back to the IP layer cache may seem to have
very little benefit.

===

## Section 6.3.4

Why not a "MUST" in "A source host SHOULD ignore a Rtn-PMTU value larger
than the MTU configured for the outgoing link."?

GF+BH: MUST is good, will fix. We should have caught that.
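
The check being tightened here is simple enough to sketch. In the Python fragment below the function name is hypothetical, and the additional lower bound of 1280 bytes is my own assumption based on the IPv6 minimum link MTU in RFC 8200, not something stated in this thread.

```python
from typing import Optional

IPV6_MIN_LINK_MTU = 1280  # minimum link MTU required by RFC 8200


def accept_rtn_pmtu(rtn_pmtu: int, outgoing_link_mtu: int) -> Optional[int]:
    """Return the Rtn-PMTU value if it is plausible, or None to ignore it."""
    if rtn_pmtu > outgoing_link_mtu:
        # Larger than the source's own link MTU: cannot be a real path MTU.
        return None
    if rtn_pmtu < IPV6_MIN_LINK_MTU:
        # Assumption for this sketch: also ignore values below the IPv6 minimum.
        return None
    return rtn_pmtu
```

A returned value of 9000 on a host whose outgoing link is configured for 1500 is ignored, whereas 1500 on the same link is accepted as a candidate for probing.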

===

# NITS

## Section 1.1

In scenario 2:
s/considers the link to the destination host/considers the link between R2 and the destination host/ ?

## Section 2

s/977K packets per second (pps)/977,000 packets per second (or 977 kpps)/

## Section 6.3.4

s/layer 2 device/layer-2 device/

## Section 8.1

s/Hop by Hop/Hop-by-Hop/

GF+BH: All good, will fix.
===

If these proposed changes seem OK (with any comments), please let us 
know if you'd like us to prepare a new revision or wait to collect 
further feedback from the review process.

Best wishes,

Bob & Gorry
(editors)