Mirja Kühlewind's Discuss on draft-ietf-6man-rfc1981bis-06: (with DISCUSS and COMMENT)

Mirja Kühlewind <ietf@kuehlewind.net> Mon, 08 May 2017 19:08 UTC

From: Mirja Kühlewind <ietf@kuehlewind.net>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-6man-rfc1981bis@ietf.org, Ole Troan <otroan@employees.org>, 6man-chairs@ietf.org, otroan@employees.org, ipv6@ietf.org
Subject: Mirja Kühlewind's Discuss on draft-ietf-6man-rfc1981bis-06: (with DISCUSS and COMMENT)
Message-ID: <149427050740.24107.6062280537375286614.idtracker@ietfa.amsl.com>
Date: Mon, 08 May 2017 12:08:27 -0700
Archived-At: <https://mailarchive.ietf.org/arch/msg/ipv6/0W5HvEkN9NdaBpKK826OG9zchoE>

Mirja Kühlewind has entered the following ballot position for
draft-ietf-6man-rfc1981bis-06: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-6man-rfc1981bis/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

I know this is a bis document and my discuss does not address any text
that changed in the bis version but given all previous discussion, I
would like to discuss the following text parts on statements regarding
retransmissions which don't seem to be appropriate for this document and
are partly even wrong.
In general it does not make a lot of sense to talk about retransmission
semantics in this document because this really depends on the upper layer
protocol, and I'm really not sure if any implementation of a reliable
transport sends retransmissions based on the reception of PTB
messages (if exposed) rather than relying only on its own loss detection
mechanism. This discuss concerns a few sentences throughout the document
and most parts of section 5.4. Detailed proposals below:

I propose to either remove this sentence, or at least reword to the
following (or something similar):
OLD:
"Retransmission should be done for only for those packets that are
   known to be dropped, as indicated by a Packet Too Big message."
NEW:
"The IP layer may indicate loss to the upper layer protocol of those
   packets that are known to be dropped, as indicated by a Packet Too
   Big message."
Or MAY or SHOULD or MUST...?

Subsequently the following sentence should be removed as well:
"An upper layer must not retransmit data in response to an increase in
   the PMTU estimate, since this increase never comes in response to an
   indication of a dropped packet."

And here is the bigger change in section 5.4:
OLD
"When a Packet Too Big message is received, it implies that a packet
   was dropped by the node that sent the ICMPv6 message.  It is
   sufficient to treat this in the same way as any other dropped
   segment, and will be recovered by normal retransmission methods.  If
   the Path MTU Discovery process requires several steps to find the
   PMTU of the full path, this could delay the connection by many
   round-trip times.

   Alternatively, the retransmission could be done in immediate response
   to a notification that the Path MTU has changed, but only for the
   specific connection specified by the Packet Too Big message.  The
   packet size used in the retransmission should be no larger than the
   new PMTU."
NEW
"When a Packet Too Big message is received, it implies that a packet
   was dropped by the node that sent the ICMPv6 message.  A reliable 
   upper layer protocol will detect the loss of this segment, and recover
   it by its normal retransmission methods.  Depending on the loss 
   detection method that is used by the upper layer protocol, this 
   could delay the connection by many round-trip times.

   Alternatively, the retransmission could be done in immediate response
   to a notification that the Path MTU was decreased, but only for the
   specific connection specified by the Packet Too Big message.  The
   packet size used in the retransmission should be no larger than the
   new PMTU."

I don't understand the following paragraph. Can this be removed?
"Note: A packetization layer must not retransmit in response to
      every Packet Too Big message, since a burst of several oversized
      segments will give rise to several such messages and hence several
      retransmissions of the same data.  If the new estimated PMTU is
      still wrong, the process repeats, and there is an exponential
      growth in the number of superfluous segments sent."

The following text is fine but probably is not needed if the whole
document is reworded accordingly to ensure that retransmissions are
solely the responsibility of the upper layer protocol: 
     "Retransmissions can increase network load in response to
      congestion, worsening that congestion.  Any packetization layer
      that uses retransmission is responsible for congestion control of
      its retransmissions.  See [RFC8085] for more information."

This can also be removed, because a reliable protocol that detects loss
and decides to send a retransmission should and will do the same
processing as for all other retransmissions, e.g. reset the
retransmission timer in TCP. Mentioning this separately is rather
confusing.
      "This means that the TCP layer must be able to recognize when a
      Packet Too Big notification actually decreases the PMTU that it
      has already used to send a packet on the given connection, and
      should ignore any other notifications."

And this is even incorrect. Slow start means that you will increase the
congestion window exponentially. Only sending one segment means setting
the congestion/sending window to one. I propose the following change:
OLD
   "Many TCP implementations incorporate "congestion avoidance" and
   "slow-start" algorithms to improve performance [CONG].  Unlike a
   retransmission caused by a TCP retransmission timeout, a
   retransmission caused by a Packet Too Big message should not change
   the congestion window.  It should, however, trigger the slow-start
   mechanism (i.e., only one segment should be retransmitted until
   acknowledgements begin to arrive again)."
NEW
"A loss caused by a PMTU probe indicated by the reception of a Packet Too
Big message MUST NOT be considered as a congestion notification and hence
the congestion window may not change."
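
To illustrate the distinction between slow start and sending only one
segment, here is a minimal sketch. It is not taken from the draft: the
function name, the idealized model (every segment is acknowledged, no
loss) and the window being counted in segments are assumptions made for
illustration only.

```python
def slow_start_windows(initial_cwnd: int, ssthresh: int, rounds: int) -> list:
    """Return the congestion window (in segments) at the start of each round trip.

    Assumption: idealized slow start, where each acknowledged segment grows
    the window by one segment, so the window doubles every round trip until
    it reaches ssthresh. Real stacks differ in many details.
    """
    windows = []
    cwnd = initial_cwnd
    for _ in range(rounds):
        windows.append(cwnd)
        # Doubling per round trip is the exponential growth of slow start.
        cwnd = min(cwnd * 2, ssthresh)
    return windows


if __name__ == "__main__":
    # Starting from one segment, the window grows 1, 2, 4, 8, 16.
    print(slow_start_windows(initial_cwnd=1, ssthresh=64, rounds=5))
```

In this model, "only one segment should be retransmitted" corresponds to
starting from a window of one, while slow start describes how the window
grows afterwards.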

And I also don't understand this sentence:
"TCP performance can be reduced if the sender's maximum window size is
   not an exact multiple of the segment size in use (this is not the
   congestion window size)."


----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

1) I agree with Ekr on this sentence:
"Nodes SHOULD appropriately validate the payload of ICMPv6 PTB
   messages to ensure these are received in response to transmitted
   traffic (i.e., a reported error condition that corresponds to an IPv6
   packet actually sent by the application) per [ICMPv6]."
This sounds like it should be a MUST but I guess it depends on the upper
layer protocol if such a validation is possible or not, e.g. if
information is available that can be used for validation. Maybe you can
be more explicit here and even say something like PMTU discovery
should/must only be used if the upper layer protocol provides means for
validation of the ICMP payload (like a sequence number in TCP)…?

Further also note that if the upper layer does the validation while the
IP layer maintains EMTU_S, there must be an interface from the upper
layer to the IP layer to tell if a packet is valid or not before the IP
layer updates the MTU estimate. This seems actually more complicated than
this one sentence indicates.
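
As a rough sketch of what such a validation could look like for TCP,
assuming the PTB payload quotes enough of the original packet to recover
its sequence number: the check below accepts a PTB only if the quoted
sequence number is currently in flight. The names TcpSendState and
ptb_is_plausible are hypothetical and not from the draft or any stack.

```python
from dataclasses import dataclass

SEQ_MOD = 2 ** 32  # TCP sequence numbers are 32 bits and wrap around


@dataclass
class TcpSendState:
    snd_una: int  # oldest unacknowledged sequence number
    snd_nxt: int  # next sequence number to be sent


def ptb_is_plausible(quoted_seq: int, state: TcpSendState) -> bool:
    """Return True if the sequence number quoted in the PTB is in flight.

    Assumption: a PTB is treated as plausible only when it refers to data
    that was sent and not yet acknowledged, i.e. it lies in the range
    [snd_una, snd_nxt) computed modulo 2**32.
    """
    in_flight = (state.snd_nxt - state.snd_una) % SEQ_MOD
    offset = (quoted_seq - state.snd_una) % SEQ_MOD
    return offset < in_flight


if __name__ == "__main__":
    state = TcpSendState(snd_una=1000, snd_nxt=5000)
    print(ptb_is_plausible(3000, state))  # refers to in-flight data
    print(ptb_is_plausible(6000, state))  # refers to data never sent
```

The upper layer would have to pass the result of such a check to the IP
layer before the MTU estimate is updated, which is the interface the
paragraph above refers to.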

2) Also as Ekr says, I also have trouble fully understanding this
normative text in section 4:
"After receiving a Packet Too Big message, a node MUST attempt to
   avoid eliciting more such messages in the near future.  The node MUST
   reduce the size of the packets it is sending along the path.  Using a
   PMTU estimate larger than the IPv6 minimum link MTU may continue to
   elicit Packet Too Big messages.  Since each of these messages (and
   the dropped packets they respond to) consume network resources, the
   node MUST force the Path MTU Discovery process to end.

   Nodes using Path MTU Discovery MUST detect decreases in PMTU as fast
   as possible."
I especially don't understand the first part, given that a PTB message
may still indicate an MTU that is larger than the minimum link MTU which
then may cause another PTB message later on the path. This text reads
as if, on receiving one PTB message, you should end discovery and
fall back to the minimum link MTU to avoid any further PTB messages and
not waste any resources. I don't think that's the intention and as such I
don't understand when it is recommended to end discovery here...?

3) Section 5.2 seems to be written with only single homed hosts in mind.
It might be good to advise that the pmtu information should always be
stored on a per interface basis...?

4) Also section 5.2:
You only advise to store information per flow ID, however, if the flow
label is not used, wouldn't it really make sense to just use the 5-tuple
instead? Also note that ECMP is often done based on the 5-tuple or even
6-tuple (with the ToS field).

5) And more in section 5.2:
"When a Packet Too Big message is received, the node determines which
   path the message applies to based on the contents of the Packet Too
   Big message. "
MAYBE:
"When a valid Packet Too Big message is received, the node determines which
   path the message applies to based on the contents of the Packet Too
   Big message."
And further on:
"If the tentative PMTU is less than the existing PMTU estimate, the
   tentative PMTU replaces the existing PMTU as the PMTU value for the
   path."
This doesn't cover the case where a PMTU probe with a larger size was
sent and the PTB message returns a larger value than the one stored.
Maybe state this explicitly.

This applies similarly to this sentence in section 6:
OLD
"A node, however, should never raise its estimate of the
      PMTU based on a Packet Too Big message, so should not be
      vulnerable to this attack."
NEW
"A node, however, MUST NOT raise its estimate of the
      PMTU based on a Packet Too Big message that is not a (validated)
      response to a PMTU probe that was previously sent by this node,
      so should not be vulnerable to this attack."
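
A minimal sketch of the update rule suggested here, under stated
assumptions: the function name and parameters are hypothetical, the
validation result is supplied by the caller, and the estimate is never
reduced below the IPv6 minimum link MTU of 1280 octets.

```python
from typing import Optional

IPV6_MIN_LINK_MTU = 1280  # octets


def updated_pmtu(current: int, reported: int, validated: bool,
                 probe_size: Optional[int] = None) -> int:
    """Return the new PMTU estimate after receiving a Packet Too Big message.

    Assumptions (illustrative only):
    - validated: the PTB payload matched a packet this node actually sent.
    - probe_size: size of an outstanding probe larger than the current
      estimate, or None if this node sent no such probe.
    """
    if not validated:
        return current  # ignore PTB messages that fail validation
    candidate = max(reported, IPV6_MIN_LINK_MTU)
    if candidate < current:
        return candidate  # a decrease is always applied
    if probe_size is not None and current < candidate < probe_size:
        return candidate  # an increase only as the answer to our own probe
    return current


if __name__ == "__main__":
    print(updated_pmtu(1500, 1400, validated=True))                   # decrease
    print(updated_pmtu(1400, 1500, validated=True))                   # unsolicited increase
    print(updated_pmtu(1400, 1500, validated=True, probe_size=9000))  # probe answer
```

The third call is the case not covered by the quoted text: the PTB
reports a value above the stored estimate but below the probe size.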

6) Further section 5.2:
Should this statement maybe be an upper case MUST:
"The packetization layers must be notified about decreases in the PMTU."

7) Technical comment on section 5.3 in general:
Aging should differ depending on whether a flow is active or not. While I
may not want to probe again for this connection because my
application already decided to use a mode where it can live with the
current PMTU and it's too much effort to switch, I really want to probe
at the beginning of the next connection again to check if I can use a
different mode now. While the IP layer does not have a notion of
connection it can observe if packets are frequently sent with the same
5-tuple and reset the cached PMTU after a certain idle time.
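
One possible shape for such idle-based aging, as a sketch only: the
class name, the explicit timestamps and the choice of timeout are all
assumptions for illustration, not something the draft specifies.

```python
from typing import Dict, Optional, Tuple

# (source address, destination address, protocol, source port, destination port)
FiveTuple = Tuple[str, str, int, int, int]


class PmtuCache:
    """Hypothetical per-5-tuple PMTU cache that forgets idle flows."""

    def __init__(self, idle_timeout: float) -> None:
        self.idle_timeout = idle_timeout
        # Maps a flow to (cached PMTU, time the flow last sent a packet).
        self._entries: Dict[FiveTuple, Tuple[int, float]] = {}

    def store(self, flow: FiveTuple, pmtu: int, now: float) -> None:
        self._entries[flow] = (pmtu, now)

    def note_packet_sent(self, flow: FiveTuple, now: float) -> Optional[int]:
        """Return the cached PMTU, or None if unknown or reset after idling.

        An active flow keeps its entry because every send refreshes the
        timestamp; a flow idle for longer than idle_timeout loses it, so
        the next connection on that 5-tuple probes again.
        """
        entry = self._entries.get(flow)
        if entry is None:
            return None
        pmtu, last_sent = entry
        if now - last_sent > self.idle_timeout:
            del self._entries[flow]
            return None
        self._entries[flow] = (pmtu, now)
        return pmtu


if __name__ == "__main__":
    cache = PmtuCache(idle_timeout=60.0)
    flow = ("2001:db8::1", "2001:db8::2", 6, 40000, 443)
    cache.store(flow, 1400, now=0.0)
    print(cache.note_packet_sent(flow, now=30.0))   # still active
    print(cache.note_packet_sent(flow, now=200.0))  # idle too long, reset
```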

8) Section 5.4: should this maybe be normative, at least the last MUST
NOT (be fragmented):
"A packetization layer (e.g., TCP) must track the PMTU for the path(s)
   in use by a connection; it should not send segments that would result
   in packets larger than the PMTU, except to probe during PMTU
   discovery (this probe packet must not be fragmented to the PMTU). "


Nit:
The abbreviation PTB is only used once in section 4 (and never expanded).
