WGLC comments: draft-ietf-quic-recovery-29
Gorry Fairhurst <gorry@erg.abdn.ac.uk> Wed, 01 July 2020 08:30 UTC
To: quic@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/jFRIW1Yh-VOqdbFovLfcutwkdt4>
This is a WGLC review of draft-ietf-quic-recovery-29. This version of the specification seems mature. This email includes some WGLC comments, preceded by some things that I think are issues.

(Editors: I could raise a set of GitHub issues for these; please advise how you would like to proceed.)

Best wishes,

Gorry

ISSUE: 8.2. Traffic Analysis

/Packets that carry only ACK frames can be heuristically identified by observing packet size. Acknowledgement patterns may expose information about link characteristics or application behavior. Endpoints can use PADDING frames or bundle acknowledgments with other frames to reduce leaked information./

- I think this needs a warning: This could also increase the return path traffic, which for asymmetric paths could impact the performance of the forward path or of other flows that share a restricted return path.

ISSUE:

/Re-ordering could be more common with QUIC than TCP, because network elements cannot observe and fix the order of out-of-order packets./

- They can if they add network-layer sequence numbers, as some tunnels/encaps can, for example. This seems like an odd statement. Is it necessary? If so, please explain more.

ISSUE:

/When a loss or ECN-CE marking is detected, NewReno halves the congestion window, sets the slow start threshold to the new congestion window, and then enters the recovery period./

- The requirement is that TCP needs to reduce after CE. The RFC series does not now say it needs to halve; it could, for example, follow the reduction method specified in RFC8511. e.g.

/When a loss or ECN-CE marking is detected, the sender must reduce the cwnd. NewReno halves the congestion window, sets the slow start threshold to the new congestion window, and then enters the recovery period.
[RFC8511] specifies an alternate cwnd reduction./

ISSUE: Similar comment in 8.3:

/Though congestion controllers generally treat reports of ECN-CE markings as equivalent to loss [RFC8311], the exact response for each controller could be different. /

- This does not seem correct. Could I suggest:

/Congestion controllers respond to reports of ECN-CE by reducing their rate. Markings can be treated as equivalent to loss [RFC3168], but other responses can be specified (e.g. [RFC8511]) [RFC8311]. /

ISSUE: In B.5,

   // Congestion avoidance.
   congestion_window += max_datagram_size * acked_packet.size / congestion_window

- Is this calculation correct? I was thinking of what might happen when the PMTU is large and the sender generates a sequence of small packets… would this result in overestimating cwnd?

ISSUE:

/Endpoints SHOULD use an initial congestion window of 10 times the maximum datagram size (max_datagram_size), limited to the larger of 14720 or twice the maximum datagram size./

- I would like to revisit this. We talked in Montreal, and at that time I understood the equivalence to TCP for the case where a large MSS was supported by the path, as per RFC6928. I have since revisited this topic and would like to suggest that the present IETF advice for TCP is in fact wrong for the large initial MSS case, and that this draft should not perpetuate that mistake for QUIC. The issue comes when IW is initialised for a path with a very large PMTU, but that PMTU is not in fact supported by the path.

- (i) I observe the TCP case where the path does actually support the large PMTU, and a receiver advertises an appropriately large MSS. The path then uses the large MSS naturally and all is OK, but stands the risk of (ii) below, since the path might not be the same as a previous case.

- (ii) If the receiver interface supports a large MTU, and the receiver advertises a large MSS, but the sender does not have a large MTU, the advertised large MSS changes the IW, and can vastly increase the number of packets in the initial window. This was not intended. It should not happen by default and can cause congestion and increase latency. This is wrong.

- (iii) If the receiver interface supports a large MTU, and the receiver advertises a large MSS, the sender has a large MTU, but the path does not support this large PMTU. Sending with the large MSS causes packet loss (or possibly IP-Frag if that was allowed). This was not intended, and may well prejudice performance. Retransmission with a more appropriate PMTU does not change the IW, which then sends too many segments/packets. For TCP it would probably have resulted in an RTO and collapsing cwnd. This can cause congestion and increase latency. This is wrong.

...

So why was this not seen as a real-life problem? I think the advice in RFC6928 should have considered the impact of PMTU failure, but I conclude it doesn't normally hurt TCP. At the time this was written, few interfaces really did support more than a 1500B MTU (it may still be so), and MSS was often effectively limited by the server (sometimes by config). For servers that did advertise a larger MSS, or where the path supports less than 1500B, MSS-clamping by routers along a path would often have triggered. Still, the sender would normally receive only a feasible advertised MSS.

...

QUIC is different :-). There is no middlebox intervention for MSS clamping - therefore QUIC is unable to avoid (iii), and likely would be impacted by (ii).

I therefore suggest that QUIC chooses either to eliminate the /or twice the maximum datagram size./ clause, **or** provides a requirement that if this datagram size is not confirmed, then the IW needs to be limited to 14720 B.

...
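
To make the concern in (ii) and (iii) concrete, here is a minimal sketch of the initial window rule as quoted above, together with the number of packets it yields when the datagram size the sender assumed is larger than the size it can actually send. The function names, the example sizes (1200, 9000 and 60000 bytes) and the `confirmed` flag are illustrative assumptions, not text from the draft; the capped variant is only one possible reading of the second suggestion above.

```python
FLOOR_BYTES = 14720  # the constant quoted from the draft text


def initial_window(max_datagram_size: int) -> int:
    """IW as quoted: 10 * max_datagram_size, limited to the larger of
    14720 bytes or twice the maximum datagram size."""
    return min(10 * max_datagram_size,
               max(FLOOR_BYTES, 2 * max_datagram_size))


def initial_window_if_unconfirmed(max_datagram_size: int, confirmed: bool) -> int:
    """Hypothetical variant: cap IW at 14720 bytes until the datagram
    size has been confirmed for the path."""
    iw = initial_window(max_datagram_size)
    return iw if confirmed else min(iw, FLOOR_BYTES)


def packets_in_window(window_bytes: int, usable_datagram_size: int) -> int:
    """Number of datagrams needed to send window_bytes when each
    datagram carries at most usable_datagram_size bytes."""
    return -(-window_bytes // usable_datagram_size)  # ceiling division


if __name__ == "__main__":
    # (assumed max_datagram_size, size that can actually be sent)
    for assumed, usable in [(1200, 1200), (9000, 9000), (9000, 1200), (60000, 1200)]:
        iw = initial_window(assumed)
        capped = initial_window_if_unconfirmed(assumed, confirmed=False)
        print(assumed, usable, iw,
              packets_in_window(iw, usable),
              packets_in_window(capped, usable))
```

In this sketch, a sender that assumes a 60000-byte datagram but can only send 1200-byte datagrams ends up with an initial window of 100 packets, whereas the capped variant keeps it to 13.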

Finally, I would expect QUIC to perform better if it were to set up the connection, and then immediately probe for the larger size, since DPLPMTUD is anyway needed to utilise a larger PMTU and avoid blackholing. However, I don't think we need to explain this in the ID.

---

NiT: /ACK delay/ /Ack Delay/ and /ack delay/ are all used; it seems /ACK/ is more consistent with other usage.

NiT (Missing word): /and are expected to at least as useful in QUIC/and are expected to be at least as useful in QUIC/

REF:

/When a PTO timer expires, the PTO backoff MUST be increased, resulting in the PTO period being set to twice its current value. The PTO backoff factor is reset when an acknowledgement is received, except in the following case./

- Please consider a reference to draft-ietf-tcpm-rto-consider-16, which provides a BCP on the use of timers.

/The life of a connection that is experiencing consecutive PTOs is limited by the endpoint's idle timeout./

- What does /life/ mean here?

/send an Initial packet in a UDP datagram of at least 1200 bytes./

- At what layer is the datagram size measured? Should this be a datagram with /payload 1200 bytes/?

/Peers can also use coalesced packets to ensure that each datagram elicits /

- A cross reference to the section on /coalesced packets/ would be valuable.

/If the sender wants to elicit a faster acknowledgement on PTO, it can skip a packet number to eliminate the ack delay./

- Explain: this causes the sender to see an out of order packet, which eliminates the ACK delay.

/limited to the larger of 14720/

- Please add the word /bytes/.

/Endpoints MAY ignore the loss of Handshake, 0-RTT, and 1-RTT packets that might have arrived before the peer had packet protection keys to process those packets. Endpoints MUST NOT ignore the loss of packets that were sent after the earliest acknowledged packet in a given packet number space./

- Can you clarify what is intended by the word /ignore/?
This is in a section on CC, so I was hoping that the word ignore meant that the endpoint did not need to make a CC change, otherwise it MUST update the CC?

/7.7. Probe Timeout Probe packets MUST NOT be blocked by the congestion controller. /

- Can you clarify what is intended by the word /blocked/? I was assuming the transmission was not constrained by the congestion controller?
- Would these packets consume flow credit, i.e. are they also not flow controlled?

7.8. Persistent Congestion

- Could you add text explaining what happens? I think I understand, but to be clear: if the persistent congestion persists, then I think the congestion is not further reduced, but I would expect the PTO to back off the interval between packets exponentially. Is that true?

- Are Appendices A and B normative or informative?

Section A.5. On Sending a Packet; and Section A.6. On Receiving a Datagram.

- If the intention is to talk about datagrams in A.6, can A.5 explain that packets are sent in datagrams?

Sections A.8, A.9, A.10 seem to be sender functions.

- It would perhaps avoid doubt to state this.
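
As an illustration of the PTO back-off raised in the REF and 7.8 comments above, the following is a minimal sketch of the doubling described in the quoted text, and of how an idle timeout would bound the number of consecutive PTOs. The names `pto_period_ms` and `consecutive_ptos_before_idle_timeout` and the example values (a 300 ms base period and a 30 s idle timeout) are assumptions for illustration only. The sketch assumes no acknowledgement arrives to reset the backoff, and it does not model how the idle timer is actually restarted.

```python
def pto_period_ms(base_pto_ms: int, pto_count: int) -> int:
    """PTO period after pto_count consecutive expiries: each expiry
    sets the period to twice its current value."""
    return base_pto_ms * (2 ** pto_count)


def consecutive_ptos_before_idle_timeout(base_pto_ms: int, idle_timeout_ms: int) -> int:
    """Count how many back-to-back PTO expiries fit within the idle
    timeout, assuming no acknowledgement resets the backoff."""
    elapsed_ms = 0
    count = 0
    while elapsed_ms + pto_period_ms(base_pto_ms, count) <= idle_timeout_ms:
        elapsed_ms += pto_period_ms(base_pto_ms, count)
        count += 1
    return count


if __name__ == "__main__":
    for n in range(7):
        print(n, pto_period_ms(300, n))
    print(consecutive_ptos_before_idle_timeout(300, 30_000))
```

With these example values the periods are 300, 600, 1200, 2400, 4800 and 9600 ms (18.9 s in total); the next period of 19.2 s would exceed the 30 s idle timeout, so six consecutive PTOs fit.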