[quicwg/base-drafts] QUIC's Initial Congestion Window specification is incorrect (#3997)

ianswett <notifications@github.com> Thu, 13 August 2020 19:54 UTC


Broken off from Gorry's comments in #3992 per @martinthomson's request

ISSUE:

/Endpoints SHOULD use an initial congestion window of 10 times the 
maximum datagram size (max_datagram_size), limited to the larger of 
14720 or twice the maximum datagram size./
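
For reference, the quoted rule can be written out as a small calculation.
This is only a sketch of how I read the sentence above (10 datagrams,
capped at the larger of 14720 bytes or two datagrams); the function name
`initial_window` and the sample sizes are chosen for illustration and do
not come from the draft.

```python
def initial_window(max_datagram_size: int) -> int:
    """Initial congestion window in bytes, as read from the quoted text."""
    # Cap: the larger of 14720 bytes or two full-sized datagrams.
    cap = max(14720, 2 * max_datagram_size)
    # Ten datagrams, limited by the cap.
    return min(10 * max_datagram_size, cap)


# Illustrative sizes only: a small datagram, the size at which the
# 14720-byte cap starts to bind, a jumbo-frame size, and a very large one.
for size in (1200, 1472, 9000, 65527):
    print(size, initial_window(size))
```

For datagram sizes up to 1472 bytes the window is simply ten datagrams;
above 7360 bytes the "twice the maximum datagram size" clause takes over
and the window grows with the datagram size.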

- I would like to revisit this. We talked in Montreal and at that time I 
understood the equivalence to TCP for the case where a large MSS was 
supported by the path, as per RFC6928. I have since revisited this topic 
and would like to suggest the present IETF advice for TCP is in fact 
wrong for the large initial MSS case, and that this draft should not 
perpetuate that mistake for QUIC. The issue comes when IW is initialised 
for a path with a very large PMTU, but that PMTU is not in fact 
supported by the path.

- (i) I observe the TCP case where the path does actually support the 
large PMTU, and a receiver advertises an appropriately large MSS. The 
path then uses the large MSS naturally and all is OK, but stands the 
risk of (ii) below, since the path might not be the same as a previous 
case.

- (ii) if the receiver interface supports a large MTU, and the 
receiver advertises a large MSS, but the sender does not have a large 
MTU, the advertised large MSS changes the IW, and can vastly increase 
the number of packets in the initial window. This was not intended. It 
should not happen by default and can cause congestion and increase 
latency. This is wrong.

- (iii) if the receiver interface supports a large MTU, and the 
receiver advertises a large MSS, the sender has a large MTU, but the 
path does not support this large PMTU. Sending with the large MSS causes 
packet loss (or possibly IP-Frag if that was allowed). This was not 
intended, and may well prejudice performance. Retransmission with a 
more appropriate PMTU does not change the IW, which then sends too many 
segments/packets. For TCP it would probably have resulted in an RTO and 
collapsing cwnd. This can cause congestion and increase latency. This is 
wrong.
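
To put rough numbers on cases (ii) and (iii): the sketch below computes
the window from a large assumed size and then counts how many packets of
a smaller, actually usable size fit into it. It uses the QUIC formula
quoted at the top as a stand-in (RFC6928 has the same shape for TCP, with
MSS and a 14600-byte constant). The function names and the sizes 9000,
65527 and 1472 are assumptions picked for illustration.

```python
def initial_window(assumed_size: int) -> int:
    """Initial window in bytes, computed from the size the sender assumed."""
    return min(10 * assumed_size, max(14720, 2 * assumed_size))


def packets_in_initial_window(assumed_size: int, actual_size: int) -> int:
    """Full-sized packets of actual_size that fit in a window that was
    computed from assumed_size."""
    return initial_window(assumed_size) // actual_size


# Window sized for the real packet size: the intended ten packets.
print(packets_in_initial_window(1472, 1472))

# Window sized for a large assumed size, but packets sent at 1472 bytes.
print(packets_in_initial_window(9000, 1472))
print(packets_in_initial_window(65527, 1472))
```

With these inputs the count goes from 10 packets to 12 for a 9000-byte
assumption and to 89 for a 65527-byte assumption, which is the inflation
described in (ii) and (iii).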

... So why was this not seen as a real-life problem? I think the 
advice in RFC6928 should have considered the impact of PMTU failure, but 
I conclude it doesn't normally hurt TCP. At the time this was written, 
few interfaces really did support more than a 1500B MTU (it may still be 
so), and MSS was often effectively limited by the server (sometimes by 
config). For servers that did advertise a larger MSS, or where the path 
supports less than 1500B, then MSS-clamping by routers along a path 
would often have triggered. Still, the sender would normally receive 
only a feasible advertised MSS.

... QUIC is different :-). There is no middlebox intervention for MSS 
clamping - therefore QUIC is unable to avoid (iii), and likely would be 
impacted by (ii). I therefore suggest that QUIC chooses either to 
eliminate the /or twice the maximum datagram size./ clause, **or** 
provides a requirement that if this datagram size is not confirmed, then 
the IW needs to be limited to 14720 B.
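
The two options could look roughly like this. It is a sketch of one
possible reading, not proposed draft text: the function names, the
`size_confirmed` flag, and the idea that "confirmed" means the path has
been shown to carry datagrams of that size are all assumptions made for
the example.

```python
BASE_CAP = 14720  # bytes, the fixed limit from the quoted text


def iw_without_clause(max_datagram_size: int) -> int:
    """Option 1: drop the 'or twice the maximum datagram size' clause."""
    return min(10 * max_datagram_size, BASE_CAP)


def iw_with_confirmation(max_datagram_size: int, size_confirmed: bool) -> int:
    """Option 2: keep the clause, but only apply it once the datagram
    size has been confirmed; otherwise limit the window to 14720 bytes."""
    cap = BASE_CAP
    if size_confirmed:
        cap = max(BASE_CAP, 2 * max_datagram_size)
    return min(10 * max_datagram_size, cap)


# Illustrative comparison for a small and a large datagram size.
for size in (1200, 9000):
    print(size,
          iw_without_clause(size),
          iw_with_confirmation(size, size_confirmed=False),
          iw_with_confirmation(size, size_confirmed=True))
```

For a 1200-byte datagram size all three values are the same (12000
bytes). For 9000 bytes the first two give 14720 bytes and only the
confirmed case gives 18000 bytes.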

... Finally, I would expect QUIC to perform better if it were to set up 
the connection, and then immediately probe for the larger size, since 
DPLPMTUD is anyway needed to utilise a larger PMTU and avoid 
blackholing. However, I don't think we need to explain this in the ID.

---

https://github.com/quicwg/base-drafts/issues/3997