Review of draft-ietf-quic-recovery-08

Praveen Balasubramanian <pravb@microsoft.com> Sat, 20 January 2018 02:09 UTC

To: IETF QUIC WG <quic@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/24RwlATPjKN_XZi7Qo_J99JxwDE>

Section 2.1.4.

"This mechanism also allows a receiver to measure and report the
   delay from when a packet was received by the OS kernel, which is
   useful in receivers which may incur delays such as context-switch
   latency before a userspace QUIC receiver processes a received packet.
"
Such context-switch delays are also possible in a kernel implementation, depending on whether execution is deferred to worker threads. I suggest removing the word "userspace".


Section 3.1

"Ignoring ack delay for min RTT prevents
   intentional or unintentional underestimation of min RTT, which in
   turn prevents underestimating smoothed RTT."
Does this prevent convergence to the actual minimum RTT if the receiver always delays ACKs? I can't tell for sure whether this will hurt congestion control algorithms like BBR and LEDBAT, which rely on an accurate minimum RTT.
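
To make the question concrete, here is a minimal Python sketch of an RTT update in which min RTT uses the raw sample and only smoothed RTT is adjusted for ack delay. The function name, the dict layout, the 7/8 and 1/8 smoothing weights, and the 40 ms / 25 ms scenario are illustrative assumptions, not text taken from the draft.

```python
def update_rtt(state, latest_rtt, ack_delay):
    """Fold one RTT sample (seconds) into a hypothetical estimator state."""
    # min_rtt is taken from the raw sample; ack_delay is ignored here.
    state["min_rtt"] = min(state["min_rtt"], latest_rtt)
    # Subtract ack_delay only if the result would stay above min_rtt.
    if latest_rtt - state["min_rtt"] > ack_delay:
        latest_rtt -= ack_delay
    if state["smoothed_rtt"] is None:
        state["smoothed_rtt"] = latest_rtt
    else:
        state["smoothed_rtt"] = 0.875 * state["smoothed_rtt"] + 0.125 * latest_rtt
    return state


# Assumed scenario: 40 ms path RTT, receiver always delays ACKs by 25 ms.
state = {"min_rtt": float("inf"), "smoothed_rtt": None}
for _ in range(5):
    update_rtt(state, latest_rtt=0.040 + 0.025, ack_delay=0.025)
```

Under these assumptions min_rtt settles near 65 ms rather than 40 ms, and because every sample equals min_rtt the ack-delay adjustment never applies, so smoothed_rtt stays near 65 ms as well.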


Section 3.2.1
In general, the text does not seem to specify the actions to take as a result of loss detection, whereas the pseudocode does (for example, cwnd reductions). Should we add a section on "Congestion Control" before the pseudocode?

Section 3.3.1

"PTO SHOULD be scheduled for max(1.5*SRTT+MaxAckDelay, 10ms)"

https://datatracker.ietf.org/doc/html/draft-ietf-tcpm-rack has an explanation for the 10ms value. Should it be a goal to explain the rationale for it here or do we point the reader to the RACK draft?
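
For reference, the quoted formula can be sketched in a few lines of Python. The function and constant names are hypothetical, and times are in milliseconds.

```python
K_MIN_PTO_MS = 10  # the 10 ms floor from the quoted text


def pto_timeout_ms(smoothed_rtt_ms, max_ack_delay_ms):
    """Return max(1.5 * SRTT + MaxAckDelay, 10 ms), all in milliseconds."""
    return max(1.5 * smoothed_rtt_ms + max_ack_delay_ms, K_MIN_PTO_MS)


wan_timeout = pto_timeout_ms(100, 25)  # typical WAN path
lan_timeout = pto_timeout_ms(2, 0)     # very low-latency path
```

The floor only matters when 1.5*SRTT+MaxAckDelay falls below 10 ms, that is, on very low-latency paths, which is why its rationale is worth stating or citing.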

Section 3.3.2

"When this alarm fires, the sender sends two packets, to
   evoke acknowledgements from the receiver, and restarts the RTO alarm."

"kMinimumWindow (default 2 * kDefaultMss):  Default minimum congestion
      window."


This implies that the slow start process after loss for a QUIC flow will be twice as aggressive as that of a TCP flow sharing the same bottleneck link. Is it a goal to share fairly with TCP? If 2 is the right value of cwnd after a timeout, should TCP implementations also switch to it? 2 versus 1 seems worthy of an experiment if it has not been done before. Is this better tracked as a GitHub issue?
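
As a rough illustration of the "twice as aggressive" point, assume idealized slow start in which the window doubles every RTT with no further loss. The helper name and the packet-based units are illustrative assumptions.

```python
def slow_start_cwnd(initial_packets, rtts):
    """Window in packets after `rtts` round trips of ideal doubling."""
    return initial_packets * 2 ** rtts


quic_after_3 = slow_start_cwnd(2, 3)  # restart from kMinimumWindow = 2 packets
tcp_after_3 = slow_start_cwnd(1, 3)   # restart from a 1-segment loss window
tcp_after_4 = slow_start_cwnd(1, 4)
```

Under these assumptions the QUIC flow holds twice the window of the TCP flow at every RTT, which is equivalent to being one RTT ahead.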


"QUIC's RTO algorithm differs from TCP in that the firing of an RTO
   alarm is not considered a strong enough signal of packet loss, so
   does not result in an immediate change to congestion window or
   recovery state"

I found this sentence a bit misleading; it seems to imply that a genuine RTO does not alter congestion window state. The pseudocode does reduce the cwnd to kMinimumWindow after a confirmed RTO, and I think it should.
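
The distinction I think the text should draw can be sketched as follows. The class and method names are hypothetical, and the 1460-byte kDefaultMss is assumed; this is not the draft's pseudocode.

```python
K_DEFAULT_MSS = 1460                  # assumed default, in bytes
K_MINIMUM_WINDOW = 2 * K_DEFAULT_MSS  # as quoted above


class RtoSender:
    """Hypothetical sender tracking only cwnd and RTO count."""

    def __init__(self, congestion_window):
        self.congestion_window = congestion_window
        self.rto_count = 0

    def on_rto_alarm(self):
        # The alarm alone is not treated as proof of loss:
        # send two probe packets and leave cwnd untouched.
        self.rto_count += 1
        return 2  # number of packets to send

    def on_rto_verified(self):
        # An ACK for a packet sent after the RTO confirms the timeout
        # was genuine, so the window collapses to the minimum.
        self.congestion_window = K_MINIMUM_WINDOW


sender = RtoSender(20 * K_DEFAULT_MSS)
probes = sender.on_rto_alarm()
cwnd_after_alarm = sender.congestion_window
sender.on_rto_verified()
cwnd_after_verified = sender.congestion_window
```

The window is unchanged after the alarm fires but drops to kMinimumWindow once the RTO is verified, so the text should say that the change is deferred, not that it never happens.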



Section 3 seems to be missing a "Time Based Loss Detection" section, but the pseudocode uses kUsingTimeLossDetection.



Section 3.4.7.3

"It
   is part of QUIC's time based loss detection"
However, Early Retransmit (ER) is listed under ack-based detection in Section 3.2.2.



Section 4
"QUIC's congestion control is based on TCP NewReno[RFC6582] congestion
   control to determine the congestion window and pacing rate."
It looks like pacing is already addressed in the editor's copy, and this sentence has been modified so that it no longer references pacing as part of NewReno.

Section 4.6 in Editor's copy
"It is RECOMMENDED that a sender pace sending of all data"
I think we will have to make it a MUST for safety. Since QUIC increases the congestion window by the number of acknowledged bytes when each ack is processed, and there is no ABC (Appropriate Byte Counting, RFC 3465) limit, not pacing will cause burst losses. Or maybe we should just call out the problems caused by not pacing. Thoughts?
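
A small Python sketch of the burst concern, assuming the window is full when a single ack covering many packets arrives. The helper names, the 1460-byte segment size, and the 40/20-packet scenario are illustrative assumptions; the cap of 2 segments per ack mirrors the ABC limit for slow start.

```python
MSS = 1460  # assumed segment size in bytes


def cwnd_after_ack(cwnd, acked_bytes, abc_limit_segments=None):
    """Slow-start growth on one ACK, optionally capped ABC-style."""
    increase = acked_bytes
    if abc_limit_segments is not None:
        increase = min(increase, abc_limit_segments * MSS)
    return cwnd + increase


def burst_packets(cwnd, acked_bytes, abc_limit_segments=None):
    """Packets an unpaced sender can emit right after the ACK,
    assuming the window was full before the ACK arrived."""
    in_flight = cwnd - acked_bytes
    new_cwnd = cwnd_after_ack(cwnd, acked_bytes, abc_limit_segments)
    return (new_cwnd - in_flight) // MSS


# One ack covers 20 of 40 outstanding packets.
uncapped = burst_packets(40 * MSS, 20 * MSS)
capped = burst_packets(40 * MSS, 20 * MSS, abc_limit_segments=2)
```

In this scenario the uncapped sender may release 40 packets back to back, versus 22 with the cap; either way the burst is large without pacing, and the uncapped case is nearly twice as bad.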

Section 4.3
"from TCP's definition of recovery ending when the lost packet that
   started recovery is acknowledged"
I saw the discussion of this on the mailing list. I think we should call out that, while QUIC can as a result reduce the cwnd multiple times under sustained loss, this definition of recovery exit primarily serves to ensure that the cwnd is reduced only once per RTT. Is there any other side effect I am missing?
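
The once-per-RTT behavior can be sketched like this. The class, the end_of_recovery bookkeeping, the 0.5 reduction factor, and the packet numbers are illustrative assumptions, not the draft's exact pseudocode.

```python
K_DEFAULT_MSS = 1460                  # assumed default, in bytes
K_MINIMUM_WINDOW = 2 * K_DEFAULT_MSS
K_LOSS_REDUCTION_FACTOR = 0.5         # assumed halving on loss


class RecoveryState:
    """Hypothetical congestion state with packet-number based recovery."""

    def __init__(self, cwnd):
        self.cwnd = cwnd
        self.largest_sent = 0
        self.end_of_recovery = 0

    def on_packet_sent(self, packet_number):
        self.largest_sent = max(self.largest_sent, packet_number)

    def on_packet_lost(self, packet_number):
        # Losses of packets sent before the current recovery began
        # do not reduce the window again.
        if packet_number <= self.end_of_recovery:
            return False
        self.end_of_recovery = self.largest_sent
        self.cwnd = max(int(self.cwnd * K_LOSS_REDUCTION_FACTOR), K_MINIMUM_WINDOW)
        return True


cc = RecoveryState(40 * K_DEFAULT_MSS)
for pn in range(1, 11):
    cc.on_packet_sent(pn)
first = cc.on_packet_lost(3)    # starts recovery, halves cwnd
second = cc.on_packet_lost(7)   # same flight, no further reduction
for pn in range(11, 16):
    cc.on_packet_sent(pn)
third = cc.on_packet_lost(12)   # sent during recovery, reduces again
```

Losses within the flight that was outstanding when recovery began cost one reduction, while a loss among packets sent afterwards triggers a second one, which is how sustained loss leads to repeated reductions.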


Section 4.6
"In order to fairly compete with flows that are
   not pacing, it is recommended to not pace the first 10 sent packets
   when exiting quiescence."
This is no longer in the Editor's copy, so I assume it is not required. Is there a quiescence check in gQUIC? Does Linux TCP pace consistently throughout?


Section 4.7.1.
"kDefaultMss"
I think we should just have kMss, use it everywhere, and initialize it to max(kDefaultMss, "PMTU discovery MSS").