Re: [quicwg/base-drafts] Simplify TLP and RTO into Probe Timeout (#2114)

ianswett <notifications@github.com> Tue, 11 December 2018 22:23 UTC

ianswett commented on this pull request.

I like this change in general, but I think it makes sense to keep behavior closer to the status quo and allow a sender to do a normal loss-style reduction if the first or second PTO is the one acknowledged.

> @@ -342,7 +342,7 @@ lost, then a timer SHOULD be set for the remaining time.
 The RECOMMENDED time threshold (kTimeThreshold), expressed as a round-trip time
 multiplier, is 9/8.
 
-Using max(SRTT, latest_RTT) protects from the two following cases:
+Using max(smoothed_rtt, latest_rtt) protects from the two following cases:

You still use max(SRTT, latest_RTT) on line 338.
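
For reference, the time threshold in the quoted text can be sketched as follows. This is a minimal illustration, not the draft's pseudocode: the helper name `loss_delay` and the choice of milliseconds are assumptions.

```python
from fractions import Fraction

# RECOMMENDED value from the quoted draft text, expressed as an RTT multiplier.
K_TIME_THRESHOLD = Fraction(9, 8)


def loss_delay(smoothed_rtt, latest_rtt):
    """Time after which an unacknowledged packet is considered lost.

    Hypothetical helper: the name and the units (milliseconds) are
    illustrative and not taken from the draft.
    """
    # Taking the larger of the two RTT values is what the quoted
    # sentence describes as max(smoothed_rtt, latest_rtt).
    return K_TIME_THRESHOLD * max(smoothed_rtt, latest_rtt)
```

Whichever name the text settles on, the same two inputs feed the multiplier, which is why the leftover max(SRTT, latest_RTT) spelling is only a naming inconsistency.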

>  
-A packet sent at the tail is particularly vulnerable to slow loss detection,
-since acks of subsequent packets are needed to trigger ack-based detection. To
-ameliorate this weakness of tail packets, the sender schedules a timer when the
-last ack-eliciting packet before quiescence is transmitted. Upon timeout,
-a Tail Loss Probe (TLP) packet is sent to evoke an acknowledgement from the
-receiver.
+A Probe Timeout (PTO) triggers a probe packet when ack-eliciting data is in
+flight but an acknowledgement does not seem forthcoming.  A PTO enables a

"does not seem forthcoming" is non-specific.
```suggestion
flight but an acknowledgement is not received within the expected time.  A PTO enables a
```

>  
-* PTO SHOULD be scheduled for max(1.5*SRTT+MaxAckDelay, kMinTLPTimeout)
+When the final ack-eliciting packet before quiescence is transmitted, the sender

"Final" is an optimization, and we don't define quiescence.
```suggestion
When an ack-eliciting packet is sent, the sender
```

> @@ -577,21 +560,18 @@ kTimeThreshold:
   considers a packet lost. Specified as an RTT multiplier. The RECOMMENDED
   value is 9/8.
 
-kMinTLPTimeout:
-: Minimum time in the future a tail loss probe timer may be set for.
-  The RECOMMENDED value is 10ms.
+kGranularity:

It feels odd to define this here but use it three times above. I'd suggest introducing the idea of clock granularity above, where it is first used.

>  
-A PTO value of at least 1.5*SRTT ensures that the ACK is overdue.  The 1.5 is
-based on {{?TLP}}, but implementations MAY experiment with other constants.
+The PTO period is the amount of time that a sender ought to wait for an

```suggestion
The PTO period is the amount of time that a sender SHOULD wait for an
```

>  
-A PTO value of at least 1.5*SRTT ensures that the ACK is overdue.  The 1.5 is
-based on {{?TLP}}, but implementations MAY experiment with other constants.
+The PTO period is the amount of time that a sender ought to wait for an
+acknowledgement for a sent packet to be received.  This time period includes the

```suggestion
acknowledgement of a sent packet.  This time period includes the
```

>  
-A PTO value of at least 1.5*SRTT ensures that the ACK is overdue.  The 1.5 is
-based on {{?TLP}}, but implementations MAY experiment with other constants.
+The PTO period is the amount of time that a sender ought to wait for an
+acknowledgement for a sent packet to be received.  This time period includes the
+estimated network roundtrip-time (smoothed_rtt), the variance in the estimate

Using "variance" here is potentially confusing, since rttvar is not a true variance and the term used is 4*rttvar.

I'm having a hard time coming up with suggestions, but I'll think about it.

>  
-To reduce latency, it is RECOMMENDED that the sender set and allow the TLP timer
-to fire twice before setting an RTO timer. In other words, when the TLP timer
-expires the first time, a TLP packet is sent, and it is RECOMMENDED that the TLP
-timer be scheduled for a second time. When the TLP timer expires the second
-time, a second TLP packet is sent, and an RTO timer SHOULD be scheduled {{rto}}.
+There is no requirement on the clock granularity. If the PTO computation results
+in a value of zero, a sender MUST set the PTO value to kGranularity, to avoid

How about "The PTO must be set to at least kGranularity,"?
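
A minimal sketch of the PTO period with a granularity floor, to make the wording concrete. The formula is an assumption: the quoted text only says the period includes smoothed_rtt and the 4*rttvar term, and the max ack delay term is carried over from the old 1.5*SRTT+MaxAckDelay expression. The function name and the 1 ms granularity are also illustrative.

```python
# Assumed timer granularity in milliseconds; the value is illustrative only.
K_GRANULARITY_MS = 1


def pto_period_ms(smoothed_rtt_ms, rttvar_ms, max_ack_delay_ms,
                  granularity_ms=K_GRANULARITY_MS):
    """Assumed PTO period: RTT estimate, 4 * rttvar, and the peer's ack delay.

    Hypothetical helper; the sum below is an assumption, not a quote
    from the draft.
    """
    pto = smoothed_rtt_ms + 4 * rttvar_ms + max_ack_delay_ms
    # "At least kGranularity" covers the zero case without singling it out.
    return max(pto, granularity_ms)
```

Phrasing the rule as a floor, as suggested above, also handles results that are nonzero but still below the timer granularity.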

>  
-A TLP packet SHOULD carry new data when possible. If new data is unavailable or
-new data cannot be sent due to flow control, a TLP packet MAY retransmit
-unacknowledged data to potentially reduce recovery time. Since a TLP timer is
-used to send a probe into the network prior to establishing any packet loss,
-prior unacknowledged packets SHOULD NOT be marked as lost when a TLP timer
-expires.
+A PTO timer is set on an ack-eliciting tail packet.  A sender may not know that

Drop this first sentence and move the second sentence up with "When the final ack-eliciting packet".

>  
-Similar to TCP {{?RFC6298}}, the RTO period is set based on the following
-conditions:
+When a PTO timer expires, the sender MUST send one ack-eliciting packet as a
+probe. A sender MAY send up to two ack-eliciting packets, to avoid an expensive
+consecutive PTO expiration due to packet loss.

```suggestion
consecutive PTO expiration due to a single packet loss.
```

>  
-* When an RTO timer expires, the RTO period is doubled.
+Probe packets sent on a PTO MUST be ack-eliciting.  A probe packet SHOULD carry
+new data when possible.  Implementers MAY use alternate strategies for
+determining the content of probe packets.  An implementation could use new data

From "An implementation could" on, this paragraph seems quite speculative, and the first sentence is redundant with the sentence above: "A probe packet SHOULD carry new data when possible."

>  
-A packet sent on an RTO timer MUST NOT be blocked by the sender's congestion
-controller. A sender MUST however count these packets as being in flight, since
-this packet adds network load without establishing packet loss.
+If an ACK frame is received that newly acknowledges only those packets that were
+sent after a PTO timer expiration, all unacknowledged packets with lower packet
+numbers MUST be marked as lost.

I believe this will happen naturally through loss detection.
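
For concreteness, here is one reading of the quoted rule as a Python sketch. The function name, the set-based bookkeeping, and the interpretation of "lower packet numbers" as "lower than the smallest newly acknowledged packet" are all assumptions.

```python
def packets_to_mark_lost(unacked, newly_acked, sent_after_pto):
    """Return the packet numbers to declare lost under the quoted rule.

    unacked:        packet numbers still outstanding after processing the ACK
    newly_acked:    packet numbers newly acknowledged by the ACK frame
    sent_after_pto: packet numbers sent after the PTO timer expired
    All three are sets of ints; this bookkeeping is hypothetical.
    """
    # The rule applies only when every newly acknowledged packet was
    # sent after the PTO expiration.
    if not newly_acked or not newly_acked <= sent_after_pto:
        return set()
    # Assumed reading of "lower packet numbers": below the smallest
    # newly acknowledged packet number.
    lowest_acked = min(newly_acked)
    return {pn for pn in unacked if pn < lowest_acked}
```

Whether ordinary packet-number and time thresholds already produce the same set in every case is the open question here; the sketch only shows what the explicit rule would mark.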

>  
-Acknowledgement or loss of tail loss probes are treated like any other packet.
+Probe packets MUST NOT be blocked by the congestion controller.  A sender MUST
+however count these packets as being additionally in flight, since these packets
+adds network load without establishing packet loss.  Note that sending probe
+packets might cause the sender's estimated bytes in flight to exceed the

```suggestion
packets might cause the sender's bytes in flight to exceed the
```

>  
-Acknowledgement or loss of tail loss probes are treated like any other packet.
+Probe packets MUST NOT be blocked by the congestion controller.  A sender MUST
+however count these packets as being additionally in flight, since these packets
+adds network load without establishing packet loss.  Note that sending probe
+packets might cause the sender's estimated bytes in flight to exceed the
+sender's congestion window until an acknowledgement is received that establishes

```suggestion
congestion window until an acknowledgement is received that establishes
```

>  
-Acknowledgement or loss of tail loss probes are treated like any other packet.
+Probe packets MUST NOT be blocked by the congestion controller.  A sender MUST
+however count these packets as being additionally in flight, since these packets
+adds network load without establishing packet loss.  Note that sending probe
+packets might cause the sender's estimated bytes in flight to exceed the
+sender's congestion window until an acknowledgement is received that establishes

Also, I think this would read better as "until packets are acknowledged or lost."
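
The accounting described in the quoted paragraph can be sketched like this. The class and method names are hypothetical, and sizes are in bytes.

```python
class InFlightTracker:
    """Tracks bytes in flight against a congestion window (illustrative only)."""

    def __init__(self, cwnd):
        self.cwnd = cwnd
        self.bytes_in_flight = 0

    def can_send(self, size, is_probe=False):
        # Probe packets bypass the congestion window check.
        return is_probe or self.bytes_in_flight + size <= self.cwnd

    def on_packet_sent(self, size, is_probe=False):
        if not self.can_send(size, is_probe):
            raise RuntimeError("blocked by congestion controller")
        # Probes still count as in flight, so the total can exceed cwnd.
        self.bytes_in_flight += size

    def on_packet_acked_or_lost(self, size):
        # Either outcome removes the packet from the in-flight total.
        self.bytes_in_flight -= size
```

With a window of two full-size packets already in flight, a probe is still sent and the in-flight total stays above the window until packets are acknowledged or lost.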

>  
-## Retransmission Timeout
+A PTO expiration is classified as spurious or valid when an ACK frame is
+received that newly acknowledges packets in flight, see {{pto-loss}}.  On a
+valid PTO, the congestion window MUST be reduced to the minimum congestion
+window and slow start is re-entered.

Agreed, this is a potentially substantial change.

>  
 ~~~
-   OnRetransmissionTimeoutVerified(packet_number)
+   OnProbeTimeoutVerified(packet_number)

I'd suggest we keep a "you may skip the CWND reduction on the first or second PTO" allowance to keep behavior closer to what QUIC has now.
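
A rough sketch of the behavior suggested here, combined with the loss-style reduction mentioned at the top of this review. Everything concrete in it is an assumption: the function name, the 1-based `pto_count`, halving as the loss-style reduction, and the minimum window value.

```python
# Assumed minimum congestion window in bytes; the value is illustrative.
K_MIN_WINDOW = 2 * 1200


def cwnd_after_verified_pto(cwnd, pto_count, min_window=K_MIN_WINDOW):
    """Congestion window after a PTO is verified as valid.

    pto_count is the 1-based index of the consecutive PTO whose probe
    was acknowledged. Hypothetical helper, not the draft's pseudocode.
    """
    if pto_count <= 2:
        # First or second PTO: a normal loss-style reduction
        # (halving is assumed here), never below the minimum window.
        return max(cwnd // 2, min_window)
    # Later PTOs: collapse to the minimum window, as the quoted text requires.
    return min_window
```

This keeps the first two probes roughly as cheap as the existing tail loss probes while preserving the full collapse for persistent timeouts.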

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/quicwg/base-drafts/pull/2114#pullrequestreview-183923684