Re: [tcpm] questions on draft-ietf-quic-recovery

Ian Swett <> Wed, 31 October 2018 01:34 UTC

From: Ian Swett <>
Date: Tue, 30 Oct 2018 21:34:39 -0400
Cc: IETF QUIC WG <>,, Jana Iyengar <>

Thanks for your careful reading; answers are inline.

On Mon, Oct 29, 2018 at 1:03 PM Yoshifumi Nishida <> wrote:

> Hello,
> At Bangkok, the TCPM WG will have a session to discuss the loss recovery
> and congestion control mechanisms in QUIC and TCP. (11/6 Tuesday,
> 11:20-12:20)
> I really appreciate Ian's and Jana's efforts.
> I might have overlooked something, as I couldn't follow all the
> discussion in QUIC, but I think it would be good if we could get some
> comments on the following questions.
> I have listed several questions about the values of some parameters, but
> that doesn't mean I disagree with these values. I just would like to
> check where these values came from.
> Hope this will be useful in some way.
> Section 4:
>    QUIC supports both ack-based and timer-based loss detection.
>    But it's not very clear whether an implementation can choose one
> of them, or whether it should implement both and can use both at the
> same time. Can this be clarified?

Yes, there's an open issue to clarify this.

The outstanding PR is quite old, but it has some helpful clarifications in
it, so I intend to migrate the best parts to a new PR.  Suggestions (or
PRs) are welcome.

> Section 4.1.1:
>    For RTT sampling, implementations will need to maintain the
> transmission time for each sent packet, even if they support only the
> ack-based method.
>    But this requirement might be hard for tiny devices. Wouldn't it
> limit the applicability of QUIC?

Good question.  Assuming a ~1200 byte packet and a 4 byte timestamp, that's
less than 0.5% storage overhead, assuming the data being sent is buffered
for reliable delivery.  That doesn't seem like too much overhead to me, but
a really constrained implementation could store less granular information,
e.g., record that packet numbers X to X+N contain a certain sequence of
bytes and were all sent at approximately the same time.  I believe such a
storage approach would make it more similar to TCP metadata, but you likely
know better than I do.

>    "the largest newly acked packet" might be confusing. Does it mean the
> packet that acks the largest packet number?

That's the largest packet number that was newly acknowledged in the ack
frame currently being processed.  I re-read the text and I'm not sure how
to improve it, so feel free to send a PR.

> Section 4.2.1:
>    TCP RACK's reo_wnd is aiming to be 1/4 RTT. Why does QUIC use 1/8 RTT?

I ran some experiments that indicated 1/4 RTT was a slight latency
regression for web applications, and 1/8 RTT still had fewer spurious
retransmits than dupack-threshold loss detection.  I also presented
some reordering data at maprg in Montreal indicating that 1/8 RTT was
typically enough.  Obviously networks and workloads differ, and I think
the best approach is likely an adaptive one, but I don't have an adaptive
algorithm I'm confident enough in to write up.
Suggestions definitely welcome.
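For readers who want the mechanics, here is a minimal sketch of
time-threshold loss detection with a 1/8 RTT reordering window; the names
and structure are illustrative, not the draft's pseudocode:

```python
from collections import namedtuple

Packet = namedtuple("Packet", ["number", "send_time"])

def detect_lost(unacked, largest_acked_send_time, latest_rtt, smoothed_rtt):
    """Declare lost any unacked packet sent at least 1/8 RTT before
    the largest newly acked packet was sent."""
    reordering_window = max(latest_rtt, smoothed_rtt) / 8
    return [p for p in unacked
            if p.send_time <= largest_acked_send_time - reordering_window]

# Example: with a 200 ms RTT the window is 25 ms, so only packets sent
# more than 25 ms before the largest acked packet are declared lost.
unacked = [Packet(1, 0.00), Packet(2, 0.09), Packet(3, 0.10)]
lost = detect_lost(unacked, largest_acked_send_time=0.10,
                   latest_rtt=0.2, smoothed_rtt=0.2)
print([p.number for p in lost])  # → [1]
```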

> Section 4.2.2:
>    The early retransmit logic in QUIC seems to be different from TCP's.
>    It doesn't look ack-based, as it relies on a timer. (I don't say this
> is bad, but it's a bit strange to categorize it as an ack-based
> method.)
>    TCP's early retransmit is triggered only when the amount of
> outstanding data is less than 4*SMSS, but this draft doesn't mention
> that condition. Is there any reason for this?
>    When timer-based loss detection is used, is this logic still
> available? Or will it not be used with timer-based loss detection? We
> might need clarifications here.

The timer is taken from Linux, as mentioned at the end of 4.2.2.  It's
categorized as ack-based because it is only armed when a packet with a
larger packet number than the one being declared lost has been
acknowledged.  One could raise the same issue for time-based loss
detection, which also needs a timer.

Instead of 4*SMSS, we specify kReorderingThreshold, since the core issue
being fixed seemed to be that there was no other mechanism for quickly
declaring these packets lost; and if one did implement adaptive reordering
tolerance, the two thresholds wouldn't fall out of sync.

The intent is that, like RACK, timer-based loss detection subsumes early
retransmit.  But yes, this text could use further clarification, including
but not limited to the items pointed out in #1212.
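A minimal sketch of that arming condition (the delay value and the names
are illustrative assumptions, not the draft's):

```python
def early_retransmit_alarm(unacked_numbers, largest_acked, now, delay):
    """Arm the early-retransmit timer only if an acked packet has a larger
    packet number than some unacked packet -- this ack-triggered arming is
    what makes the mechanism "ack-based" despite using a timer."""
    if any(n < largest_acked for n in unacked_numbers):
        return now + delay      # alarm fire time
    return None                 # nothing was skipped; timer stays unarmed

print(early_retransmit_alarm([4, 5], largest_acked=6, now=10.0, delay=0.05))
print(early_retransmit_alarm([7], largest_acked=6, now=10.0, delay=0.05))
```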

> Section 4.3.2:
>    TCP RACK's PTO is aiming to be 2 * SRTT. Why does QUIC use 1.5 * SRTT?

This arose because TCP uses 1.5 * SRTT + MaxAckDelay if there is only one
MSS outstanding.  QUIC is still converging on the optimal algorithm for
sending ACKs and for indicating that to the peer, so QUIC explicitly
specifies MaxAckDelay in the handshake and always includes MaxAckDelay in
the PTO.  We could go back to the approach of trying to determine whether a
fast ACK is expected and use 2 * SRTT in that case, if we thought it was
better, but this is a bit simpler.
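As a sketch, the resulting timeout (with the usual exponential backoff on
consecutive PTOs; names and units are illustrative):

```python
def probe_timeout_ms(smoothed_rtt_ms, max_ack_delay_ms, pto_count=0):
    """PTO = 1.5 * SRTT + MaxAckDelay, doubled for each consecutive PTO.
    MaxAckDelay is always included because QUIC advertises it explicitly."""
    return (1.5 * smoothed_rtt_ms + max_ack_delay_ms) * (2 ** pto_count)

# With SRTT = 100 ms and MaxAckDelay = 25 ms:
print(probe_timeout_ms(100, 25))               # → 175.0
print(probe_timeout_ms(100, 25, pto_count=2))  # → 700.0
```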

> Section 4.3.3:
>    RFC 5681 uses 1 SMSS as the size of the loss window, while QUIC uses 2.
> I understand the motivation to reduce the possibility of consecutive
> RTO events. But this may mean that when TCP and QUIC connections are
> under heavy congestion, QUIC connections will recover more quickly
> than TCP connections. We might want to think about how fair QUIC and
> TCP should be to each other.

I have some experimental code to see if 2 packets make a difference vs 1
packet.  If not, we can change QUIC to 1.  If they do make a difference,
then we should discuss what to recommend for both TCP and QUIC.

>    BTW, it might not be necessary, but doesn't QUIC need an upper
> bound for the RTO?

In practice, I'd expect a QUIC connection to idle timeout before it hit the
upper bound.  We could add one if you think it's useful, but it seemed
safer to continue with exponential backoff.

> Section 4.4:
>    Why is the default value of max_ack_delay 25ms?

It was chosen as a good general-purpose value for modern networks.  25ms
means sending an ACK for every packet only if packets are evenly spaced and
the bandwidth is about 300 kbit/s or lower, if my math is correct.  That's
definitely on the lower end of internet speeds.  Additionally, if a client
knows it's on such a slow link (e.g., 2G), it can use the MaxAckDelay
transport parameter to specify a larger ack delay.

Or, from 4.6:

"A shorter delayed ack time of 25ms was chosen because longer delayed
acks can delay loss recovery and for the small number of connections
where less than packet per 25ms is delivered, acking every packet is
beneficial to congestion control and loss recovery."
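Checking the math (the packet size here is an assumption of roughly a
full-size QUIC packet, so the exact threshold depends on it):

```python
# The link rate at or below which evenly spaced full-size packets arrive
# more than 25 ms apart, i.e., where delaying acks stops saving anything.
PACKET_BYTES = 1200        # assumed packet size
MAX_ACK_DELAY_S = 0.025

threshold_kbps = PACKET_BYTES * 8 / MAX_ACK_DELAY_S / 1000
print(f"{threshold_kbps:.0f} kbit/s")  # → 384 kbit/s, the ~300 kbit/s ballpark
```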

> Section
>    Sorry.. what is 1/4 RTT timer for early retransmit?

Sorry, I can't seem to find the reference to a 1/4 RTT timer for early
retransmit.

> Section 5.8.1:
>    Why is the recommended value 1200 bytes?

That's from the QUIC transport draft, but the intent is to make the minimum
MTU fairly large, both to limit amplification attacks and to provide enough
room for the TLS ClientHello.
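As a rough illustration of the amplification argument (the server flight
size is hypothetical; the point is only that the ratio is bounded):

```python
# A spoofing attacker must spend at least MIN_INITIAL bytes per packet,
# so the ratio of the server's unvalidated response to attacker cost --
# the amplification factor -- stays small when the minimum is large.
MIN_INITIAL = 1200          # minimum client Initial size (bytes)
server_flight = 3600        # hypothetical server response before validation

amplification = server_flight / MIN_INITIAL
print(f"amplification factor: {amplification:.1f}x")  # → 3.0x
```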