Re: questions on draft-ietf-quic-recovery

Ian Swett <ianswett@google.com> Wed, 31 October 2018 01:34 UTC

From: Ian Swett <ianswett@google.com>
Date: Tue, 30 Oct 2018 21:34:39 -0400
Message-ID: <CAKcm_gP-CrgUYEU4Vjdeo2sjpsZvebE-NLXGAL97ckntVwvhOQ@mail.gmail.com>
Subject: Re: questions on draft-ietf-quic-recovery
To: nishida@sfc.wide.ad.jp
Cc: IETF QUIC WG <quic@ietf.org>, tcpm@ietf.org, Jana Iyengar <jri.ietf@gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/n4i5xMmIxmBvnakWIqmBQHFY5ko>

Thanks for your careful reading; answers inline.
On Mon, Oct 29, 2018 at 1:03 PM Yoshifumi Nishida <nishida@sfc.wide.ad.jp>
wrote:

> Hello,
>
> At Bangkok, TCPM WG will have a session to discuss loss recovery and
> congestion control mechanism in QUIC and TCP. (11/6 Tuesday
> 11:20-12:20)
> I really appreciate Ian and Jana for the efforts.
>
> I might overlook something as I couldn't follow up all discussion in
> QUIC, but I am thinking that it might be good if we could get some
> comments on the following questions.
> I have listed several questions on the value of some parameters, but
> it doesn't mean I disagree with these values. I just would like to
> check where these values came from.
> Hope this will be useful in some ways.
>
>
> Section 4:
>    QUIC supports both ack and timer for loss detection.
>    But, it's not very clear whether an implementation can choose one
> of them or it should implement both and can use both at the same time.
> Can it be clarified?
>

Yes, there's an open Issue to clarify this.
https://github.com/quicwg/base-drafts/issues/1212

The outstanding PR is quite old but has some helpful clarifications in it,
so I intend to migrate the best parts to a new PR.  Suggestions (or
PRs) welcome.


> Section 4.1.1:
>    For RTT sampling, implementations will need to maintain the
> transmission time for each packet sent, even if they support only
> the ack-based method.
>    But, this requirement might be hard for tiny devices. Wouldn't it
> limit the applicability of QUIC?
>

Good question.  Assuming a ~1200 byte packet and a 4 byte timestamp, that's
<0.5% additional storage overhead (4/1200 is about 0.33%), assuming the data
being sent is reliable and so has to be kept until it is acknowledged anyway.
That doesn't seem like too much overhead to me, but one could certainly
store less granular information if they were really constrained, i.e.,
packet numbers X to X+N all contain a certain sequence of bytes and
were sent at approximately the same time.  I believe such a storage
approach would make it more similar to TCP metadata, but you likely know
better than I do.
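
A minimal sketch of the storage arithmetic above. The 1200-byte packet and 4-byte timestamp come from the reply; the function names and the 16-packet range size are hypothetical, chosen only to illustrate the coarser "one timestamp per range of packet numbers" idea.

```python
def per_packet_overhead(packet_bytes: int = 1200, timestamp_bytes: int = 4) -> float:
    """Extra storage, as a fraction of the retained packet data,
    when one send timestamp is stored per packet."""
    return timestamp_bytes / packet_bytes


def per_range_overhead(packets_per_range: int,
                       packet_bytes: int = 1200,
                       timestamp_bytes: int = 4) -> float:
    """Extra storage when a single approximate timestamp covers
    packet numbers X to X+N (here N + 1 == packets_per_range)."""
    return timestamp_bytes / (packets_per_range * packet_bytes)


if __name__ == "__main__":
    print(f"one timestamp per packet:     {per_packet_overhead():.2%}")
    print(f"one timestamp per 16 packets: {per_range_overhead(16):.3%}")
```

The coarser record trades RTT sample precision for memory, which is the trade-off the reply alludes to.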


>
>    "the largest newly acked packet" might be confusing. packet that
> acks the largest packet number?
>

That's the packet with the largest packet number among those acknowledged
in the ACK frame currently being processed.  I re-read the text and I'm not
sure how to improve it, so feel free to send a PR.
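
One way to read "largest newly acked packet" is sketched below. It assumes that "newly" excludes packets already acknowledged by earlier frames, and it represents an ACK frame as a list of inclusive (low, high) packet-number ranges; both the representation and the function name are assumptions for illustration, not taken from the draft.

```python
from typing import Iterable, Optional, Set, Tuple


def largest_newly_acked(ack_ranges: Iterable[Tuple[int, int]],
                        outstanding: Set[int]) -> Optional[int]:
    """Return the largest packet number covered by this ACK frame that was
    still outstanding (sent but not yet acknowledged), or None if the
    frame acknowledges nothing new."""
    newly_acked = [
        pn
        for low, high in ack_ranges
        for pn in range(low, high + 1)
        if pn in outstanding
    ]
    return max(newly_acked, default=None)


if __name__ == "__main__":
    # Packet 10 is covered by the frame but was acknowledged earlier,
    # so only 5, 6 and 9 count as newly acked.
    print(largest_newly_acked([(1, 6), (8, 10)], outstanding={5, 6, 9}))
```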

>
> Section 4.2.1:
>    TCP RACK's reo_wnd is aiming to be 1/4 RTT. Why QUIC uses 1/8 RTT?
>

I ran some experiments that indicated 1/4 RTT was a slight latency
regression for web applications and 1/8 RTT still had fewer spurious
retransmits than the dupack threshold loss detection.  I also presented
some reordering data at maprg in Montreal that indicated 1/8 RTT was
typically enough.  Obviously different networks and workloads are
different, and I think the best approach is likely an adaptive one, but I
don't have an adaptive algorithm I'm confident enough in to write it up.
Suggestions definitely welcome.
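
A rough sketch of a time-threshold check with a 1/8 RTT reordering window, for comparison against 1/4 RTT. It assumes the window is simply a fraction of a single RTT estimate and that the caller only applies it to packets sent before an already acknowledged packet; the draft's exact formula may differ, and the names are hypothetical.

```python
def lost_by_time(send_time_ms: float,
                 now_ms: float,
                 rtt_ms: float,
                 reordering_fraction: float = 1 / 8) -> bool:
    """Declare a packet lost once it has been outstanding for longer than
    one RTT plus the reordering window (a fraction of the RTT).

    Assumption: a later-sent packet has already been acknowledged."""
    loss_delay_ms = rtt_ms * (1 + reordering_fraction)
    return now_ms - send_time_ms >= loss_delay_ms


if __name__ == "__main__":
    rtt = 100.0
    # With a 1/8 RTT window the packet is declared lost after 112.5 ms,
    # with a 1/4 RTT window only after 125 ms.
    print(lost_by_time(0.0, 120.0, rtt))
    print(lost_by_time(0.0, 120.0, rtt, reordering_fraction=1 / 4))
```

The smaller window retransmits sooner, at the cost of tolerating less reordering before a spurious retransmission.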

>
> Section 4.2.2:
>    The early retransmit logic in QUIC seems to be different from TCP's one.
>    It doesn't look ack-based as it relies on timer. (I don't say this
> is bad, but it's a bit strange to categorize it as an ack-based
> method)
>    TCP's early retransmit will be triggered only when the amount of
> outstanding data is less than 4*SMSS, but this draft doesn't mention
> that condition. Is there any reason for it?
>    When timer-based loss detection is used, is this logic still
> available? Or won't it be used with timer-based loss detection? We
> might need clarifications here.
>

The timer is taken from Linux, as mentioned at the end of 4.2.2.  It's
categorized as ACK based because it is only armed when a packet with a
larger packet number than the one being declared lost has been
acknowledged.  One could raise the same objection about time based loss
detection, which also needs a timer.

Instead of 4*SMSS, we specify kReorderingThreshold, since it seemed like
the core issue being fixed was that there was no other mechanism for
quickly declaring these packets lost, and if one did implement adaptive
reordering tolerance, one threshold wouldn't be out of sync with the other.

The intent is that, like RACK, timer based loss detection subsumes early
retransmit.  But yes, this text could use some further clarification,
including but not limited to the items pointed out in #1212.
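
A loose sketch of how a packet-number threshold and the early retransmit timer could fit together, under the reading given above: packets far enough behind the largest acknowledged packet are declared lost immediately, and packets behind it by less than the threshold are the ones that need the timer. The default of 3 for the threshold, the labels, and the function name are assumptions for illustration only.

```python
from typing import Dict, Iterable


def classify_unacked(unacked: Iterable[int],
                     largest_acked: int,
                     reordering_threshold: int = 3) -> Dict[int, str]:
    """Label each unacknowledged packet number:

    "lost"  - at least reordering_threshold behind largest_acked
    "timer" - behind largest_acked, but not by enough; a timer is needed
              to declare it lost if no further ACKs arrive
    "wait"  - sent after largest_acked; nothing can be concluded yet
    """
    labels = {}
    for pn in unacked:
        if pn > largest_acked:
            labels[pn] = "wait"
        elif largest_acked - pn >= reordering_threshold:
            labels[pn] = "lost"
        else:
            labels[pn] = "timer"
    return labels


if __name__ == "__main__":
    print(classify_unacked([2, 8, 9, 11], largest_acked=10))
```

Because both decisions use the same threshold, an adaptive reordering tolerance would move them together.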


>
> Section 4.3.2:
>    TCP RACK's PTO is aiming to be 2 * SRTT. Why QUIC uses 1.5 * SRTT?
>

This arose because TCP uses 1.5 * SRTT + MaxAckDelay if there is only one
MSS outstanding.  QUIC is still arriving at the optimal algorithm for
sending ACKs and how to indicate that to the peer, so QUIC explicitly
specifies MaxAckDelay in the handshake and always includes the MaxAckDelay
in PTO.  We could go back to the approach of trying to determine whether a
fast ACK is expected and use 2 * SRTT in that case, if we thought it was
better, but this is a bit simpler.
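
The two options discussed above can be compared with a small sketch. It assumes SRTT and MaxAckDelay are given in milliseconds and ignores any minimum timeout an implementation might also apply; the function names are hypothetical.

```python
def pto_with_ack_delay(srtt_ms: float, max_ack_delay_ms: float) -> float:
    """Probe timeout that always budgets for the peer's maximum ACK delay:
    1.5 * SRTT + MaxAckDelay."""
    return 1.5 * srtt_ms + max_ack_delay_ms


def pto_two_srtt(srtt_ms: float) -> float:
    """Alternative used when a fast ACK is expected: 2 * SRTT."""
    return 2 * srtt_ms


if __name__ == "__main__":
    srtt, max_ack_delay = 100.0, 25.0
    print(pto_with_ack_delay(srtt, max_ack_delay))
    print(pto_two_srtt(srtt))
```

With a 25 ms MaxAckDelay, the first form is the shorter of the two whenever SRTT exceeds 50 ms.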


>
> Section 4.3.3:
>    RFC5681 uses 1 SMSS as the size of loss window, while QUIC uses 2.
> I understand the motivation to reduce the possibility of consecutive
> RTO events. But this may mean that when TCP and QUIC connections are
> under heavy congestion, QUIC connections will recover more quickly
> than TCP connections. We might want to think how much QUIC and TCP
> should be fair.
>

I have some experimental code to see if 2 packets makes a difference vs 1
packet.  If not, we can change QUIC to 1.  If it does make a difference,
then we should discuss what to recommend for both TCP and QUIC.


>    BTW, it might not be necessary, but doesn't QUIC need an upper
> bound for RTO?
>

In practice, I'd expect a QUIC connection to idle timeout before it hit the
upper bound.  We could add one if you think it's useful, but it seemed
safer to continue with exponential backoff.
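
To illustrate why an idle timeout would usually fire before a large RTO value is reached, here is a sketch that lists the RTO intervals that elapse before the connection idles out. The 200 ms initial RTO and 30 s idle timeout are hypothetical example values, and the sketch assumes nothing is received from the peer in the meantime.

```python
from typing import List


def rto_intervals_before_idle(initial_rto_ms: int, idle_timeout_ms: int) -> List[int]:
    """Return the successive RTO timer durations (doubling each time)
    that fully elapse before the idle timeout is reached."""
    intervals = []
    elapsed_ms = 0
    rto_ms = initial_rto_ms
    while elapsed_ms + rto_ms <= idle_timeout_ms:
        intervals.append(rto_ms)
        elapsed_ms += rto_ms
        rto_ms *= 2  # exponential backoff
    return intervals


if __name__ == "__main__":
    fired = rto_intervals_before_idle(initial_rto_ms=200, idle_timeout_ms=30_000)
    print(len(fired), "RTOs fire; the longest interval is", max(fired), "ms")
```

Under these assumed values the backoff never grows past roughly half the idle timeout, so an explicit upper bound would rarely come into play.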

>
> Section 4.4:
>    why the default value of max_ack_delay is 25ms?
>

It was chosen as a good general-purpose value for modern networks.  25ms
means sending an ACK for every packet only if packets are evenly spaced and
the bandwidth is about 300 kbit/s or lower, if my math is correct.  That's
definitely on the lower end of internet speeds.  Additionally, if a client
knows they're on such a slow link (2G), they can use the MaxAckDelay
transport param to specify a larger ack delay.

Or, from 4.6:
"A shorter delayed ack time of 25ms was chosen because longer delayed
acks can delay loss recovery and for the small number of connections
where less than packet per 25ms is delivered, acking every packet is
beneficial to congestion control and loss recovery."
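
The bandwidth below which evenly spaced packets arrive less often than once per ack delay can be worked out as follows. The packet size is an assumption: with 1200-byte packets the threshold comes to 384 kbit/s, and somewhat smaller packets bring it closer to the "about 300" figure mentioned above. The function name is hypothetical.

```python
def ack_every_packet_threshold_bps(packet_bytes: int, max_ack_delay_ms: int) -> float:
    """Bandwidth (bits per second) at which exactly one packet of the given
    size is delivered per max_ack_delay. At or below this rate, a delayed
    ack timer of that length fires for every packet."""
    bits_per_packet = packet_bytes * 8
    return bits_per_packet * 1000 / max_ack_delay_ms


if __name__ == "__main__":
    for size in (940, 1200):
        kbps = ack_every_packet_threshold_bps(size, 25) / 1000
        print(f"{size}-byte packets: {kbps:.1f} kbit/s")
```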


> Section 4.5.7.3:
>    Sorry.. what is 1/4 RTT timer for early retransmit?
>

Sorry, I can't seem to find the reference to a 1/4 RTT timer for early
retransmit.

>
> Section 5.8.1:
>    Why the recommended value is 1200 bytes?
>
>

That's from QUIC transport, but the intent is to make the minimum MTU
fairly large to avoid amplification attacks as well as provide enough room
for the TLS ClientHello.