Re: [tcpm] WGLC for draft-ietf-tcpm-rack-08

Yuchung Cheng <> Tue, 14 July 2020 00:18 UTC

From: Yuchung Cheng <>
Date: Mon, 13 Jul 2020 17:17:53 -0700
To: Mirja Kuehlewind <>
Cc: Michael Tuexen <>, tcpm IETF list <>

On Mon, Apr 6, 2020 at 5:41 AM Mirja Kuehlewind
<> wrote:
> Hi all,
> I reviewed this document and I think content-wise this is ready to move on (just some small questions below).
Thank you for the review. Sorry it took so long to respond. We have
made lots of changes based on your and other comments (nearly a

> I would like to see another editorial pass on the document. There are more detail comments below but I think the main points are a) provide a more detailed algorithm overview section (either in section 3 or beginning of section 7 to make it easier to understand the steps in the subsections in 7 and b) have an own main section for TLP and introduce parameters used for TLP also in section 6 and mention TLP in the intro already. It's still obvious that TLP was squeezed in later on....

Thank you very much. We rewrote the first four sections so that
RACK-TLP is an integrated design. We also added a protocol diagram to

> Further I think the security considerations could be more extensive: it could a) mention again some of the reordering consideration in section 4 (or maybe some of the text can be moved there entirely to not break the reading flow so much at the beginning of the doc) and b) say more about the impact of this algorithm on spurious retransmission (I guess it can both increase or decrease the number of spurious retransmissions in certain scenarios) and c) mention again the impact on congestion control as described in section 8.5 that a window reduction can happen both earlier or later in different scenarios.

We are not sure how reordering considerations, congestion control, and
spurious retransmission relate to security. We did add this text:
"   RACK-TLP algorithm behavior is based on information conveyed in SACK
   options, so it has security considerations similar to those described
   in the Security Considerations section of [RFC6675]."

> Here are my technical questions/comments on section 7.2:
> For this sentence:
> "If no reordering has been observed, RACK uses
>    RACK.reo_wnd of 0 during loss recovery, in order to retransmit
>    quickly, or when the number of DUPACKs exceeds the classic DUPACK
>    threshold."
> And the respective pseudo code here:
>        If RACK.reord is FALSE:
>            If in loss recovery:  /* If in fast or timeout recovery */
>                RACK.reo_wnd = 0
>                Return
>            Else if RACK.pkts_sacked >= RACK.dupthresh:
>                RACK.reo_wnd = 0
>                Return
> My understanding here is that the RACK-based time threshold should not be used if in recovery or if no reordering is detected, right?

Correct. A zero RACK reordering window essentially makes the
retransmission immediate, without any further reordering accommodation.
> Why is that realized by setting the RACK.reo_wnd to zero rather than implementing this logic in RACK_detect_loss()? I think the difference would be that you don't reset RACK.reo_wnd but keep the previous state about adjustments. But wouldn't that be good/better?

> My point it that the way it is described now confused me a bit and I'm looking for a way to better describe that part...

To reduce confusion we moved the RACK.reo_wnd computation inside
RACK_detect_loss. RACK.reo_wnd is now recomputed on every ACK. Does the
new pseudocode help?
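
A rough Python sketch of the idea, for illustration only (this is not the
draft's pseudocode). The names RackState, reo_wnd and reo_wnd_mult are
hypothetical, and DUP_THRESH = 3 assumes the classic RFC 6675 value.
The min_RTT / 4 base is the starting value mentioned later in this mail.

```python
from dataclasses import dataclass

DUP_THRESH = 3  # assumed classic DupThresh value from RFC 6675


@dataclass
class RackState:
    reord_seen: bool = False   # has any reordering been observed so far?
    in_recovery: bool = False  # in fast or timeout (RTO) recovery?
    pkts_sacked: int = 0       # number of segments currently SACKed
    min_rtt: float = 0.0       # minimum RTT estimate, in seconds
    reo_wnd_mult: int = 1      # hypothetical multiplier kept across ACKs


def reo_wnd(state: RackState) -> float:
    """Return the reordering window (seconds) to use for the current ACK."""
    if not state.reord_seen:
        # No reordering seen: retransmit quickly during recovery, or once
        # the classic DUPACK threshold is reached.
        if state.in_recovery or state.pkts_sacked >= DUP_THRESH:
            return 0.0
    # Otherwise allow a fraction of the minimum RTT for reordering.
    return state.reo_wnd_mult * state.min_rtt / 4
```

Because the window is derived on each ACK rather than stored and reset,
any adaptation state (here the multiplier) survives the periods where a
zero window is in effect.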

> Also why is RACK.dupthresh a separate parameter in the RACK name space instead of using the "existing" one (as you do with Packet. xmit_ts)? Further this is actually not a parameter that is adjusted by the algorithm but a configured constant. It be helpful for the reader to indicate this in the naming, e.g. call it DUPTHRESH instead. In general I'm actually not sure that using the RACK name space in this document is really helpful.

We revised the text and replaced all RACK.dupthresh with the RFC6675 ‘DupThresh’ for clarity.

> And another unrelated question but on the same part of the algorithm: where does the number 16 come from for the number of recovery cycles to keep any adaptation?

Good question. We added the rationale in the draft.

If the reordering is temporary then a large adapted reordering window
would slow the loss recovery after the reordering disappears.
Therefore the inflated RACK.reo_wnd persists for 16 loss recoveries,
after which it resets to its starting value, min_RTT / 4. The downside
of resetting the reordering window is false loss recovery if the
reordering remains high. The rationale is to bound such false
recoveries to at most once every 16 recoveries (less than 7%).
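
The persistence rule can be sketched roughly as follows. This is a
minimal illustration, not the draft's pseudocode: the class and method
names are hypothetical, and the trigger for growing the window
(on_spurious_retransmit) and the +1 increment are assumptions made only
so the reset logic has something to act on.

```python
REO_WND_PERSIST = 16  # loss recoveries an adapted window survives


class ReoWndAdapter:
    def __init__(self):
        self.mult = 1     # window is mult * min_RTT / 4
        self.persist = 0  # recoveries left before the window resets

    def on_spurious_retransmit(self):
        """Assumed trigger: evidence that the window was too small."""
        self.mult += 1
        self.persist = REO_WND_PERSIST

    def on_recovery_exit(self):
        """Count one finished loss recovery; reset after the 16th."""
        if self.persist > 0:
            self.persist -= 1
            if self.persist == 0:
                self.mult = 1  # back to the starting value

    def window(self, min_rtt):
        return self.mult * min_rtt / 4


# If reordering stays high, each reset can cost at most one false
# recovery per 16 recoveries: 1 / 16 = 6.25%.
FALSE_RECOVERY_BOUND = 1 / REO_WND_PERSIST
```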

> Then my more detailed (small and larger) editorial comments:
> 1) In the intro you use  a couple of times "we" which is maybe less common in RFCs as an RFC is a product of a wg not only of the author group. Maybe you can rephrase this.

Good idea. Replaced or removed all uses of “we”.

> 2) also in the intro:
> "The goal of RACK is to solve all the problems..."
> Maybe
> "The goal of RACK is to address all the problems..."
> 3) Section 3 provides rather a high level overview. It would be nice to get more of an overview about the algorithm itself, e.g. explaining the different step described in section 7 on a high level. Or maybe you could add alternatively more text at the beginning of section 7 explaining how the different subsections "fit together". Actually doing both would also not hurt as some redundancy where you go step by step in more details is often really helpful for the reader.

Good idea. We added an algorithm diagram in the new ‘RACK-TLP high-level
design’ section.

> 4) Section 4 I first thought that some of this text could actually go into the security considerations, however, I guess there are also some requirement here, so not sure if that could be splitted up somehow. I think the problem I have here is that is breaks quite a bit the reading flow between the overview section and the actual algorithm part.

We rewrote section 4. Hopefully it’s more clear now.

> 5) I find the naming of both RACK.reo_wnd and RACK.reo_wnd_incr a bit confusing, as RACK.reo_wnd is a time  frame (and I would expect a packet window based on the naming, given we have the congestion window or the received window) and RACK.reo_wnd_incr is a multiplier (and I would expect some value that would be added when "incr" is used)

Changed RACK.reo_wnd_incr to RACK.reo_wnd_mult.

> 6) 7.2:
> "   Sometimes the timestamps of RACK.Packet and Packet could carry the
>    same transmit timestamps due to clock granularity or segmentation
>    offloading (i.e. the two packets were handed to the NIC as a single
>    unit).  In that case the sequence numbers of RACK.end_seq and
>    Packet.end_seq are compared to break the tie."
> The textual description is not fully clear to me (pseudo code is fine). I guess this text is in the context of update RACK.Packet itself, right? Maybe that the missing part that should be spelled out.

Sorry for the unclear text. We added a goal in the step and removed the
offload text. The tie breaker is not necessarily due to segmentation
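
For illustration, the tie-break quoted above can be sketched as a small
comparison helper. The function name sent_after and its argument names
are hypothetical, and the sketch assumes plain integers, so it ignores
TCP sequence number wraparound.

```python
def sent_after(xmit_ts1, end_seq1, xmit_ts2, end_seq2):
    """Return True if packet 1 was sent after packet 2.

    Transmit timestamps decide first. When they are equal, the ending
    sequence numbers break the tie.
    """
    if xmit_ts1 != xmit_ts2:
        return xmit_ts1 > xmit_ts2
    return end_seq1 > end_seq2
```

RACK.Packet would then only be replaced by a newly acknowledged packet
for which this comparison against the current RACK.Packet holds.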

> 7) I would recommend to have section 7.3. to 7.6 in their own new main section as it is an supplemental but separate algorithm. Further I would also recommend to mention Tail Loss Probe also in the abstract and intro.

Great suggestions. TLP now has its own section, and it is integrated
from the beginning (starting at the abstract).

> 8) Section 7.5: At least SRTT was already used earlier on in the doc, so maybe just add all these definitions to section 6.1?

Added SRTT in the

> 9) Section  8.1:
> " The examples above show that RACK is particularly useful when the
>    sender is limited by the application, which is common for
>    interactive, request/response traffic."
> I think that is actually an important point to make in the introduction of the document.

Agreed. Now mentioned in

> 10) Also section 8.2:
> "   RACK can easily and optionally support the conventional approach in
>    [RFC6675][RFC5681] by resetting the reordering window to zero when
>    the threshold is met.  Note that this approach differs slightly from
>    [RFC6675] which considers a packet lost when at least DupThresh
>    higher-sequence packets are SACKed. ..."
> Also this information would be really helpful in the overview section already.

We decided to have a dedicated section about reordering window value
selection. The material was scattered all over the document, making it
hard for readers to piece together the policy. Hopefully this section
makes it clear how RACK reo_wnd can be used to support the 3-dupack rule.
> 11) Also on both sections 8.1. and 8.2, I find the section titles not very well suitable. I would maybe suggest to rearrange the content a bit and have a section "Examples where RACK (and TLP) is beneficial" and another section on "Implementation considerations". At least for  the examples section I would recommend to use an own subsection.

We now provide two more examples early on in the Design section.

> 12) Maybe rename section 8.3 to "Reasoning for basic reordering window size" (instead of " Adjusting the reordering window"...?
All reordering design rationale / policy is now in to
avoid fragmented design reasonings.

> 13) Section 8.4:
> " Furthermore, RACK naturally works well with Tail Loss Probe [TLP]"
> This reference should be removed as content was integrated into this document. Maybe the whole paragraph should be rephrased, removed entirely, or moved into the TLP section.

agreed. removed

> 14) Section 8.5.1:
> " Without RACK, the sender would time out, ..."
> I suggest to say "Without RACK and TLP,..." as TLP is described as a separate optional algorithm but TLP is the important bit here. Also If you decide to have a separate example section, I would suggest to move this example there as well.
Removed this, as RACK-TLP is now described as one unified algorithm.

> 15) Should section 9 stay in the final RFC? If so, I recommend to move it into the appendix. However, it would also be nice to also provide results on the number of retransmissions (have they increased with RACK and TLP?). Alternatively I suggest removing this section and point to some paper maybe?
Agreed. Removed. There wasn't a research paper for RACK-TLP, but we
probably should have published one :-)

> 16) I think the security considerations could be more extensive, covering impact on reordering, spurious retransmission, and congestion control, as I said at the top of this mail.
We are not sure if these are tied to security, or maybe there are
specific issues we are missing?

> Thanks!
> Mirja
> On 31.03.20, 00:28, "tcpm on behalf of Michael Tuexen" < on behalf of> wrote:
>     Dear all,
>     just a quick reminder:
>     We are currently running a WGLC for TCP RACK until Tuesday, April 7th 2020.
>     Please send any comments, including indications to support this document,
>     to the TCMP mailing list by then.
>     The ID is available at
>     Best regards
>     Michael