Re: [tcpm] A review for draft-ietf-tcpm-rack-09

Gorry Fairhurst <gorry@erg.abdn.ac.uk> Thu, 20 August 2020 07:28 UTC

To: Yuchung Cheng <ycheng@google.com>
Cc: tcpm IETF list <tcpm@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/frYR-o2E406s-uHD16kmtjbuMCY>

Please see again below; it looks like this answered my questions.

On 20/08/2020 02:17, Yuchung Cheng wrote:
>
>
> On Wed, Aug 19, 2020 at 12:27 AM Gorry Fairhurst <gorry@erg.abdn.ac.uk> wrote:
>
>     See below.
>
>     On 18/08/2020 21:28, Yuchung Cheng wrote:
>>
>>
>>     On Tue, Aug 18, 2020 at 6:42 AM Gorry Fairhurst
>>     <gorry@erg.abdn.ac.uk> wrote:
>>
>>         I have read draft-ietf-tcpm-rack-09 in preparation for TCPM
>>         to publish
>>         this document, and do have some comments. I note that this
>>         new rev is
>>         much easier to read this time, so thanks for the significant
>>         work to
>>         produce a clear spec, and I see also that you have already
>>         addressed my
>>         major concerns - thanks. (I'll separately catch a set of
>>         minor editorial
>>         notes on this rev).
>>
>>     Thank you for the review!
>>
>>
>>         (1) I don’t understand this clause:
>>         “  3.  The RACK reordering window SHOULD leverage that to
>>         adaptively
>>         estimate the duration of reordering events, if the receiver uses
>>         Duplicate Selective Acknowledgement (DSACK) [RFC2883].”
>>         - What does “leverage that” actually mean?
>>         - Might it mean something like: can be increased if the
>>         sender receives
>>         a Duplicate Selective Acknowledgement (DSACK) [RFC2883], that
>>         suggests
>>         the window is too small, and has resulted in spurious
>>         retransmission." …
>>         or does it mean something different?
>>
>>     It means exactly what you said :-)
>>
>>     Here is the latest rev based on Theresa's suggestion:
>>     "The RACK reordering window SHOULD adaptively increase if the
>>     sender receives the Duplicate Selective Acknowledgement (DSACK)
>>     [RFC2883], suggesting the window is too small which causes
>>     spurious retransmission."
>>
>>     Let me know if that's better?
>>
>     WFM.
>>
>>
>>         (2) I did not understand this part in Section 3.3:
>>         “However, the fact that the initial reordering window is low,
>>         and the
>>             reordering window's adaptive growth is bounded, means
>>         that there will
>>             continue to be a cost to reordering to disincentivize
>>         excessive
>>             network reordering over highly disjoint paths.  For such
>>         networks
>>             there are good alternative solutions, such as MPTCP.”
>>         - Is this intended to read as /reordering that
>>         disincentivizes excessive/
>>         - If that was intended, the spec appears to be for a single
>>         path, MPTCP
>>         would still view this path with reordering as a single path,
>>         so I do not
>>         understand how MPTCP helps unless the awareness of the
>>         disjoint paths is
>>         somehow known at the endpoints? Please explain.
>>
>>         - (You’d need a REF for MPTCP if you keep this sentence and the
>>         explanation).
>>
>>     Yes that's the intention. Will removing the last sentence (about
>>     MPTCP) help?
>>
>     That would be an easy fix that would WFM.
>
>>
>>         (3)
>>           Question in Section 7.4.2.
>>         /If the TLP
>>             sender does not receive such an indication, then it
>>         SHOULD assume
>>             that either the original data segment or the TLP
>>         retransmission were
>>             lost, for congestion control purposes./
>>         - Why is this not a MUST?
>>         - Under what conditions would it be safe to ignore a SHOULD?
>>
>>     We use SHOULD because the ACK could be lost instead of the data,
>>     so if the sender has some mechanism to detect ACK losses well, it
>>     may be safe not to assume data was lost.
>>
>>     But I agree this is over-thinking the corner cases so how about
>>
>>     If the TLP sender does not receive such an indication, then it
>>     *MUST* assume that either the original data segment, or the TLP
>>     retransmission were lost*, or their ACKs* are lost for congestion
>>     control purposes.
>>
>     I think the new text reads as safer to me.
>
>>
>>         (4) in Section 6.2. I was also left with a potential concern
>>         about
>>         “min_RTT” with respect to “a simple global minimum of all RTT
>>         measurements from the connection”.  This method results in a
>>         significant
>>         path change from low to high RTT exhibiting an invalid RTT.
>>         Even a
>>         simple windowed min-filtered estimate - mentioned here -
>>         would avoid
>>         this effect on reordering detection.
>>
>>         - I’d prefer adding a short sentence  so people know why the
>>         updates to
>>         min_RTT might be useful.
>>
>>     Good point. (Some) windowed filter is definitely preferred. How about:
>>
>>     The sender SHOULD track a windowed min-filtered estimate of recent
>>     RTT measurements to adapt when migrating to significantly longer paths,
>>     compared to a simple global minimum of all RTT measurements.
>>
>>
>     I like that proposed text. I don't think the method has to be
>     sophisticated, but some method seems the right thing to do.
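
A minimal sketch of the windowed min-filter idea from point (4), in Python. The class name `WindowedMinRtt`, the window length, and the sample values are assumptions made for illustration; none of them come from the draft. It shows why a windowed minimum recovers after a move from a low-RTT to a high-RTT path, where a global minimum would stay stuck at the old value.

```python
from collections import deque


class WindowedMinRtt:
    """Minimum RTT over the last `window` seconds (hypothetical helper)."""

    def __init__(self, window):
        self.window = window
        # (timestamp, rtt) pairs; RTT values strictly increase front to back,
        # so the front entry is always the current windowed minimum.
        self._samples = deque()

    def update(self, now, rtt):
        # Older samples that are not smaller than the new one can never be
        # the minimum again, so drop them from the back.
        while self._samples and self._samples[-1][1] >= rtt:
            self._samples.pop()
        self._samples.append((now, rtt))
        # Expire samples that have fallen out of the time window.
        while self._samples[0][0] < now - self.window:
            self._samples.popleft()
        return self._samples[0][1]


# Path change from a 20 ms path to a 100 ms path at t = 5 s, 10 s window.
f = WindowedMinRtt(window=10.0)
history = [f.update(t, rtt) for t, rtt in
           [(0.0, 0.020), (1.0, 0.030), (5.0, 0.100), (11.0, 0.100), (12.0, 0.100)]]
print(history)
```

With a global minimum the estimate would stay at 0.020 for the rest of the connection; the windowed filter forgets the old path once its samples age out.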
>>
>>
>>         (5) In section 8.1, I expected more discussion of the potential
>>         disadvantages (even if the authors seem to suggest these are not
>>         significant) :
>>
>>         * I still was left without understanding whether there is an
>>         impact from
>>         a path with a varying RTT (perhaps one that uses a link layer
>>         retransmission or access technology). My feeling is that if
>>         there is
>>         excessive variation the method could go wrong when the RTT
>>         increases,
>>         but I wonder if this is usually captured by inflating the
>>         SRTT and that
>>         more excessive variation is not common. I think this case
>>         should be
>>         discussed briefly.
>>
>>     AFAIU the scenario you are referring to is when link-layer
>>     retransmitted packets take much longer than the current SRTT to be
>>     delivered.
>     Yes.
>>     Then the RACK reordering window (and its SRTT bound) requires DSACKs
>>     and new RTT measurements to grow enough to capture this variation.
>>     Depending on the RTT variation, it may take multiple round trips
>>     to adapt. The idea is that for short flows, RACK may not be able to
>>     cope with it (and we welcome any simple idea to improve that).
>     Understood.
>>     For long flows, RACK should eventually adapt to these highly
>>     varying RTTs. Hence we've tested RACK with a highly varying degree of
>>     reordering (in time) in synthetic benchmarks. In production
>>     experiments, we have not found such high variation too common,
>>     though: the wireless radio does not alter its transmission rate too
>>     often or change its retry count dynamically. But of course, we
>>     try to avoid any mention of "empirical case frequency" :-)
>>
>>     If that addresses your question, we can add some words in the
>>     reordering design rationale about this.
>>
>     I think a few words would help - just so that if the corner case
>     is encountered people understand there was a tradeoff.
>
>
> Sure -- how about adding this right after the reordering window rules 
> in the 'reordering window adaptation section'
>
> ...
> 2. The RACK reordering window SHOULD adaptively increase if the sender 
> receives the Duplicate Selective Acknowledgement (DSACK) [RFC2883], 
> suggesting the window is too small which causes spurious retransmission.
> 3. The RACK reordering window MUST be bounded and this bound SHOULD be 
> SRTT.
>
> <new> *Rules 2 and 3 combined are required to adapt to reordering 
> caused by extended link layer recovery time described earlier. Then 
> the reordering window (and its SRTT-bound) requires DSACK and new RTT 
> measurement to increase. Depending on the RTT variation, it may take 
> multiple round trips to adapt. *</new>

> For short flows, the low initial reordering window is key to recover 
> quickly by risking spurious retransmissions. The rationale is that 
> spurious retransmissions for short flows are not expected to produce 
> excessive network traffic additionally. For long flows the design 
> tolerates reordering within a round trip. This handles reordering 
> caused by path divergence in small time scales (reordering within the 
> round-trip time of the shortest path).
>
I think you have experience here, so I'd expect you to write the correct 
thing. From my side this looks like it addresses my comment.
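
A minimal sketch of rules 2 and 3 as quoted above, in Python. The class name `ReorderingWindow`, the starting value of one quarter of min_RTT, and the step of one multiplier per DSACK are assumptions of this sketch and are not taken from this thread. Times are in milliseconds. It illustrates that the window grows only when DSACKs arrive and never exceeds SRTT, so adapting to a large RTT variation can take several round trips.

```python
class ReorderingWindow:
    """DSACK-driven reordering window bounded by SRTT (illustrative only)."""

    def __init__(self, min_rtt_ms):
        self.min_rtt_ms = min_rtt_ms
        self.mult = 1  # assumed growth multiplier, starts low

    def on_dsack(self):
        # Rule 2: a DSACK suggests the window was too small and caused a
        # spurious retransmission, so grow the window.
        self.mult += 1

    def value(self, srtt_ms):
        # Rule 3: the window is bounded, and the bound here is SRTT.
        return min(self.mult * self.min_rtt_ms / 4, srtt_ms)


wnd = ReorderingWindow(min_rtt_ms=40)
steps = [wnd.value(srtt_ms=50)]
for _ in range(6):          # six DSACKs, e.g. one per round trip
    wnd.on_dsack()
    steps.append(wnd.value(srtt_ms=50))
print(steps)
```

In this run the window climbs in 10 ms steps and then stays pinned at the 50 ms SRTT bound, which is the tradeoff discussed above: a short flow may finish before the window has grown enough.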
>
>>
>>         * As I read it, RACK can also send more retransmissions after
>>         there has
>>         been loss, where it is making spurious retransmissions - from
>>         timer
>>         events, or as a result of multiple segments from a single TSO
>>         segment
>>         being retransmitted. While it seems to be argued these are not
>>         significant, it could nonetheless be a disadvantage in some
>>         scenarios,
>>         and should be captured in section 8.1.
>>
>>     The first case, spurious RACK timer expiration, is possible. So we
>>     can append to the end of 8.1:
>>     "Another disadvantage is that the reordering timer may expire
>>     prematurely (like any other retransmission timer) and cause more
>>     spurious retransmissions, especially if DSACK is not supported".
>>
>     That would be fine for me.
>>     However RACK should not cause extra spurious retransmission with
>>     TSO (vs non-TSO) as a loss detection algorithm. Perhaps some text
>>     in TSO was not clear on this?
>>
>>
>>         * Also there seem cases where RACK allows TCP to continue to
>>         increase
>>         the congestion window upon receiving ACKs after loss, making
>>         the sender
>>         more aggressive. This is noted in 8.3, but I think should be
>>         listed in
>>         the disadvantages in 8.1.
>>
>>     Section 8.3 clearly states RACK can make C.C. more aggressive.
>>     Disadvantage or not depends on how one looks at it. IMO really
>>     beyond the scope of this doc.
>
>     :-)
>
>     Indeed, how you view this will depend on what is regarded as best.
>     I think it's kind of obvious there might be CC implications, maybe
>     there is no need to discuss that.
>
>>
>>         (6) Section 8.4 talks about ACK loss or a delayed ACK without
>>         a DSACK,
>>         but does not mention Stretch ACKs, which have also been
>>         experienced, and
>>         need some form of mention!
>>
>>     There are two kinds of stretched ACKs I know of: (a) ACKs delayed
>>     by the receiver in the hope of accumulating more, and (b) ACKs
>>     "compressed" or decimated by the receiver or middleboxes.
>>     It's the former (a) that affects RACK, so we use delayed ACK in
>>     the general delayed stretched ACK form.
>     So, on this I think any stretch ACK caused by ACK delay, loss in
>     the network, or intentional dropping has the same effect of
>     extending the period it takes for the endpoint to observe the ACK.
>     The typical stretch ACK intervals seem to me to be a few packets,
>     so I am not sure typically this creates an issue for RACK, but if
>     ACK delay is mentioned I don't understand why stretch ACKs from
>     other cases are not equally mentioned?
>
> OK, thanks for the clarification. I propose we change to
>
> Delayed *or stretched ACKs *complicate the detection of repairs done 
> by TLP, since with such ACKs the sender *takes longer time to receive 
> fewer ACKs *than would normally be expected.
This seems to address my comment :-).
>
>>
>>         (7) Why isn't TCP-NCR (RFC4653) at least mentioned and
>>         discussed in the
>>         intro?
>>
>>     It is briefly mentioned in
>>     https://tools.ietf.org/html/draft-ietf-tcpm-rack-09#section-2.2 and
>>     discussed a bit more in
>>     https://tools.ietf.org/html/draft-ietf-tcpm-rack-09#section-8.2
>>
>     The existing discussion is fine (thanks), I did not see the RFC
>     listed in the References!
>
> Thanks for catching that. Will fix it and double-check whether there 
> are other missing refs. Probably another bug in our google-doc to rfc 
> xml conversion script...
>
>>
>>         Gorry
>>
>>
>     Best wishes,
>
>     Gorry
>
Thanks,

Gorry

-- 
G. Fairhurst, School of Engineering