Re: [tcpm] A review for draft-ietf-tcpm-rack-09

Gorry Fairhurst <gorry@erg.abdn.ac.uk> Wed, 19 August 2020 07:27 UTC

To: Yuchung Cheng <ycheng@google.com>
Cc: tcpm IETF list <tcpm@ietf.org>
References: <CAK6E8=d512Uvz-m37pkSg5Zxoq7Unvsf9rE8c2Kz0D-O4eQjng@mail.gmail.com> <C8CD0CD8-1364-4387-93AE-D9C2C6F7FF72@erg.abdn.ac.uk> <CAK6E8=duEmVz-m0K=f3ZUAu5BL3pnH1ENCo-VjZ+7zzuuHcSog@mail.gmail.com> <227bfb55-a2b4-37f0-9802-ddc76b22fa91@erg.abdn.ac.uk> <CAK6E8=emWx=OXc6OWbGsnsq2raAadd3QnY1GhtYgm-kBJJnVZw@mail.gmail.com>
From: Gorry Fairhurst <gorry@erg.abdn.ac.uk>
Message-ID: <da2a131d-a25d-5299-da65-9d4c5db3c53f@erg.abdn.ac.uk>
Date: Wed, 19 Aug 2020 08:26:55 +0100
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:68.0) Gecko/20100101 Thunderbird/68.11.0
MIME-Version: 1.0
In-Reply-To: <CAK6E8=emWx=OXc6OWbGsnsq2raAadd3QnY1GhtYgm-kBJJnVZw@mail.gmail.com>
Content-Type: multipart/alternative; boundary="------------6C4FB28C15D806C61DB35DC9"
Content-Language: en-GB
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/6jk78QQ-CVR8lJLwMTF5vxJZCBg>
Subject: Re: [tcpm] A review for draft-ietf-tcpm-rack-09

See below.

On 18/08/2020 21:28, Yuchung Cheng wrote:
>
>
> On Tue, Aug 18, 2020 at 6:42 AM Gorry Fairhurst <gorry@erg.abdn.ac.uk> wrote:
>
>     I have read draft-ietf-tcpm-rack-09 in preparation for TCPM to publish
>     this document, and do have some comments. I note that this new rev is
>     much easier to read this time, so thanks for the significant work to
>     produce a clear spec, and I see also that you have already addressed my
>     major concerns - thanks. (I'll separately catch a set of minor editorial
>     notes on this rev).
>
> Thank you for the review!
>
>
>     (1) I don’t understand this clause:
>     “  3.  The RACK reordering window SHOULD leverage that to adaptively
>     estimate the duration of reordering events, if the receiver uses
>     Duplicate Selective Acknowledgement (DSACK) [RFC2883].”
>     - What does “leverage that” actually mean?
>     - Might it mean something like: can be increased if the sender
>     receives
>     a Duplicate Selective Acknowledgement (DSACK) [RFC2883], that
>     suggests
>     the window is too small, and has resulted in spurious
>     retransmission." …
>     or does it mean something different?
>
> It means exactly what you said :-)
>
> Here is the latest rev based on Theresa's suggestion:
> "The RACK reordering window SHOULD adaptively increase if the sender 
> receives the Duplicate Selective Acknowledgement (DSACK) [RFC2883], 
> suggesting the window is too small which causes spurious retransmission."
>
> Let me know if that's better?
>
WFM.
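
For readers following the thread, a minimal Python sketch of the behaviour agreed above: the reordering window grows when a DSACK arrives and stays bounded. The class name, the min_RTT/4 step and the one-step-per-DSACK rule are assumptions made for illustration only. The thread itself says just that the window should increase on DSACK and that its growth is bounded (an SRTT bound is mentioned under point (5)).

```python
from dataclasses import dataclass


@dataclass
class ReoWindow:
    """Hypothetical reordering-window state; all times are in seconds."""
    min_rtt: float
    srtt: float
    mult: int = 1  # assumed growth multiplier, starts low

    def on_dsack(self) -> None:
        # A DSACK suggests a spurious retransmission, i.e. the window
        # was too small, so widen it by one (assumed) step.
        self.mult += 1

    def value(self) -> float:
        # Assumed base step of min_RTT/4, with growth capped at SRTT.
        return min(self.mult * self.min_rtt / 4, self.srtt)


w = ReoWindow(min_rtt=0.040, srtt=0.050)
initial = w.value()
w.on_dsack()
after_one_dsack = w.value()
```

With these assumed numbers the window starts at a fraction of min_RTT, widens after each DSACK, and never exceeds SRTT however many DSACKs arrive.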
>
>
>     (2) I did not understand this part in Section 3.3:
>     “However, the fact that the initial reordering window is low, and the
>         reordering window's adaptive growth is bounded, means that
>     there will
>         continue to be a cost to reordering to disincentivize excessive
>         network reordering over highly disjoint paths.  For such networks
>         there are good alternative solutions, such as MPTCP.“
>     - Is this intended to read as /reordering that disincentivizes
>     excessive/
>     - If that was intended, the spec appears to be for a single path,
>     MPTCP
>     would still view this path with reordering as a single path, so I
>     do not
>     understand how MPTCP helps unless the awareness of the disjoint
>     paths is
>     somehow known at the endpoints? Please explain.
>
>     - (You’d need a REF for MPTCP if you keep this sentence and the
>     explanation).
>
> Yes that's the intention. Will removing the last sentence (about 
> MPTCP) help?
>
That would be an easy fix that would WFM.

>
>     (3)
>       Question in Section 7.4.2.
>     /If the TLP
>         sender does not receive such an indication, then it SHOULD assume
>         that either the original data segment or the TLP
>     retransmission were
>         lost, for congestion control purposes./
>     - Why is this not a MUST?
>     - Under what conditions would it be safe to ignore a SHOULD?
>
> We use SHOULD because the ACK could be lost instead of the data, so if 
> the sender has some mechanism to detect ACK losses well, it may be 
> safe not to assume data was lost.
>
> But I agree this is over-thinking the corner cases so how about
>
> If the TLP sender does not receive such an indication, then it *MUST* 
> assume that either the original data segment, or the TLP 
> retransmission were lost*, or their ACKs* are lost for congestion 
> control purposes.
>
I think the new text reads as safer to me.

>
>     (4) In Section 6.2, I was also left with a potential concern about
>     “min_RTT” with respect to “a simple global minimum of all RTT
>     measurements from the connection”. With this method, a significant
>     path change from low to high RTT leaves a min_RTT that is no longer
>     valid. Even a simple windowed min-filtered estimate - mentioned here -
>     would avoid this effect on reordering detection.
>
>     - I’d prefer adding a short sentence so people know why the updates to
>     min_RTT might be useful.
>
> Good point. Some windowed filter is definitely preferred. How about:
>
> The sender SHOULD track a windowed min-filtered estimate of recent RTT
> measurements to adapt when migrating to significantly longer paths,
> compared to a simple global minimum of all RTT measurements.
>
>
I like that proposed text. I don't think the method has to be 
sophisticated, but some method seems the right thing to do.
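
To illustrate why even a simple filter helps, here is a rough Python sketch of a windowed min filter next to a global minimum. The class name, the 10-second window and the RTT values are hypothetical choices for the example, not taken from the draft.

```python
from collections import deque


class WindowedMinRTT:
    """Minimum RTT over the last `window` seconds (window must be > 0)."""

    def __init__(self, window: float) -> None:
        self.window = window
        self.samples = deque()  # (time, rtt) pairs, rtt increasing left to right

    def update(self, now: float, rtt: float) -> float:
        # Older samples that are not smaller can never be the minimum again.
        while self.samples and self.samples[-1][1] >= rtt:
            self.samples.pop()
        self.samples.append((now, rtt))
        # Expire samples that have fallen out of the time window.
        while self.samples[0][0] < now - self.window:
            self.samples.popleft()
        return self.samples[0][1]


def demo():
    """Path changes from a 10 ms RTT to a 100 ms RTT at t = 5 s."""
    filt = WindowedMinRTT(window=10.0)
    global_min = float("inf")
    windowed = None
    for t in range(21):
        rtt = 0.010 if t < 5 else 0.100
        global_min = min(global_min, rtt)
        windowed = filt.update(float(t), rtt)
    return global_min, windowed
```

In this made-up trace the global minimum stays at the old path's 10 ms for the rest of the connection, while the windowed estimate moves to 100 ms once the old samples age out.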
>
>
>     (5) In section 8.1, I expected more discussion of the potential
>     disadvantages (even if the authors seem to suggest these are not
>     significant) :
>
>     * I still was left without understanding whether there is an
>     impact from
>     a path with a varying RTT (perhaps one that uses a link layer
>     retransmission or access technology). My feeling is that if there is
>     excessive variation the method could go wrong when the RTT increases,
>     but I wonder if this is usually captured by inflating the SRTT and
>     that
>     more excessive variation is not common. I think this case should be
>     discussed briefly.
>
> AFAIU the scenario you are referring to is when link-layer 
> retransmitted packets take much longer than the current SRTT to be delivered.
Yes.
> Then the RACK reordering window (and its SRTT bound) requires DSACKs and 
> new RTT measurements to grow enough to capture this variation. Depending 
> on the RTT variation, it may take multiple round trips to adapt. For 
> short flows, RACK may not be able to cope with it (and we 
> welcome any simple idea to improve that).
Understood.
> For long flows, RACK should eventually adapt to these highly varying 
> RTTs. Hence we've tested RACK with highly varying degrees of reordering 
> (in time) in synthetic benchmarks. In production experiments, we have 
> not found such high variation to be common, though: the wireless radio 
> does not alter its transmission rate too often or change its retry count 
> dynamically. But of course, we try to avoid any mention of "empirical 
> case frequency" :-)
>
> If that addresses your question, we can add some words in the 
> reordering design rationale about this.
>
I think a few words would help - just so that if the corner case is 
encountered people understand there was a tradeoff.
>
>
>     * As I read it, RACK can also send more retransmissions after there has
>     been loss, where it is making spurious retransmissions - from timer
>     events, or as a result of multiple segments from a single TSO segment
>     being retransmitted. While it seems to be argued these are not
>     significant, it could nonetheless be a disadvantage in some scenarios,
>     and should be captured in section 8.1.
>
> The first case, spurious RACK timer expiration, is possible. So we can 
> append this to the end of 8.1:
> "Another disadvantage is the reordering timer may expire prematurely 
> (like any other retransmission timer) to cause higher spurious 
> retransmission especially if DSACK is not supported".
>
That would be fine for me.
> However RACK should not cause extra spurious retransmission with TSO 
> (vs non-TSO) as a loss detection algorithm. Perhaps some text in TSO 
> was not clear on this?
>
>
>     * Also there seem cases where RACK allows TCP to continue to increase
>     the congestion window upon receiving ACKs after loss, making the
>     sender
>     more aggressive. This is noted in 8.3, but I think should be
>     listed in
>     the disadvantages in 8.1.
>
> Section 8.3 clearly states RACK can make C.C. more aggressive. 
> Disadvantage or not depends on how one looks at it. IMO really beyond 
> the scope of this doc.

:-)

Indeed, how you view this will depend on what is regarded as best. I 
think it's kind of obvious there might be CC implications, maybe there 
is no need to discuss that.

>
>     (6) Section 8.4 talks about ACK loss or a delayed ACK without a
>     DSACK,
>     but does not mention Stretch ACKs, which have also been
>     experienced, and
>     need some form of mention!
>
> There are two kinds of stretched ACKs I know of: (a) ACKs delayed by 
> the receiver in the hope of accumulating more. (b) ACKs "compressed" or 
> decimated by the receiver or middle-boxes.
> It's the former (a) that affects RACK, so we use "delayed ACK" in the 
> general sense that includes delayed stretched ACKs.
So, on this I think any stretch ACK caused by ACK delay, loss in the 
network, or intentional dropping has the same effect of extending the 
period it takes for the endpoint to observe the ACK. The typical stretch 
ACK intervals seem to me to be a few packets, so I am not sure this 
typically creates an issue for RACK, but if ACK delay is mentioned I don't 
understand why stretch ACKs from other cases are not equally mentioned.
>
>
>     (7) Why isn't TCP-NCR (RFC4653) at least mentioned and discussed
>     in the
>     intro?
>
> It is briefly mentioned in 
> https://tools.ietf.org/html/draft-ietf-tcpm-rack-09#section-2.2 and 
> discussed a bit more in 
> https://tools.ietf.org/html/draft-ietf-tcpm-rack-09#section-8.2
>
The existing discussion is fine (thanks), I did not see the RFC listed 
in the References!
>
>
>     Gorry
>
>
Best wishes,

Gorry