Re: [tcpm] Review of draft-ietf-tcpm-rack-07

Ilpo Järvinen <ilpo.jarvinen@cs.helsinki.fi> Mon, 06 April 2020 21:02 UTC

Return-Path: <ilpo.jarvinen@cs.helsinki.fi>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 3FB693A0D7D for <tcpm@ietfa.amsl.com>; Mon, 6 Apr 2020 14:02:12 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2
X-Spam-Level:
X-Spam-Status: No, score=-2 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (1024-bit key) header.d=cs.helsinki.fi
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Q5SZIbhdQ7yq for <tcpm@ietfa.amsl.com>; Mon, 6 Apr 2020 14:02:09 -0700 (PDT)
Received: from script.cs.helsinki.fi (script.cs.helsinki.fi [128.214.11.1]) (using TLSv1.2 with cipher AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 8CE763A0CB6 for <tcpm@ietf.org>; Mon, 6 Apr 2020 14:01:54 -0700 (PDT)
X-DKIM: Courier DKIM Filter v0.50+pk-2017-10-25 mail.cs.helsinki.fi Tue, 07 Apr 2020 00:01:46 +0300
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cs.helsinki.fi; h=date:from:to:cc:subject:in-reply-to:message-id:references :mime-version:content-type; s=dkim20130528; bh=HRvbP3gBVSFXnuZaf 5k6QDURlLo2vO5IAayDPs6w9l0=; b=WY4fYfNumU3pzVALdg7DQ3KoLcVUPYrJh 4ZEi8YVOqC46F4JV0J+1DyU+k94+JZPT1NJzdJUXWEORXWenXQ5knbrMhaQoHjIL Oai05VrdtHCUWpbEvbN1p4LSbsUTDtTiIq2Ywb0rvhlgmHpDAgvEI6Q5f4V26Y9c Iqa+SSdciU=
Received: from whs-18.cs.helsinki.fi (whs-18.cs.helsinki.fi [128.214.166.46]) (TLS: TLSv1/SSLv3,256bits,AES256-GCM-SHA384) by mail.cs.helsinki.fi with ESMTPS; Tue, 07 Apr 2020 00:01:46 +0300 id 00000000005A014E.000000005E8B98BA.00003315
Date: Tue, 07 Apr 2020 00:01:46 +0300
From: Ilpo Järvinen <ilpo.jarvinen@cs.helsinki.fi>
X-X-Sender: ijjarvin@whs-18.cs.helsinki.fi
To: Yuchung Cheng <ycheng@google.com>
cc: "tcpm@ietf.org Extensions" <tcpm@ietf.org>, Neal Cardwell <ncardwell@google.com>, Nandita Dukkipati <nanditad@google.com>, Priyaranjan Jha <priyarjha@google.com>
In-Reply-To: <CAK6E8=dZN5PnCt7QFoyGJUnSoCcg0XUKxyZCbJAM_2tVN7A8wQ@mail.gmail.com>
Message-ID: <alpine.DEB.2.20.2004062342530.14114@whs-18.cs.helsinki.fi>
References: <alpine.DEB.2.20.2002141414420.25645@whs-18.cs.helsinki.fi> <CAK6E8=dZN5PnCt7QFoyGJUnSoCcg0XUKxyZCbJAM_2tVN7A8wQ@mail.gmail.com>
User-Agent: Alpine 2.20 (DEB 67 2015-01-07)
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=_script-13103-1586206906-0001-2"
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/wzBTx52UD6UEBqGDDWhLK6BcdVU>
Subject: Re: [tcpm] Review of draft-ietf-tcpm-rack-07
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/tcpm/>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 06 Apr 2020 21:02:16 -0000

Hi Yuchung,

Certainly improved on many things, however, I'm surprised that the lost 
rexmit comment was not addressed at all. Aren't the rexmit losses 
supposed to be detected by RACK or have I misunderstood something?

I don't understand how lost rexmits are detected by the algorithms in
the draft. I think the algorithms are either flawed or lack something 
(mainly RACK_detect_loss is relevant I think). My original comment
from below.

       Is RACK_detect_loss() supposed to detect also lost rexmits? How?
       The rexmit code itself is not in the draft so it's hard to evaluate
       how the Packet booleans are supposed to work in the first place.
       I'm not saying the rexmit code should be included but currently it
       surely looks like any rexmitted packet just remains .lost = TRUE
       and .retransmitted = TRUE state until RTO?

Also my timeout "some" vs "all" internal inconsistency comment relates to 
this same algorithm:

         "For timely loss detection, the sender MAY
          install a "reordering settling" timer set to fire at the earliest
          moment at which it is safe to conclude that some packet is lost.  The
          earliest moment is the time it takes to expire the reordering window
          of the earliest unacked packet in flight."
       vs
          RACK_detect_loss():
             ...
                 timeout = max(remaining, timeout)
       Which way it is, "some" or "all"? The use of max() in RACK_detect_loss()
       seems to indicate that the timeout will be set according to the longest
       time, that is, the latest moment rather than earliest moments. Which
       is the intented behavior here? (I can there might be merits in both
       approaches).


Then there are some other comments I made which seem also unaddressed but 
I'll not duplicate them all here.


-- 
 i.


On Mon, 9 Mar 2020, Yuchung Cheng wrote:

> Hi Ilpo,
> Thanks for a detailed review and the great suggestions. Here's all the edits
> we made.
> https://www.ietf.org/rfcdiff?url2=draft-ietf-tcpm-rack-08
> 
>  We defined the missing terms and addressed the nits, removed the less
> critical parts like recommending PRR, and edited TLP algorithms (sections
> 7.4 to 7.6). I hope you find it's better now and would be happy to hear any
> improvements you'd like to make.
> 
> Yuchung
> 
> On Fri, Feb 14, 2020 at 4:22 AM Ilpo Järvinen <ilpo.jarvinen@cs.helsinki.fi>
> wrote:
>       In general, I find the draft quite good shape already, it's
>       quite easy
>       to read and follow (that is, except the TLP sections that are
>       specifically
>       mentioned below).
> 
>       Here's the list of issues I came across:
> 
>       Sect 6.1
> 
>       "RACK.pkts_sacked" lacks the empty line separator.
> 
>       Sect 7.2
> 
>         "This
>          heuristic would fail to update RACK stats if the packet is
>       spuriously
>          retransmitted because of a recent minimum RTT decrease (e.g.
>       path
>          change)."
>       This defies my logic as RTT decrease should not result in
>       spurious
>       retransmissions but help to avoid them.
> 
> 
>         "Note that
>          extensions that require additional TCP features (e.g.  DSACK)
>       would
>          work if the feature functions simply return false."
>       Please rephrase this. I suspect you meant that if some functions
>       relate
>       to TCP features that are not available, the given code still
>       works
>       as long as the functions related to such features return false
>       when
>       the TCP feature is not available but that surely isn't what the
>       sentence currently says.
> 
>       Sect 7.2 Step 5
> 
>         "For timely loss detection, the sender MAY
>          install a "reordering settling" timer set to fire at the
>       earliest
>          moment at which it is safe to conclude that some packet is
>       lost.  The
>          earliest moment is the time it takes to expire the reordering
>       window
>          of the earliest unacked packet in flight."
>       vs
>          RACK_detect_loss():
>             ...
>                 timeout = max(remaining, timeout)
>       Which way it is, "some" or "all"? The use of max() in
>       RACK_detect_lost()
>       seems to indicate that the timeout will be set according to the
>       longest
>       time, that is, the latest moment rather than earliest moments.
>       Which
>       is the intented behavior here? (I can there might be merits in
>       both
>       approaches).
> 
> 
>       In general, I didn't like the presentation order here as this
>       form looks
>       the most obvious to me:
>          now >= Packet.xmit_ts + RACK.reo_wnd + (RACK.ack_ts -
>       RACK.xmit_ts)
>       ...and I find the derivation of it just complicating things
>       rather
>       than helping but that's perhaps just my opinion.
> 
> 
>          "It is also useful when the timestamp clock granularity is
>       close to
>           or longer than the actual round trip time."
>       Given that some things based on timestamps will obviously stop
>       working
>       if the clock is slower than one tick per RTT I found this
>       fragment a bit
>       odd. RFC 7323 says "the clock SHOULD tick at least once per
>       window's worth
>       of data". I can understand though that a host might find clock
>       tick
>       granularity challenging due to extensive RTT range but on the
>       same
>       I'd not want such a "too slow clock" behavior to be endorsed as
>       if
>       it would be just normal. That is, I'm not against RACK handling
>       also this case nicely but it would be nice to somehow downplay
>       the
>       normalness tone in it with something along the lines of "This
>       check is
>       also useful if the recommended one tick per RTT cannot be
>       achieved due
>       to timestamp clock granularity."
> 
>       Is RACK_detect_lost() supposed to detect also lost rexmits? How?
>       The rexmit code itself is not in the draft so it's hard to
>       evaluate
>       how the Packet booleans are supposed to work in the first place.
>       I'm not saying the rexmit code should be included but currently
>       it
>       surely looks like any rexmitted packet just remains .lost = TRUE
>       and
>       .retransmitted = TRUE state until RTO?
> 
> 
>       Sect 7.4
> 
>       "PTO" just appears out of nowhere (and is not even written open
>       in the first instance).
> 
>          "2. ... in step (2) of the algorithm."
>       "step (2)" is forward-refing, it would naturally refer back to
>       "Step 2: Update RACK stats" so it makes very little sense as is.
> 
>       Sect 7.5
> 
>       Separator missing for PTO: and TLPRxtOut:
> 
>       Sect 7.5.1
> 
>         "3.  Upon receiving a SACK that selectively acknowledges data
>       that was
>              last sent before the segment with SEG.SEQ=SND.UNA was
>       last
>              (re)transmitted"
>       This is hard to understand, perhaps rephrase it somehow?
> 
> 
>         "But choosing PTO to be
>          exactly an SRTT is likely to generate spurious probes given
>       that
>          network delay variance and even end-system timings can easily
>       push an
>          ACK to be above an SRTT.  We chose PTO to be the next
>       integral
>          multiple of SRTT.
> 
>          Similarly, network delay variations and end-system processing
>          latencies and timer granularities can easily delay ACKs
>       beyond
>          2*SRTT, so senders SHOULD add at least 2ms to a computed PTO
>       value
>          (and MAY add more if the sending host OS timer granularity is
>       more
>          coarse than 1ms)."
>       Does this later paragraph depend on some assumptions (such as
>       very
>       low RTT?) as I don't otherwise understand why you say what you
>       say
>       in the given order. First you said x & y can push ACK over SRTT
>       so
>       we pick 2* but then you say the same things can push over 2*SRTT
>       too?
> 
>       WCDelAckT to terminology?
> 
>       Sect 7.5.2
> 
>       TLP_send_probe():
>               - Perhaps set TLPRxtOut somewhere?
>               - Add "Arm RTO" to the end?
> 
>         "In the case where
>          there is only one original outstanding segment of data (N=1),
>       the
>          same logic (trivially) applies: an ACK for a single
>       outstanding
>          segment tells the sender the N-1=0 segments preceding that
>       segment
>          were lost.  Furthermore, whether there are N>1 or N=1
>       outstanding
>          segments,"
>       If this distinction is useful to make at all, maybe just simply
>       say:
>       "Applies also in case N=1."
> 
>       Sect 7.6
> 
>         "SND.NXT at the time the episode started."
>       Isn't this TLPHighRxt, why not use it here?
> 
>       Sect 7.6.2
> 
>         "Upon sending a TLP probe that is a retransmission, the sender
>       sets
>          TLPRxtOut to true and TLPHighRxt to SND.NXT."
>       This should appear in TLP_send_probe().
> 
>       "Detecting recoveries accomplished by loss probes"
>       Did you want to start another section here?
> 
>       "Step 1: Track ACKs ..." until what point? (It's told on later
>       but
>       this question comes into mind when reading the draft in order).
> 
> 
>       Sect 7.4 - 7.6.2
> 
>       In general, this particular part of the draft seems to require
>       some
>       further editing. After reading it all through, I feel most bits
>       are
>       present but the text is very hard to follow. I think the
>       presentation
>       order needs significant work. There are random bits here and
>       there
>       that are explained only later. Also, algorithm lacks some parts
>       discussed only in the text but variable like items should
>       definitely
>       be included to the algorithm.
> 
> 
>       Sect 8.4
> 
>         "RACK is compatible with and does not interfere with the the
>       standard
>          RTO [RFC6298], RTO-restart [RFC7765], F-RTO [RFC5682] and
>       Eifel
>          algorithms [RFC3522]. ...
>          It neither changes the RTO timer calculation nor detects
>       spurious
>          timeouts."
>       For F-RTO, also send order is significant.
> 
>       Sect 8.5
> 
>       SHOULDs PRR (and also placed to normative references) that is
>       only
>       experimental and RACK aims to standards track. I feel that PRR
>       is
>       disjoint enough anyway, which I think was also your intention
>       when
>       maintaining the decoupling CC and RACK, that it's bit odd to
>       take
>       this strong stance on PRR (indepedent of the stds track vs
>       experimental thing).
> 
>         "due to congestion window validation."
>       Does this refer to CWV as in RFC? Or some other mechanism in
>       which
>       case you shouldn't use the same term here.
> 
>       I find the described scenario somewhat unconvincing becase
>       the description ends with:
>         "...
>          The ending congestion window after recovery also impacts
>          subsequent data transfer."
>       ...First there wasn't enough data to fill cwnd of 20 packets and
>       then suddenly transferring that non-existing data is impacted.
> 
>       References
> 
>       QUIC-LR is very old ref and even the filename has changed a few
>       times from draft-tsvwg-...
> 
> 
>       Nits
> 
>       "runt" made the sentence incomprehendable for me, I had to look
>       up
>       that word (and only second source was descriptive enough that I
>       think I could understand the meaning, and I'd guessed wrong
>       because
>       I thought there was some special intent here because of the
>       quotes).
> 
>       is dicussed -> is discussed
>       RACT.RTT -> RACK.rtt
>       First "TSO" -> TCP Segmentation Offload (TSO)
>       grainularity -> granularity
>       higher-sequenc -> higher-sequence
> 
> 
>       --
>        i.
> 
> 
>