Re: [tcpm] Review of draft-ietf-tcpm-rack-07

Yuchung Cheng <ycheng@google.com> Mon, 09 March 2020 22:53 UTC

Return-Path: <ycheng@google.com>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 22BE33A0867 for <tcpm@ietfa.amsl.com>; Mon, 9 Mar 2020 15:53:33 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -17.599
X-Spam-Level:
X-Spam-Status: No, score=-17.599 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIMWL_WL_MED=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, HTML_MESSAGE=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=google.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id HifTSGabTQHk for <tcpm@ietfa.amsl.com>; Mon, 9 Mar 2020 15:53:30 -0700 (PDT)
Received: from mail-vs1-xe36.google.com (mail-vs1-xe36.google.com [IPv6:2607:f8b0:4864:20::e36]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 10F0B3A08CA for <tcpm@ietf.org>; Mon, 9 Mar 2020 15:53:29 -0700 (PDT)
Received: by mail-vs1-xe36.google.com with SMTP id k188so7197062vsc.8 for <tcpm@ietf.org>; Mon, 09 Mar 2020 15:53:29 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=EcR1KHqNQw+LtLTd0zRIZlITP/Qqvi0Y7tF9GIQTf2M=; b=X42U8JA8RO3frVBexT5xgx+fVPiXLnHabgrDFTi4xcQZ5o+BxcOBJxCGJqaseaaCHl EdaMRm7XcnV0yJNd61+RIciItPwWy1jeZpE1UW4xbCXi30y5v2bwNnT2SSSRIa4OfZb7 LoMUPgCpipYPE1VhkuPcuITJ46OvKshi/IMabXQpBEYREzOimv5FkSF4tcCAi6NTWof7 kuzbbE6Ok3eNFfNidhRoOfLG2aDZkmmZjpVVv3Y3KL+hGXJP0e7TaWZdDUwmggwtVwYp YPe9wBpVEILMhzXOxopgwaLsc/rJulCpEdx3ivVkiTdpaUExSugnQ6AH0wNY/9cdgvP4 Yf9A==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=EcR1KHqNQw+LtLTd0zRIZlITP/Qqvi0Y7tF9GIQTf2M=; b=CeauxIs4dBWDN96pSeqm/dZaAVbSyu/17IBJvO6Cuq+UbJuLCVVF9lHXYrz82HGbjq D+N6pJgHbzZrKhnyKNs8fVNjGalzKyVXLsRtGHzH3CK/8OOG7iBnBriJhGlIqqmoxlCv f1dctJZi6CfxH2O1+dHwZitS7CjfJv23x0eGLGHUdcQaPRWHl/2IwwpXvooO7BM+uqJV 2mhL1ihUw74ilx9LJ1BCkUMoKUHDdA1lQmONf/Kxf0erg6flf5GSHc/Vp8SK6Lv44FsI NByu1PytoaZSLCxOudMEuIFtfDTZVOrUEGU6YxcskMJzhFPqp5kk2qP2svc+qTv5rWys EXQg==
X-Gm-Message-State: ANhLgQ2vkjlnvQKtTIGHDgbqOdz0H8tGht3DZQxkqovdyuLmk6MVPTs9 c4/v/KsQOkM8erL0l6Tw45W4jKaWTFlVstfS+0npyw==
X-Google-Smtp-Source: ADFU+vs7Gcf/cCnRCQW7KYtFsQrcc2rZvbHB0eFHh21TpL9JCAQ43rIFydzYgZYnsim2CpIFPKDywF9L/SP1158hI7A=
X-Received: by 2002:a67:6e83:: with SMTP id j125mr10991795vsc.123.1583794408582; Mon, 09 Mar 2020 15:53:28 -0700 (PDT)
MIME-Version: 1.0
References: <alpine.DEB.2.20.2002141414420.25645@whs-18.cs.helsinki.fi>
In-Reply-To: <alpine.DEB.2.20.2002141414420.25645@whs-18.cs.helsinki.fi>
From: Yuchung Cheng <ycheng@google.com>
Date: Mon, 09 Mar 2020 15:52:51 -0700
Message-ID: <CAK6E8=dZN5PnCt7QFoyGJUnSoCcg0XUKxyZCbJAM_2tVN7A8wQ@mail.gmail.com>
To: Ilpo Järvinen <ilpo.jarvinen@cs.helsinki.fi>
Cc: "tcpm@ietf.org Extensions" <tcpm@ietf.org>, Neal Cardwell <ncardwell@google.com>, Nandita Dukkipati <nanditad@google.com>, Priyaranjan Jha <priyarjha@google.com>
Content-Type: multipart/alternative; boundary="0000000000001f5aa705a073df45"
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/8kxOkHK3XOC6wm6eQvDPCFEH5P4>
Subject: Re: [tcpm] Review of draft-ietf-tcpm-rack-07
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/tcpm/>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 09 Mar 2020 22:53:34 -0000

Hi Ilpo,

Thanks for a detailed review and the great suggestions. Here's all the
edits we made.
https://www.ietf.org/rfcdiff?url2=draft-ietf-tcpm-rack-08

 We defined the missing terms and addressed the nits, removed the less
critical parts like recommending PRR, and edited TLP algorithms (sections
7.4 to 7.6). I hope you find it's better now and would be happy to hear any
improvements you'd like to make.

Yuchung

On Fri, Feb 14, 2020 at 4:22 AM Ilpo Järvinen <ilpo.jarvinen@cs.helsinki.fi>
wrote:

> In general, I find the draft quite good shape already, it's quite easy
> to read and follow (that is, except the TLP sections that are specifically
> mentioned below).
>
> Here's the list of issues I came across:
>
> Sect 6.1
>
> "RACK.pkts_sacked" lacks the empty line separator.
>
> Sect 7.2
>
>   "This
>    heuristic would fail to update RACK stats if the packet is spuriously
>    retransmitted because of a recent minimum RTT decrease (e.g. path
>    change)."
> This defies my logic as RTT decrease should not result in spurious
> retransmissions but help to avoid them.
>
>
>   "Note that
>    extensions that require additional TCP features (e.g.  DSACK) would
>    work if the feature functions simply return false."
> Please rephrase this. I suspect you meant that if some functions relate
> to TCP features that are not available, the given code still works
> as long as the functions related to such features return false when
> the TCP feature is not available but that surely isn't what the
> sentence currently says.
>
> Sect 7.2 Step 5
>
>   "For timely loss detection, the sender MAY
>    install a "reordering settling" timer set to fire at the earliest
>    moment at which it is safe to conclude that some packet is lost.  The
>    earliest moment is the time it takes to expire the reordering window
>    of the earliest unacked packet in flight."
> vs
>    RACK_detect_loss():
>       ...
>           timeout = max(remaining, timeout)
> Which way it is, "some" or "all"? The use of max() in RACK_detect_lost()
> seems to indicate that the timeout will be set according to the longest
> time, that is, the latest moment rather than earliest moments. Which
> is the intented behavior here? (I can there might be merits in both
> approaches).
>
>
> In general, I didn't like the presentation order here as this form looks
> the most obvious to me:
>    now >= Packet.xmit_ts + RACK.reo_wnd + (RACK.ack_ts - RACK.xmit_ts)
> ...and I find the derivation of it just complicating things rather
> than helping but that's perhaps just my opinion.
>
>
>    "It is also useful when the timestamp clock granularity is close to
>     or longer than the actual round trip time."
> Given that some things based on timestamps will obviously stop working
> if the clock is slower than one tick per RTT I found this fragment a bit
> odd. RFC 7323 says "the clock SHOULD tick at least once per window's worth
> of data". I can understand though that a host might find clock tick
> granularity challenging due to extensive RTT range but on the same
> I'd not want such a "too slow clock" behavior to be endorsed as if
> it would be just normal. That is, I'm not against RACK handling
> also this case nicely but it would be nice to somehow downplay the
> normalness tone in it with something along the lines of "This check is
> also useful if the recommended one tick per RTT cannot be achieved due
> to timestamp clock granularity."
>
> Is RACK_detect_lost() supposed to detect also lost rexmits? How?
> The rexmit code itself is not in the draft so it's hard to evaluate
> how the Packet booleans are supposed to work in the first place.
> I'm not saying the rexmit code should be included but currently it
> surely looks like any rexmitted packet just remains .lost = TRUE and
> .retransmitted = TRUE state until RTO?
>
>
> Sect 7.4
>
> "PTO" just appears out of nowhere (and is not even written open
> in the first instance).
>
>    "2. ... in step (2) of the algorithm."
> "step (2)" is forward-refing, it would naturally refer back to
> "Step 2: Update RACK stats" so it makes very little sense as is.
>
> Sect 7.5
>
> Separator missing for PTO: and TLPRxtOut:
>
> Sect 7.5.1
>
>   "3.  Upon receiving a SACK that selectively acknowledges data that was
>        last sent before the segment with SEG.SEQ=SND.UNA was last
>        (re)transmitted"
> This is hard to understand, perhaps rephrase it somehow?
>
>
>   "But choosing PTO to be
>    exactly an SRTT is likely to generate spurious probes given that
>    network delay variance and even end-system timings can easily push an
>    ACK to be above an SRTT.  We chose PTO to be the next integral
>    multiple of SRTT.
>
>    Similarly, network delay variations and end-system processing
>    latencies and timer granularities can easily delay ACKs beyond
>    2*SRTT, so senders SHOULD add at least 2ms to a computed PTO value
>    (and MAY add more if the sending host OS timer granularity is more
>    coarse than 1ms)."
> Does this later paragraph depend on some assumptions (such as very
> low RTT?) as I don't otherwise understand why you say what you say
> in the given order. First you said x & y can push ACK over SRTT so
> we pick 2* but then you say the same things can push over 2*SRTT
> too?
>
> WCDelAckT to terminology?
>
> Sect 7.5.2
>
> TLP_send_probe():
>         - Perhaps set TLPRxtOut somewhere?
>         - Add "Arm RTO" to the end?
>
>   "In the case where
>    there is only one original outstanding segment of data (N=1), the
>    same logic (trivially) applies: an ACK for a single outstanding
>    segment tells the sender the N-1=0 segments preceding that segment
>    were lost.  Furthermore, whether there are N>1 or N=1 outstanding
>    segments,"
> If this distinction is useful to make at all, maybe just simply say:
> "Applies also in case N=1."
>
> Sect 7.6
>
>   "SND.NXT at the time the episode started."
> Isn't this TLPHighRxt, why not use it here?
>
> Sect 7.6.2
>
>   "Upon sending a TLP probe that is a retransmission, the sender sets
>    TLPRxtOut to true and TLPHighRxt to SND.NXT."
> This should appear in TLP_send_probe().
>
> "Detecting recoveries accomplished by loss probes"
> Did you want to start another section here?
>
> "Step 1: Track ACKs ..." until what point? (It's told on later but
> this question comes into mind when reading the draft in order).
>
>
> Sect 7.4 - 7.6.2
>
> In general, this particular part of the draft seems to require some
> further editing. After reading it all through, I feel most bits are
> present but the text is very hard to follow. I think the presentation
> order needs significant work. There are random bits here and there
> that are explained only later. Also, algorithm lacks some parts
> discussed only in the text but variable like items should definitely
> be included to the algorithm.
>
>
> Sect 8.4
>
>   "RACK is compatible with and does not interfere with the the standard
>    RTO [RFC6298], RTO-restart [RFC7765], F-RTO [RFC5682] and Eifel
>    algorithms [RFC3522]. ...
>    It neither changes the RTO timer calculation nor detects spurious
>    timeouts."
> For F-RTO, also send order is significant.
>
> Sect 8.5
>
> SHOULDs PRR (and also placed to normative references) that is only
> experimental and RACK aims to standards track. I feel that PRR is
> disjoint enough anyway, which I think was also your intention when
> maintaining the decoupling CC and RACK, that it's bit odd to take
> this strong stance on PRR (indepedent of the stds track vs
> experimental thing).
>
>   "due to congestion window validation."
> Does this refer to CWV as in RFC? Or some other mechanism in which
> case you shouldn't use the same term here.
>
> I find the described scenario somewhat unconvincing becase
> the description ends with:
>   "...
>    The ending congestion window after recovery also impacts
>    subsequent data transfer."
> ...First there wasn't enough data to fill cwnd of 20 packets and
> then suddenly transferring that non-existing data is impacted.
>
> References
>
> QUIC-LR is very old ref and even the filename has changed a few
> times from draft-tsvwg-...
>
>
> Nits
>
> "runt" made the sentence incomprehendable for me, I had to look up
> that word (and only second source was descriptive enough that I
> think I could understand the meaning, and I'd guessed wrong because
> I thought there was some special intent here because of the quotes).
>
> is dicussed -> is discussed
> RACT.RTT -> RACK.rtt
> First "TSO" -> TCP Segmentation Offload (TSO)
> grainularity -> granularity
> higher-sequenc -> higher-sequence
>
>
> --
>  i.
>