Re: [core] Review of draft-ietf-core-fasor-01

Carles Gomez Montenegro <carles.gomez@upc.edu> Sun, 01 October 2023 11:51 UTC

References: <5e1904ad67f56adb0001ee0d25488349.squirrel@webmail.entel.upc.edu> <alpine.DEB.2.21.2303161250360.4394@hp8x-60.cs.helsinki.fi>
In-Reply-To: <alpine.DEB.2.21.2303161250360.4394@hp8x-60.cs.helsinki.fi>
Reply-To: carles.gomez@upc.edu
From: Carles Gomez Montenegro <carles.gomez@upc.edu>
Date: Sun, 01 Oct 2023 13:51:20 +0200
Message-ID: <CAAUO2xwzXJcCS590ihp3v16sfFnTD0ZGq9qP-S6CFb+QG0kb2w@mail.gmail.com>
To: Markku Kojo <kojo@cs.helsinki.fi>
Cc: core@ietf.org, draft-ietf-core-fasor.authors@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/core/asz9NgDTEaXduHdq2rz4zErkpws>
Subject: Re: [core] Review of draft-ietf-core-fasor-01

Dear Markku,

First of all, my apologies for the long delay in this response.

Thank you for taking my comments into consideration.

Please find below my inline responses:

On Thu, 16 Mar 2023 at 11:53, Markku Kojo <kojo@cs.helsinki.fi> wrote:

> Dear Carles,
>
> thank you very much for your thorough review. It is much appreciated and
> useful, although this reply comes so very late as the draft has been
> dormant for so long for various reasons. Apologies for that.
>
>
I am happy that my comments have been helpful, and no worries regarding the
delay (I have also been quite blocked recently).


> We believe we have addressed most of the suggestions in -02. In the few
> cases where we possibly did not (fully) address the suggestions we hope
> our answers and clarifications make sense and we may agree on how to
> address those cases (or if they need not be addressed).
>
>
In my opinion, my comments have been well addressed, perhaps with only a
few pending/additional minor points, as shown below (absence of a comment
or a response means that I agree or no further action is requested from my
side).


> Please see inline some comments and answers.
>
> On Sun, 7 Feb 2021, Carles Gomez Montenegro wrote:
>
> > Dear FASOR authors,
> >
> > First of all, my apologies for providing you with this review later than I
> > expected. I hope it is not too late, considering that the cutoff date is
> > quickly approaching!
> >
> > Thank you for the document. Please find below a number of comments and
> > suggestions:
> >
> > - Section 1, first paragraph: "Both default RTO mechanism for CoAP
> > [RFC7252] and CoCoA [I-D.ietf-core-cocoa] have issues in dealing with
> > unnecessary retransmissions". CoCoA is still work in progress, and there
> > are ideas on the table to improve it. Perhaps some different phrasing (for
> > the sentence I highlighted here) might better support future updates of
> > CoCoA.
>
> Fair enough. Would it be better now when changed to:
>
>   "Both default RTO mechanism for CoAP [RFC7252] and the latest
>    version of CoCoA [I-D.ietf-core-cocoa] have issues in dealing
>    with unnecessary retransmissions"
>
>
In order to make the text compatible with possible future versions of
CoCoA, I'd suggest something along the lines of:

NEW:
       "Both default RTO mechanism for CoAP [RFC7252] and the current
        version (i.e., as of the writing, -03) of CoCoA
        [I-D.ietf-core-cocoa] have issues in dealing
        with unnecessary retransmissions"

I also think that it would be good to add some sentence at the end of the
same paragraph such as:

NEW:
        "Note that, hereinafter, any mention of CoCoA refers to its version
         -03".


> > - Section 1, last paragraph: s/unnecessary retransmissions that a sender
> > may have sent due to an inaccurate RTT estimate/unnecessary
> > retransmissions due to an inaccurate RTT estimate
>
> Done.
>
> > - The RTO acronym is defined (preceded by its expanded form) twice in the
> > document. Also, there are instances of both "RTO" and retransmission
> > timeout throughout the document. Once "RTO" has been introduced, just the
> > acronym should probably be used.
>
> Good catch.
> Hope it's better now.
>
> > - Page 3: "However, there are couple of exceptions when the RTT estimation
> > is not available". Well, the second bullet after that sentence corresponds
> > rather to "is not updated" than "is not available".
>
> I think the text is correct as it says *RTT estimation*, i.e., the
> process of making the estimate, is not available. However, it is now
> modified to read:
>   "... when the RTT estimate is not available or it cannot be updated".
>
> Hope that works better.
>
> > - Page 3: "- At the beginning of a flow where an initial RTO of 2 seconds
> > is used." As a reader, I am not sure if this is mentioned as a negative
> > aspect of CoCoA. Perhaps not, but it might be good to emphasize that
> > something like this is unavoidable (and, as I understand, it is the same
> > with FASOR), whereas the second bullet is where a problem may actually
> > happen.
>
> True, unnecessary rexmits are unavoidable with an initial RTO if the
> actual RTT is high enough. This holds for FASOR as well. But the problem
> is not only the inability to avoid unnecessary rexmits and to take a new
> RTT sample. With CoCoA, if the RTT is high enough, the unnecessary
> rexmits are repeated for all subsequent messages because CoCoA has the
> same problem as default CoAP: it does not apply a larger, backed-off RTO
> from the previous msg exchange to the next msg exchange (which would break
> the chain of unnecessary rexmits).
>
> The new text tries to clarify this.
>
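
To illustrate the chain of unnecessary retransmissions described above, here
is a minimal Python sketch. It is a simplified model and not the behaviour of
any particular implementation: it assumes a constant RTT, no losses, binary
exponential backoff without the random factor, and that no RTT sample updates
the RTO. The function name and the `carry_backoff` switch are hypothetical.

```python
def spurious_retransmissions(rtt, initial_rto, exchanges, carry_backoff):
    """Count retransmissions sent although no message was lost.

    Assumptions (illustrative only): constant RTT, no loss, the timer
    doubles on every expiry, and the RTO is never refreshed by a sample.
    """
    rto = initial_rto
    total = 0
    for _ in range(exchanges):
        wait = rto          # timer armed for the original transmission
        deadline = wait     # time (since first transmission) of next expiry
        while deadline < rtt:
            total += 1      # timer fired before the ACK arrived
            wait *= 2       # binary exponential backoff
            deadline += wait
        # Either reuse the backed-off timer for the next exchange,
        # or fall back to the initial RTO.
        rto = wait if carry_backoff else initial_rto
    return total


if __name__ == "__main__":
    # RTT of 3 s against an initial RTO of 2 s, five consecutive exchanges.
    print(spurious_retransmissions(3.0, 2.0, 5, carry_backoff=False))
    print(spurious_retransmissions(3.0, 2.0, 5, carry_backoff=True))
```

Under these assumptions, resetting to the initial RTO yields one unnecessary
retransmission per exchange (5 in total), whereas carrying the backed-off RTO
to the next exchange stops after the first one.
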
> > - Page 3, last paragraph: "CoCoA being unable to take RTT samples at all".
> > In my opinion, "at all" is probably not the right text here, and the
> > phrasing used should be relative to the context of the problem mentioned.
>
> Agreed. This text has now been removed and replaced with the new text
> I mentioned above. Hope it works better.
>
> > - Page 3, last paragraph: similar to my first comment, some adaptation
> > might be needed for this paragraph to better support possible future
> > updates of CoCoA.
>
> Not sure what this (Page 3, last paragraph) refers to, but when we now
> mention the latest CoCoA and cite version -03, is it better (or sufficient
> to address this)?
>
> > - Section 4.1, first line: s/an CoAP/a CoAP
>
> Done.
>
> > - Section 4.1, third line: "normal RTO or FastRTO". I was wondering if
> > handling two terms for the same concept is really useful. But perhaps it
> > may make sense depending on the context in each instance of either term.
> > So feel free to proceed as you prefer.
>
> The new text is now using FastRTO in the context of FASOR.
>
> > - Section 4.1, first paragraph: s/backup mechanisms/backup mechanism
>
> Done.
>
> > - Section 4.1, last paragraph: "FastRTO is updated only with unambiguous
> > RTT samples.  Therefore, it closely tracks the actual RTT of the network
> > and can quickly trigger a retransmission when the network state is not
> > dubious."  Perhaps there may be lossy intervals during which the "actual
> > RTT" might vary (regardless of losses), e.g. due to a change of the
> > physical layer bit rate, but then FASOR would not be able to collect
> > "actual RTT" samples during such interval... It might be interesting to
> > consider this situation as well. If no modification to the FASOR algorithm
> > is deemed necessary, then perhaps at least some discussion on this might
> > be useful.
>
> Sure, there may be periods during which no valid samples can be taken and
> the FastRTO remains unchanged. If we used ambiguous RTT samples, there
> would always be the danger of computing a (badly) incorrect RTO.
>
> The most important thing is to catch up with an RTT increase. We think
> that Slow RTO helps there as it makes it much more likely that a new RTT
> sample is obtained, if there is no loss. In case the msgs get lost, it
> should also be noted that the RFC 6298 algo is notably fast in catching up
> with increased RTT once an RTT sample is finally obtained.
>
> I also think we already have this partly covered in Sec 4.4 and 4.5. Of
> course, without Retransmission Count (or corresponding token) there is
> not much to do without sacrificing the principle of using unambiguous RTT
> samples?
>

Agreed.
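
For readers less familiar with the principle of using only unambiguous RTT
samples, a minimal Python sketch follows. It assumes that a sample is
ambiguous whenever the message was transmitted more than once and no extra
information (such as a retransmission count) identifies which copy was
acknowledged. The function name and parameters are hypothetical.

```python
from typing import Optional


def unambiguous_rtt_sample(first_sent: float,
                           ack_received: float,
                           transmissions: int) -> Optional[float]:
    """Return an RTT sample only if it is unambiguous.

    Assumption: with more than one transmission, the acknowledgement cannot
    be matched to a specific copy, so the sample is discarded.
    """
    if transmissions > 1:
        return None  # ambiguous: leave the fast estimator unchanged
    return ack_received - first_sent


if __name__ == "__main__":
    print(unambiguous_rtt_sample(10.0, 10.5, transmissions=1))
    print(unambiguous_rtt_sample(10.0, 14.0, transmissions=2))
```

The second call returns no sample, which corresponds to the periods
mentioned above during which FastRTO remains unchanged.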


>
> > - End of section 4.1. What is "K"? (I wasn't able to find a definition of
> > "K" earlier in the document.)
>
> K is from RFC 6298 like RTTVAR and SRTT that we also use. It
> is now clarified that K = 4 (as in RFC 6298).
>
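
As a reminder of where K, SRTT and RTTVAR come from, here is a minimal Python
sketch of the RFC 6298 computation, with K = 4, alpha = 1/8 and beta = 1/4 as
given in that RFC. The class name is hypothetical, and the sketch
deliberately omits the 1-second minimum RTO that RFC 6298 recommends for
TCP; how FASOR applies the result is not shown here.

```python
K = 4          # RTTVAR multiplier (RFC 6298)
ALPHA = 1 / 8  # SRTT smoothing gain (RFC 6298)
BETA = 1 / 4   # RTTVAR smoothing gain (RFC 6298)


class Rfc6298Estimator:
    """Illustrative SRTT/RTTVAR estimator following RFC 6298."""

    def __init__(self, granularity: float = 0.0):
        self.srtt = None
        self.rttvar = None
        self.granularity = granularity  # clock granularity G

    def update(self, sample: float) -> float:
        """Feed one RTT sample (seconds) and return the resulting RTO."""
        if self.srtt is None:
            # First measurement: SRTT = R, RTTVAR = R / 2
            self.srtt = sample
            self.rttvar = sample / 2
        else:
            # RTTVAR is updated first, using the previous SRTT
            self.rttvar = (1 - BETA) * self.rttvar + BETA * abs(self.srtt - sample)
            self.srtt = (1 - ALPHA) * self.srtt + ALPHA * sample
        return self.srtt + max(self.granularity, K * self.rttvar)


if __name__ == "__main__":
    est = Rfc6298Estimator()
    for rtt in (1.0, 1.0, 4.0):
        print(rtt, est.update(rtt))
```

Feeding a sample that is much larger than SRTT raises RTTVAR sharply, which
is why the computed RTO catches up quickly once a sample is obtained.
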
> > - The document assumes that responses are "acknowledgments". However,
> > there may be responses that do not correspond to acknowledgments at the
> > messages sublayer of CoAP, that still provide RTT samples. It may be good
> > to clarify this, since otherwise perhaps a fraction of the full set of RTT
> > samples may remain unused.
>
> Good catch. You are right. A part of the text was written as if
> considering the case of request - piggybacked response exchanges only,
> whereas it should have considered CON - Ack exchanges. Tried to reflect
> this in the modified text.
>
> > - The document also assumes in several instances that the message that
> > triggers some response/ACK/packet from the other endpoint is a "request".
> > Please note that there may be messages that do not carry requests which
> > may anyway be sent as CON messages, from which an RTT sample can be
> > obtained.
>
> Good point.
> Removed unnecessary/misleading "requests". Hope it works better now.
>

IMO, it does, thanks.


>
> > - Section 4.2, first paragraph. A factor of 1.5 (by default) is mentioned
> > at the end of the paragraph. Is there any hint on how this factor should
> > be set?
>
> We are proposing to use the default. Do you think we should
> recommend it explicitly? We briefly explain the tradeoff; having a
> much smaller factor would maybe not allow enough room for delay increase
> while a larger factor would possibly result in unnecessarily conservative
> behaviour. The factor of 1.5 worked well in our experiments.
>
>
I'm wondering if there might be scenarios different from yours (those
where you performed your experiments) where 1.5 might not work well enough.


> This is intended to be Experimental, so later experimentation may provide
> more information on whether the current advice is still valid or whether
> the factor should be tuned up or down.
>
>
Agreed. Perhaps some explicit text on this point might help trigger further
experimentation.
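
To make the tradeoff concrete, here is a minimal Python sketch. It assumes
that the factor multiplies the time measured from the original transmission
of a message to the arrival of its acknowledgement; that assumption, the
function name and the sample values are mine and are not taken from this
thread.

```python
def slow_rto(first_sent: float, ack_received: float, factor: float = 1.5) -> float:
    """Illustrative Slow RTO: elapsed time since the original transmission,
    scaled by a configurable factor (1.5 by default).

    Assumption: the measurement spans all retransmissions of the message.
    """
    return factor * (ack_received - first_sent)


if __name__ == "__main__":
    # An exchange that took 4 s in total, evaluated with different factors.
    for factor in (1.1, 1.5, 3.0):
        print(factor, slow_rto(0.0, 4.0, factor))
```

A factor close to 1 leaves little headroom if the delay keeps growing, while
a large factor makes the sender wait much longer than the last measured
exchange before retransmitting.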


> > - Section 4.2, first paragraph, last sentence. Perhaps adding some forward
> > reference might help the reader?
>
> Done.
>
> > - Section 4.3.1, FAST_SLOW_FAST: "If the request message needs to be
> > retransmitted, continue by using Slow RTO for the first retransmission in
> > order to respond to congestion".  This text implies that it is *known*
> > that the retransmission is due to congestion, but it could also be due to
> > bad link quality.
>
> Sure, but when discussing congestion, it is one of the fundamental
> principles that a packet loss should be considered as a congestion signal
> in the first place. When it is unknown whether the loss is due to
> congestion or some non-congestion event, a sender should react as if it
> was congestion. However, in the FAST_SLOW_FAST state, FASOR deliberately
> allows for a one-shot probe with a quicker FastRTO to recover from a
> possible non-congestion related loss before using (much longer) Slow RTO
> to respond to congestion.
>
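
The timer order discussed for the FAST_SLOW_FAST state can be sketched in
Python as follows. Only the first two timers follow from the text quoted
above (FastRTO for the original transmission, Slow RTO for the first
retransmission). The schedule for later retransmissions (FastRTO, doubling
each time) and the `max_retransmit` default are assumptions made for
illustration, and the function name is hypothetical.

```python
def fast_slow_fast_timers(fast_rto: float, slow_rto: float,
                          max_retransmit: int = 4) -> list:
    """Timers armed after the original transmission and each retransmission.

    Index 0 is the timer for the original transmission; index i is the timer
    armed after the i-th retransmission.
    """
    timers = [fast_rto]              # one-shot probe with the quick FastRTO
    if max_retransmit >= 1:
        timers.append(slow_rto)      # first retransmission: respond to congestion
    backoff = fast_rto               # ASSUMPTION: later timers restart at FastRTO
    for _ in range(2, max_retransmit + 1):
        timers.append(backoff)
        backoff *= 2                 # ASSUMPTION: binary exponential backoff
    return timers


if __name__ == "__main__":
    print(fast_slow_fast_timers(fast_rto=1.0, slow_rto=6.0))
```

The point of the ordering is that a loss unrelated to congestion can be
repaired after one short FastRTO, while a second timeout makes the sender
wait for the much longer Slow RTO.
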
> > - Section 4.3.1, SLOW_FAST: s/transmisssion/transmission
>
> Done.
>
> > - Section 4.3.1: s/if further retransmission are/if further
> > retransmissions are
>
> Done.
>
> > - Section 4.3.1, last paragraph: "if RTO expires also for that request
> > message". When mentioning "that request message", which one does the text
> > refer to? Perhaps, consider rephrasing.
>
> Tried to clarify. Hope it is clearer now.
>

It is!


>
> > I hope this feedback can be useful!
>
> Very helpful, indeed. Thank you!
>
>
Once again, thanks for your updates and for the discussion.

Cheers,

Carles


>
> Cheers,
>
> /Markku
>
> > Cheers,
> >
> > Carles
> >
> >
>