From nobody Fri Dec 18 16:41:41 2020
Return-Path: <kojo@cs.helsinki.fi>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Date: Sat, 19 Dec 2020 02:41:31 +0200 (EET)
From: Markku Kojo <kojo@cs.helsinki.fi>
To: Neal Cardwell <ncardwell@google.com>
cc: Martin Duke <martin.h.duke@gmail.com>,
 Yuchung Cheng <ycheng@google.com>, Last Call <last-call@ietf.org>,
 "tcpm@ietf.org Extensions" <tcpm@ietf.org>, draft-ietf-tcpm-rack@ietf.org,
 Michael Tuexen <tuexen@fh-muenster.de>,
 draft-ietf-tcpm-rack.all@ietf.org, tcpm-chairs <tcpm-chairs@ietf.org>
In-Reply-To: <CADVnQy=CvMUsueUysEgg4n7Ba6yWPJa44_GuQZ46CJEQ94sjpw@mail.gmail.com>
Message-ID: <alpine.DEB.2.21.2012182359160.27827@hp8x-60.cs.helsinki.fi>
References: <160557473030.20071.3820294165818082636@ietfa.amsl.com>
 <alpine.DEB.2.21.2012030145440.5180@hp8x-60.cs.helsinki.fi>
 <CAK6E8=diHBZJC5Ei=wKt=j=om1aDcFU8==kSYEtp=KZ4g__+Xg@mail.gmail.com>
 <alpine.DEB.2.21.2012071227390.5180@hp8x-60.cs.helsinki.fi>
 <CAK6E8=fNd3ToWEoCYHwgPG7QUvCXw3kV2rwH=hqmhibQmQNseg@mail.gmail.com>
 <alpine.DEB.2.21.2012081502530.5180@hp8x-60.cs.helsinki.fi>
 <CADVnQykrm1ORm7N+8L0iEyqtJ2rQ1dr1xg+EmYcWQE9nmDX_mA@mail.gmail.com>
 <alpine.DEB.2.21.2012141505360.5844@hp8x-60.cs.helsinki.fi>
 <CAM4esxT9hNqX4Zo+9tMRu9MNEfwuUwebaBFcitj1pCZx_NkqHA@mail.gmail.com>
 <alpine.DEB.2.21.2012160256380.5844@hp8x-60.cs.helsinki.fi>
 <CADVnQy=CvMUsueUysEgg4n7Ba6yWPJa44_GuQZ46CJEQ94sjpw@mail.gmail.com>
User-Agent: Alpine 2.21 (DEB 202 2017-01-01)
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=_script-6294-1608338492-0001-2"
Content-ID: <alpine.DEB.2.21.2012190241230.27827@hp8x-60.cs.helsinki.fi>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/sVrv4sqly31ZN9vFRWSH7oPoDCo>
Subject: Re: [tcpm] Last Call:
 <draft-ietf-tcpm-rack-13.txt> (The RACK-TLP loss detection algorithm for TCP) to
 Proposed Standard

--=_script-6294-1608338492-0001-2
Content-Type: text/plain; charset="iso-8859-15"; format=flowed
Content-Transfer-Encoding: quoted-printable
Content-ID: <alpine.DEB.2.21.2012190031581.27827@hp8x-60.cs.helsinki.fi>

Hi Neal,

On Fri, 18 Dec 2020, Neal Cardwell wrote:

>
>
> On Wed, Dec 16, 2020 at 2:39 PM Markku Kojo <kojo@cs.helsinki.fi> wrote:
>       > For (2), the RTO timer is still operative so
>       > the RTO recovery rules would still follow.
>
>       In short:
>       When, with a non-RACK-TLP implementation, the timer (RTO) expires:
>       cwnd=1 MSS, and slow start is entered.
>       When, with a RACK-TLP implementation, the timer (PTO) expires,
>       normal fast recovery is entered (unless also implementing
>       PRR). So no RTO recovery, as explicitly stated in Sec. 7.4.1.
>
>       This means that this document explicitly modifies standard TCP congestion
>       control when there are no acks coming and the retransmission timer
>       expires
>
>       from: RTO=SRTT+4*RTTVAR (RTO used for arming the timer)
>
> It's also worth mentioning this aspect of [RFC6298]:

Sure.

>    (2.4) Whenever RTO is computed, if it is less than 1 second, then the
>          RTO SHOULD be rounded up to 1 second.
>
>               1. RTO timer expires
>               2. cwnd=1 MSS; ssthresh=FlightSize/2; rexmit one segment
>               3. Ack of rexmit sent in step 2 arrives
>               4. cwnd = cwnd+1 MSS; send two segments
>               ...
>
>       to:   PTO=min(2*SRTT,RTO) (PTO used for arming the timer)
>               1. PTO timer expires
>               2. (cwnd=1 MSS); (re)xmit one segment
>
>
>=20
> It may be worthwhile to point out here that the RACK-TLP draft does not specify setting cwnd
> to 1 at this point, and the Linux TCP implementation from our team does not do this. The

Yes, that's why I put it in parentheses. In my view the RACK-TLP
draft implicitly limits cwnd to one segment by allowing just one TLP
probe segment.

> rationale is that at this point there is no solid evidence that anything has been lost, and
> setting cwnd to 1 at this point would make the algorithm more timid than the preceding
> approaches, for no good reason.

Sure, no need to set cwnd at this point.

A good reason could be: no feedback, ACK clock lost? But, of course,
it is too early, even though after the arrival of the ACK the sender may well
modify cwnd again, as it now does if it decides that a segment other than
the probe segment was lost.

>               3. Ack of (re)xmit sent in step 2 arrives
>               4. cwnd = ssthresh = FlightSize/2; send N=cwnd segments
>
>
> That step (4) assumes a particular congestion control implementation that is different than
> what we would recommend.

OK. I just used the Standards Track formula, as the RACK-TLP draft does in
its examples, and because the RACK-TLP draft states that it does not modify
current congestion control.
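
To put numbers on the two step-4 rules quoted above, here is a minimal
Python sketch. It assumes windows are counted in whole segments; the function
names are hypothetical, and the lower bound of two segments on ssthresh
follows RFC 5681, equation (4). It only illustrates the arithmetic and is not
taken from the draft or from any implementation.

```python
def cwnd_after_first_ack_rto(cwnd_segments: int = 1) -> int:
    """Slow start after an RTO: cwnd (1 MSS) grows by one MSS on the first ACK."""
    return cwnd_segments + 1


def cwnd_after_first_ack_fast_recovery(flight_size_segments: int) -> int:
    """Fast recovery per the quoted step 4: cwnd = ssthresh = FlightSize/2.

    The floor of 2 segments is the RFC 5681 lower bound on ssthresh.
    """
    return max(flight_size_segments // 2, 2)


if __name__ == "__main__":
    flight_size = 100  # segments in flight when the timer expires
    print("RTO path, segments allowed:", cwnd_after_first_ack_rto())
    print("PTO + fast recovery path, segments allowed:",
          cwnd_after_first_ack_fast_recovery(flight_size))
```

With a FlightSize of 100 segments this gives 2 segments for the RTO path and
50 segments for the fast recovery path.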

>       For example, if FlightSize is 100 segments when timer expires,
>       congestion control is the same in steps 1-3, but in step 4 the
>       current standard congestion control allows transmitting 2 segments,
>       while RACK-TLP would allow blasting 50 segments.
>
>       Question is: what is the justification to modify standard TCP
>       congestion control to use fast recovery instead of slow start for a
>       case where timeout is needed to detect loss because there is no
>       feedback and ack clock is lost? The draft does not give any
>       justification. This clearly is in conflict with items (0) and (1)
>       in BCP 133 (RFC 5033).
>
>
> The draft pointedly does not modify standard TCP congestion control.
>
> RACK-TLP does not specify using fast recovery instead of slow start for a case where timeout
> is needed to detect loss because there is no feedback and the ACK clock is lost. Rather,
> RACK-TLP only triggers fast recovery if there *is* ACK feedback providing an ACK clock and
> strong evidence of a packet loss.

So here our views diverge. In the above steps I decoupled congestion
control from what segments are sent (rexmit and xmit are mentioned there
just as comments to check what is going on; they can be freely removed).
Congestion control governs how many segments can be sent.

In my view, when there is no feedback RACK-TLP uses a timeout (PTO) to help
make progress. Without the timeout it cannot make progress. Just like
an RFC 5681 sender, it cannot make progress until the timeout expires.
So this should be taken as the criterion to (effectively) enter slow start,
once loss is detected.

Or, at least, I don't see why a different timeout value would
change the congestion control.
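
For reference, the two timeout values being compared can be sketched as
follows. This is a simplified illustration: it uses the formulas as written
in this thread, ignores clock granularity and the delayed-ACK allowance that
the draft adds when only one segment is in flight, and the function names
are hypothetical.

```python
def rto_seconds(srtt: float, rttvar: float, min_rto: float = 1.0) -> float:
    """RTO = SRTT + 4*RTTVAR, rounded up to min_rto (RFC 6298, rule 2.4)."""
    return max(srtt + 4.0 * rttvar, min_rto)


def pto_seconds(srtt: float, rttvar: float, min_rto: float = 1.0) -> float:
    """PTO = min(2*SRTT, RTO), the simplified form used in this thread."""
    return min(2.0 * srtt, rto_seconds(srtt, rttvar, min_rto))


if __name__ == "__main__":
    # Hypothetical path: 100 ms smoothed RTT, 25 ms RTT variation.
    srtt, rttvar = 0.100, 0.025
    print("RTO (s):", rto_seconds(srtt, rttvar))
    print("PTO (s):", pto_seconds(srtt, rttvar))
```

On a short-RTT path the 1-second floor dominates the RTO, so the PTO fires
much earlier; on a path where 2*SRTT exceeds the RTO, both timers have the
same value.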

When the timeout expires, RACK-TLP sends one segment (just like an RFC 5681
sender when the RTO expires). The only difference is that an RFC 5681 sender
selects a different segment (the first unacknowledged segment) to retransmit
"blindly" in order to get feedback and start the ACK clock. RACK-TLP sends
"blindly" the last segment from the retransmission queue (or a new
segment). Selecting a different segment for transmission upon timeout
does not change anything, in my view. In both cases it is a "blind"
selection; the sender does not know what was lost. And in both cases the
ACK for this one segment provides feedback about what potentially has
been lost. There the only difference is that the segment that RACK-TLP
selects to transmit is a better choice when the SACK option is in use, because
it provides more information.

If there is some difference in that the ACK for RACK-TLP provides
stronger evidence of packet loss (and of what was lost), then it should
also be OK to modify the current standard TCP congestion control such that
upon RTO timeout the sender does not select the first unacknowledged
segment for blind retransmission but the last segment in the
retransmission queue (or maybe a new segment). With SACK this would
provide exactly the same information as the TLP probe does. And, upon arrival
of the first ACK, RTO recovery would use rules similar to those in RACK-TLP to
better decide whether it was a spurious RTO or a loss, and move from slow
start to fast recovery and set cwnd=ssthresh.

I really don't see how this change in the "blindly" retransmitted first segment
in slow start would allow modifying congestion control for RTO recovery.

> The main aspect of triggering loss recovery that is new is the approach of allowing a sender
> to transmit one additional "probe" segment in flight after 2*SRTT. Once this is accepted, the
> rest of the recovery process essentially follows from principles already generally accepted
> in the IETF TCP community.

Could you please see above and explain (or provide a pointer to an RFC)
what those "principles already generally accepted in the IETF TCP
community" are? That would help me understand your point.

> Put another way, it seems to me that if one is to object to TLP-triggered fast recovery, then
> the objection must be mounted specifically against the permission granted to the sender to
> transmit one additional "probe" segment in flight after 2*SRTT. Once that permission is
> granted, there is nothing really new about TLP-triggered fast recovery.

I am sorry, but I still fail to see what preceding evidence
makes this not new. A pointer could help.

In my view the probe is nothing to object to, as long as it is not
considered a cwnd increase in the later cwnd & ssthresh calculation
(a minor detail, but someone might later suggest first two, then four, and so
on, probe segments with the justification that it is just one more than
earlier).

>       Furthermore, there is no implementation nor experimental experience
>       evaluating this change. The implementation with experimental experience
>       uses PRR (RFC 6937) which is an Experimental specification including a
>       novel "trick" that directs PRR fast recovery to effectively use slow
>       start in this case at hand.
>
>
> What do you think of Yuchung's latest suggestion for new text in "9.3.  Interaction with
> congestion control" suggested by Yuchung Thursday afternoon (Dec 17), which explicitly
> recommends PRR? As mentioned earlier in this thread, there is considerable implementation and
> experimental experience with RACK-TLP plus PRR since the Linux TCP stack has been using
> RACK-TLP with PRR as the default loss recovery algorithm since Linux v4.18 in August 2018.

As I have already indicated, in my view PRR does not have the problem we
are discussing here, because PRR-SSRB makes fast recovery behave like
slow start. And PRR-CRB is even more conservative. So it would be a safe
choice for this problem, unlike the current RFC 6675 algorithm.
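
The PRR behaviour referred to here can be sketched from the per-ACK
computation in RFC 6937. The following is a rough Python sketch under stated
assumptions: quantities are counted in segments (MSS = 1), the function name
and the final clamp to zero are illustrative choices, and the example assumes
that the rest of the flight has been marked lost, so that pipe is 0 when the
ACK of the probe arrives.

```python
import math


def prr_sndcnt(pipe: int, ssthresh: int, recover_fs: int,
               prr_delivered: int, prr_out: int, delivered_data: int,
               conservative: bool = False, mss: int = 1) -> int:
    """Segments PRR allows sending on one ACK during recovery (RFC 6937)."""
    if pipe > ssthresh:
        # Proportional rate reduction part.
        sndcnt = math.ceil(prr_delivered * ssthresh / recover_fs) - prr_out
    else:
        if conservative:
            # PRR-CRB: never send more than has been delivered.
            limit = prr_delivered - prr_out
        else:
            # PRR-SSRB: allow one extra segment per ACK, like slow start.
            limit = max(prr_delivered - prr_out, delivered_data) + mss
        sndcnt = min(ssthresh - pipe, limit)
    return max(sndcnt, 0)  # clamp is an assumption of this sketch


if __name__ == "__main__":
    # FlightSize 100 at the start of recovery, ssthresh 50, probe ACKed.
    common = dict(pipe=0, ssthresh=50, recover_fs=100,
                  prr_delivered=1, prr_out=0, delivered_data=1)
    print("PRR-SSRB:", prr_sndcnt(**common))
    print("PRR-CRB: ", prr_sndcnt(**common, conservative=True))
```

Under these assumptions PRR-SSRB allows 2 segments on the first ACK, the
same as slow start after an RTO, and PRR-CRB allows 1.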

In other words, I only object to allowing the use of RACK-TLP with the
unmodified RFC 6675 congestion control algorithm, because it does not have
a safeguard like PRR. This does not mean that the RACK-TLP document would
need to include the necessary modifications to the RFC 6675 algorithm.

I don't know process-wise, but possibly PRR cannot be used as a normative
requirement because it is currently Experimental? Not quite sure, though.

Best regards,

/Markku

> The exact commit is:
>
>   b38a51fec1c1 tcp: disable RFC6675 loss detection
>
> best,
> neal
--=_script-6294-1608338492-0001-2--