Re: [tcpm] [EXTERNAL] Re: Last Call: <draft-ietf-tcpm-rack-13.txt> (The RACK-TLP loss detection algorithm for TCP) to Proposed Standard

Praveen Balasubramanian <pravb@microsoft.com> Thu, 17 December 2020 18:52 UTC

From: Praveen Balasubramanian <pravb@microsoft.com>
To: "martin.h.duke@gmail.com" <martin.h.duke@gmail.com>, "kojo@cs.helsinki.fi" <kojo@cs.helsinki.fi>
CC: "tcpm@ietf.org" <tcpm@ietf.org>, "draft-ietf-tcpm-rack@ietf.org" <draft-ietf-tcpm-rack@ietf.org>, "tuexen@fh-muenster.de" <tuexen@fh-muenster.de>, "draft-ietf-tcpm-rack.all@ietf.org" <draft-ietf-tcpm-rack.all@ietf.org>, "last-call@ietf.org" <last-call@ietf.org>, "tcpm-chairs@ietf.org" <tcpm-chairs@ietf.org>
Date: Thu, 17 Dec 2020 18:51:46 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/msHq_HuwYlpjE4v6lnpnSFpshvk>

I agree that we should have a note in this RFC about congestion control action upon detecting lost retransmission(s).

From: tcpm <tcpm-bounces@ietf.org> On Behalf Of Martin Duke
Sent: Thursday, December 17, 2020 7:30 AM
To: Markku Kojo <kojo@cs.helsinki.fi>
Cc: tcpm@ietf.org Extensions <tcpm@ietf.org>; draft-ietf-tcpm-rack@ietf.org; Michael Tuexen <tuexen@fh-muenster.de>; draft-ietf-tcpm-rack.all@ietf.org; Last Call <last-call@ietf.org>; tcpm-chairs <tcpm-chairs@ietf.org>
Subject: [EXTERNAL] Re: [tcpm] Last Call: <draft-ietf-tcpm-rack-13.txt> (The RACK-TLP loss detection algorithm for TCP) to Proposed Standard

Hi Markku,

Thanks, now I understand your objections.

Martin

On Thu, Dec 17, 2020 at 12:46 AM Markku Kojo <kojo@cs.helsinki.fi> wrote:
Hi,

On Wed, 16 Dec 2020, Martin Duke wrote:

> I spent a little longer looking at the specs more carefully, and I explained (1)
> incorrectly in my last two messages. P21..29 are not Limited Transmit packets.

Correct. Just the normal rule that allows sending new data during fast
recovery.

> However, unless I'm missing something else, 6675 is clear that the recovery period
> does not end until the cumulative ack advances, meaning that detecting the lost
> retransmission of P1 does not trigger another MD directly.

As I have said earlier, RFC 6675 does not repeat all congestion control
principles from RFC 5681. It definitely honors the CC principle that
requires treating a loss of a retransmission as a new congestion
indication requiring another MD. I believe I am obligated to know this
as a co-author of RFC 6675. ;)

RFC 6675 explicitly indicates that it follows RFC 5681 by stating in the
abstract:

" ... conforms to the spirit of the current congestion control
  specification (RFC 5681 ..."

And in the intro:

   "The algorithm specified in this document is a straightforward
    SACK-based loss recovery strategy that follows the  guidelines
    set in [RFC5681] ..."

I don't think there is anything unclear in this.

RFC 6675 and all other standard congestion controls (RFC 5681 and RFC
6582) handle a loss of a retransmission by "enforcing" RTO to detect it.
And RTO guarantees MD. RACK-TLP changes the loss detection in this case
and therefore the standard congestion control algorithms do not have
actions to handle it correctly. That is the point.

BR,

/Markku

> Thanks for this exercise! It's refreshed my memory of these details after working
> on slightly different QUIC algorithms for a long time.
>
> On Wed, Dec 16, 2020, 18:55 Martin Duke <martin.h.duke@gmail.com> wrote:
> (1) Flightsize: in RFC 6675, Section 5, Step 4.2:
>
>        (4.2) ssthresh = cwnd = (FlightSize / 2)
>
>              The congestion window (cwnd) and slow start threshold
>              (ssthresh) are reduced to half of FlightSize per [RFC5681].
>              Additionally, note that [RFC5681] requires that any
>              segments sent as part of the Limited Transmit mechanism not
>              be counted in FlightSize for the purpose of the above
>              equation.
>
> IIUC the segments P21..P29 in your example were sent because of Limited
> Transmit, and so don't count. The flightsize for the purposes of (4.2) is
> therefore 20 after both losses, and the cwnd does not go up on the second
> loss.
>
> (2)
> " Even a single shot burst every time there is significant loss
> event is not acceptable, not to mention continuous aggressiveness, and
> this is exactly what RFC 2914 and RFC 5033 explicitly address and warn
> about."
>
> "Significant loss event" is the key phrase here. The intent of TLP/PTO is to
> equalize the treatment of a small packet loss whether it happened in the
> middle of a burst or the end. Why should an isolated loss be treated
> differently based on its position in the burst? This is just a logical
> extension of fast retransmit, which also modified the RTO paradigm. The
> working group consensus is that this is a feature, not a bug; you're welcome
> to feel otherwise but I suspect you're in the rough here.
>
> Regards
> Martin
>
>
> On Wed, Dec 16, 2020 at 4:11 PM Markku Kojo <kojo@cs.helsinki.fi> wrote:
>       Hi Martin,
>
>       See inline.
>
>       On Wed, 16 Dec 2020, Martin Duke wrote:
>
>       > Hi Markku,
>       >
>       > There is a ton here, but I'll try to address the top points.
>       > Hopefully they obviate the rest.
>
>       Sorry for being verbose. I tried to be clear but you actually
>       removed my key issues/questions ;)
>
>       > 1.
>       > [Markku]
>       > "Hmm, not sure what you mean by "this is a new loss detection
>       > after acknowledgment of new data"?
>       > But anyway, RFC 5681 gives the general principle to reduce cwnd
>       > and ssthresh twice if a retransmission is lost but IMHO (and I
>       > believe many who have designed new loss recovery and CC
>       > algorithms or implemented them agree) that it is hard to get
>       > things right if only congestion control principles are
>       > available and no algorithm."
>       >
>       > [Martin]
>       > So 6675 Sec 5 is quite explicit that there is only one cwnd
>       > reduction per fast recovery episode, which ends once new data
>       > has been acknowledged.
>
>       To be more precise: fast recovery ends when the current window
>       becomes cumulatively acknowledged, that is,
>
>       (4.1) RecoveryPoint (= HighData at the beginning) becomes
>       acknowledged
>
>       I believe we agree and you meant this although new data below
>       RecoveryPoint may become cumulatively acknowledged already
>       earlier during the fast recovery. Reno loss recovery in RFC 5681
>       ends when (any) new data has been acknowledged.
>
>       > By definition, if a retransmission is lost it is because
>       > newer data has been acknowledged, so it's a new recovery
>       > episode.
>
>       Not sure where you have this definition? Newer than what are you
>       referring to?
>
>       But, yes, if a retransmission is lost with the RFC 6675
>       algorithm, it requires RTO to be detected and definitely starts a
>       new recovery episode. That is, a new recovery episode is enforced
>       by step (1.a) of NextSeg () which prevents retransmission of a
>       segment that has already been retransmitted. If RACK-TLP is used
>       for detecting loss with RFC 6675 things get different in many
>       ways, because it may detect loss of a retransmission. It would
>       pretty much require an entire redesign of the algorithm. For
>       example, calculation of pipe does not consider segments that
>       have been retransmitted more than once.
>
>       > Meanwhile, during the Fast Recovery period the incoming acks
>       > implicitly remove data from the network and therefore keep
>       > flightsize low.
>
>       Incorrect. FlightSize != pipe. Only cumulative acks remove data
>       from FlightSize and new data transmitted during fast recovery
>       inflates FlightSize. How FlightSize evolves depends on loss
>       pattern as I said. It is also possible that FlightSize is low;
>       it may err in both directions. A simple example can be used as a
>       proof for the case where cwnd increases if a loss of
>       retransmission is detected and repaired:
>
>       RFC 6675 recovery with RACK-TLP loss detection:
>       (contains some inaccuracies because it has not been defined how
>       lost rexmits are calculated into pipe)
>
>       cwnd=20; packets P1,...,P20 in flight = current window of data
>       [P1 dropped and rexmit of P1 will also be dropped]
>
>       DupAck w/SACK for P2 arrives
>       [loss of P1 detected after one RTT from original xmit of P1]
>       [cwnd=ssthresh=10]
>       P1 is rexmitted (and it logically starts next window of data)
>
>       DupAcks w/ SACK for original P3..11 arrive
>       DupAck w/ SACK for original P12 arrives
>       [cwnd-pipe = 10-9 >=1]
>       send P21
>       DupAck w/SACK for P13 arrives
>       send P22
>       ...
>       DupAck w/SACK for P20 arrives
>       send P29
>       [FlightSize=29]
>
>       (Ack for rexmit of P1 would arrive here unless it got dropped)
>
>       DupAck w/SACK for P21 arrives
>       [loss of rexmit P1 detected after one RTT from rexmit of P1]
>
>       SET cwnd = ssthresh = FlightSize/2 = 29/2 = 14.5
>
>       CWND INCREASES when it should be at most 5 after halving it
>       twice!!!
>
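A minimal sketch of the arithmetic in the example above, written in Python. It only counts segments and makes no attempt to model pipe or SACK scoreboard state; the function and variable names are hypothetical and chosen for illustration, not taken from RFC 6675 or the RACK-TLP draft.

```python
# Illustrative arithmetic only; names are hypothetical, units are segments.

def halve_flightsize(flight_size):
    """Naive reuse of 'ssthresh = cwnd = FlightSize / 2' (RFC 6675 step 4.2)."""
    return flight_size / 2


def halve_twice(initial_cwnd):
    """Two multiplicative decreases applied to the original window."""
    return initial_cwnd / 2 / 2


initial_cwnd = 20                 # P1..P20 in flight
flight_size = initial_cwnd        # nothing cumulatively acked yet

# First loss of P1 detected: cwnd = ssthresh = 20 / 2 = 10.
cwnd_after_first_loss = halve_flightsize(flight_size)

# SACKs for P12..P20 each release one new segment (P21..P29).
new_segments = 9
flight_size += new_segments       # P1 is still unacked, so FlightSize = 29

# Loss of the retransmitted P1 detected; the same equation is applied again.
cwnd_after_second_loss = halve_flightsize(flight_size)

print(cwnd_after_first_loss)      # 10.0
print(cwnd_after_second_loss)     # 14.5
print(halve_twice(initial_cwnd))  # 5.0
```

Under these assumptions the second application of the equation yields a larger window (14.5) than the first (10), whereas two halvings of the original window would give 5.
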
>       > We can continue to go around on our interpretation of these
>       > documents, but fundamentally if there is ambiguity in 5681/6675
>       > we should bis those RFCs rather than expand the scope of RACK.
>
>       As I said earlier, I am not opposing bis, though 5681bis would
>       not be needed, I think.
>
>       But let me repeat: if we publish RACK-TLP now without the
>       necessary warnings or a correct congestion control algorithm,
>       someone will try to implement RACK-TLP with RFC 6675 and it will
>       be a total mess. The behavior will be unpredictable and quite
>       likely unsafe congestion control behavior.
>
>       > 2.
>       > [Markku]
>       > " In short:
>       > When with a non-RACK-TLP implementation timer (RTO) expires:
>       > cwnd=1 MSS, and slow start is entered.
>       > When with a RACK-TLP implementation timer (PTO) expires,
>       > normal fast recovery is entered (unless implementing
>       > also PRR). So no RTO recovery as explicitly stated in Sec.
>       > 7.4.1."
>       >
>       > [Martin]
>       > There may be a misunderstanding here. PTO is not the same as
>       > RTO, and both mechanisms exist! The loss response to a PTO is to
>       > send a probe; the RTO response is as with conventional TCP. In
>       > Section 7.3:
>
>       No, I don't think I misunderstood. If you call a timeout by
>       another name, it is still a timeout. And congestion control does
>       not consider which segments to send (SND.UNA vs. probe w/ higher
>       sequence number), only how much is sent.
>
>       You ignored my major point where I decoupled congestion control
>       from loss detection and loss recovery and compared RFC 5681
>       behavior to RACK-TLP behavior in exactly the same scenario where
>       an entire flight is lost and timer expires.
>
>       Please comment why congestion control behavior is allowed to be
>       radically different in these two implementations?
>
>       RFC 5681 & RFC 6298 timeout:
>
>               RTO=SRTT+4*RTTVAR (RTO used for arming the timer)
>              1. RTO timer expires
>              2. cwnd=1 MSS; ssthresh=FlightSize/2; rexmit one segment
>              3. Ack of rexmit sent in step 2 arrives
>              4. cwnd = cwnd+1 MSS; send two segments
>              ...
>
>       RACK-TLP timeout:
>
>               PTO=min(2*SRTT,RTO) (PTO used for arming the timer)
>              1. PTO timer expires
>              2. (cwnd=1 MSS); (re)xmit one segment
>              3. Ack of (re)xmit sent in step 2 arrives
>              4. cwnd = ssthresh = FlightSize/2; send N=cwnd segments
>
>       If FlightSize is 100 segments when timer expires, congestion
>       control is the same in steps 1-3, but in step 4 the standard
>       congestion control allows transmitting 2 segments, while
>       RACK-TLP would allow blasting 50 segments.
>
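The two timelines above can be put side by side in a short Python sketch. It uses the simplified timer formulas exactly as written in this message (ignoring clock granularity and any minimum RTO), and the function names are hypothetical, introduced only for this comparison.

```python
# Hypothetical helper names; times in milliseconds, windows in segments.

def rto_ms(srtt, rttvar):
    """RTO = SRTT + 4*RTTVAR, as written in the message (simplified)."""
    return srtt + 4 * rttvar


def pto_ms(srtt, rttvar):
    """PTO = min(2*SRTT, RTO), as written in the message."""
    return min(2 * srtt, rto_ms(srtt, rttvar))


def rto_step4_allowance(cwnd_after_timeout=1):
    """Slow start after RTO: the ACK of the rexmit grows cwnd by 1 MSS."""
    return cwnd_after_timeout + 1


def pto_step4_allowance(flight_size):
    """Fast recovery after the probe is ACKed: cwnd = FlightSize / 2."""
    return flight_size // 2


flight_size = 100
print(rto_ms(100, 10), pto_ms(100, 10))  # 140 140
print(rto_step4_allowance())             # 2
print(pto_step4_allowance(flight_size))  # 50
```

With these example values the timer fires at the same time in both cases, yet step 4 permits 2 segments in one timeline and 50 in the other, which is the difference being questioned.
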
>       > After attempting to send a loss probe, regardless of whether a
>       > loss probe was sent, the sender MUST re-arm the RTO timer, not
>       > the PTO timer, if FlightSize is not zero.  This ensures RTO
>       > recovery remains the last resort if TLP fails.
>       > "
>
>       This does not prevent the above RACK-TLP behavior from getting
>       realized.
>
>       > So a pure RTO response exists in the case of persistent
>       > congestion that causes losses of probes or their ACKs.
>
>       Yes, RTO response exists BUT only after RACK-TLP at least once
>       blasts the network. It may well be that with smaller windows
>       RACK-TLP is successful during its TLP initiated overly
>       aggressive "fast recovery" and never enters RTO recovery because
>       it may detect and repair also loss of rexmits. That is, it
>       continues at too high a rate even if lost rexmits indicate that
>       congestion persists in successive windows of data. And worse, it
>       is successful because it pushes away other compatible TCP flows
>       by being too aggressive and unfair.
>
>       Even a single shot burst every time there is significant loss
>       event is not acceptable, not to mention continuous
>       aggressiveness, and this is exactly what RFC 2914 and RFC 5033
>       explicitly address and warn about.
>
>       Are we ignoring these BCPs that have IETF consensus?
>
>       And the other important question I'd like to have answered:
>
>       What is the justification to modify standard TCP congestion
>       control to use fast recovery instead of slow start for a case
>       where timeout is needed to detect the packet losses because
>       there is no feedback and ack clock is lost? RACK-TLP explicitly
>       instructs to do so in Sec. 7.4.1.
>
>       As I noted: based on what is written in the draft it does not
>       intend to change congestion control but effectively it does.
>
>       /Markku
>
>       > Martin
>       >
>       >
>       > On Wed, Dec 16, 2020 at 11:39 AM Markku Kojo
>       > <kojo@cs.helsinki.fi> wrote:
>       >       Hi Martin,
>       >
>       >       On Tue, 15 Dec 2020, Martin Duke wrote:
>       >
>       >       > Hi Markku,
>       >       >
>       >       > Thanks for the comments. The authors will incorporate
>       >       > many of your suggestions after the IESG review.
>       >       >
>       >       > There's one thing I don't understand in your comments:
>       >       >
>       >       > " That is,
>       >       > where can an implementer find advice for correct
>       >       > congestion control actions with RACK-TLP, when:
>       >       >
>       >       > (1) a loss of rexmitted segment is detected
>       >       > (2) an entire flight of data gets dropped (and detected),
>       >       >      that is, when there is no feedback available and a
>       >       >      timeout is needed to detect the loss "
>       >       >
>       >       > Section 9.3 is the discussion about CC, and is clear
>       >       > that the implementer should use either 5681 or 6937.
>       >
>       >       Just a cite nit: RFC 5681 provides basic CC concepts and
>       >       some useful CC guidelines but given that RACK-TLP MUST
>       >       implement SACK the algorithm in RFC 5681 is not that
>       >       useful and an implementer quite likely follows mainly the
>       >       algorithm in RFC 6675 (and not RFC 6937 at all if not
>       >       implementing PRR).
>       >       And RFC 6675 is not mentioned in Sec 9.3, though it is
>       >       listed in the Sec. 4 (Requirements).
>       >
>       >       > You went through the 6937 case in detail.
>       >
>       >       Yes, but without correct CC actions.
>       >
>       >       > If 5681, it's pretty clear to me that in (1) this is a
>       >       > new loss detection after acknowledgment of new data, and
>       >       > therefore requires a second halving of cwnd.
>       >
>       >       Hmm, not sure what you mean by "this is a new loss
>       >       detection after acknowledgment of new data"?
>       >       But anyway, RFC 5681 gives the general principle to
>       >       reduce cwnd and ssthresh twice if a retransmission is lost
>       >       but IMHO (and I believe many who have designed new loss
>       >       recovery and CC algorithms or implemented them agree) that
>       >       it is hard to get things right if only congestion control
>       >       principles are available and no algorithm.
>       >       That's why ALL mechanisms that we have include a quite
>       >       detailed algorithm with all necessary variables and
>       >       actions for loss recovery and/or CC purposes (and often
>       >       also pseudocode). Like this document does for loss
>       >       detection.
>       >
>       >       So the problem is that we do not have a detailed enough
>       >       algorithm or rule that tells exactly what to do when a
>       >       loss of rexmit is detected.
>       >       Even worse, the algorithms in RFC 5681 and RFC 6675 refer
>       >       to equation (4) of RFC 5681 to reduce ssthresh and cwnd
>       >       when a loss requiring a congestion control action is
>       >       detected:
>       >
>       >         (cwnd =) ssthresh = FlightSize / 2
>       >
>       >       And RFC 5681 gives a warning not to halve cwnd in the
>       >       equation but FlightSize.
>       >
>       >       That is, this equation is what an implementer intuitively
>       >       would use when reading the relevant RFCs but it gives a
>       >       wrong result for outstanding data when in fast recovery
>       >       (when the sender is in congestion avoidance and the
>       >       equation (4) is used to halve cwnd, it gives a correct
>       >       result).
>       >       More precisely, during fast recovery FlightSize is
>       >       inflated when new data is sent and reduced when segments
>       >       are cumulatively Acked.
>       >       What the outcome is depends on the loss pattern. In the
>       >       worst case, FlightSize is significantly larger than in the
>       >       beginning of the fast recovery when FlightSize was
>       >       (correctly) used to determine the halved value for cwnd
>       >       and ssthresh, i.e., equation (4) may result in
>       >       *increasing* cwnd upon detecting a loss of a rexmitted
>       >       segment, instead of further halving it.
>       >
>       >       A clever implementer might have no problem getting it
>       >       right with some thinking but I am afraid that there will
>       >       be incorrect implementations with what is currently
>       >       specified. Not all implementers have spent a significant
>       >       fraction of their career in solving TCP peculiarities.
>       >
>       >       > For (2), the RTO timer is still operative so
>       >       > the RTO recovery rules would still follow.
>       >
>       >       In short:
>       >       When with a non-RACK-TLP implementation timer (RTO)
>       >       expires: cwnd=1 MSS, and slow start is entered.
>       >       When with a RACK-TLP implementation timer (PTO) expires,
>       >       normal fast recovery is entered (unless implementing
>       >       also PRR). So no RTO recovery as explicitly stated in
>       >       Sec. 7.4.1.
>       >
>       >       This means that this document explicitly modifies
>       >       standard TCP congestion control when there are no acks
>       >       coming and the retransmission timer expires
>       >
>       >       from: RTO=SRTT+4*RTTVAR (RTO used for arming the timer)
>       >              1. RTO timer expires
>       >              2. cwnd=1 MSS; ssthresh=FlightSize/2; rexmit one
>       >                 segment
>       >              3. Ack of rexmit sent in step 2 arrives
>       >              4. cwnd = cwnd+1 MSS; send two segments
>       >              ...
>       >
>       >       to:   PTO=min(2*SRTT,RTO) (PTO used for arming the timer)
>       >              1. PTO timer expires
>       >              2. (cwnd=1 MSS); (re)xmit one segment
>       >              3. Ack of (re)xmit sent in step 2 arrives
>       >              4. cwnd = ssthresh = FlightSize/2; send N=cwnd
>       >                 segments
>       >
>       >       For example, if FlightSize is 100 segments when timer
>       >       expires, congestion control is the same in steps 1-3, but
>       >       in step 4 the current standard congestion control allows
>       >       transmitting 2 segments, while RACK-TLP would allow
>       >       blasting 50 segments.
>       >
>       >       Question is: what is the justification to modify standard
>       >       TCP congestion control to use fast recovery instead of
>       >       slow start for a case where timeout is needed to detect
>       >       loss because there is no feedback and ack clock is lost?
>       >       The draft does not give any justification. This clearly
>       >       is in conflict with items (0) and (1) in BCP 133 (RFC
>       >       5033).
>       >
>       >       Furthermore, there is no implementation nor experimental
>       >       experience evaluating this change. The implementation
>       >       with experimental experience uses PRR (RFC 6937) which is
>       >       an Experimental specification including a novel "trick"
>       >       that directs PRR fast recovery to effectively use slow
>       >       start in this case at hand.
>       >
>       >
>       >       > In other words, I am not seeing a case that requires
>       >       > new congestion control concepts except as discussed in
>       >       > 9.3.
>       >
>       >       See above. The change in standard congestion control for
>       >       (2).
>       >       The draft intends not to change congestion control but
>       >       effectively it does without any operational evidence.
>       >
>       >       What is also missing and would be very useful:
>       >
>       >       - For (1), a hint for an implementer saying that because
>       >          RACK-TLP is able to detect a loss of a rexmit unlike
>       >          any other loss detection algorithm, the sender MUST
>       >          react twice to congestion (and cite RFC 5681). And
>       >          cite a document where necessary correct actions are
>       >          described.
>       >
>       >       - For (1), advise that an implementer needs to keep track
>       >          when it detects a loss of a retransmitted segment.
>       >          Current algorithms in the draft detect a loss of
>       >          retransmitted segment exactly in the same way as loss
>       >          of any other segment. There seems to be nothing to
>       >          track when a retransmission of a retransmitted segment
>       >          takes place. Therefore, the algorithms should have
>       >          additional actions to correctly track when such a loss
>       >          is detected.
>       >
>       >       - For (1), discussion on how many times a loss of a
>       >          retransmission of the same segment may occur and be
>       >          detected. Seems that it may be possible to drop a
>       >          rexmitted segment more than once and detect it also
>       >          several times?  What are the implications?
>       >
>       >       - If previous is possible, then the algorithm possibly
>       >          also may detect a loss of a new segment that was sent
>       >          during fast recovery? This is also loss in two
>       >          successive windows of data, and cwnd MUST be lowered
>       >          twice. This discussion and necessary actions to track
>       >          it are missing, if such scenario is possible.
>       >
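One possible shape for the bookkeeping asked for in the list above, as a hedged Python sketch. Neither the draft nor RFC 6675 specifies this: the `Sender` class, the `on_loss_detected` hook, and the choice to halve the current cwnd (rather than FlightSize) are all assumptions made for illustration. The sketch counts transmissions per segment and applies one further reduction each time a loss is detected at a higher transmission count than any already reacted to. It does not reset state when recovery ends and does not cover loss of new data sent during fast recovery.

```python
from dataclasses import dataclass, field


@dataclass
class Sender:
    cwnd: float                       # congestion window, in segments
    reductions: int = 0               # multiplicative decreases so far
    reacted_level: int = 0            # highest transmission count reacted to
    xmit_count: dict = field(default_factory=dict)  # seq -> times sent

    def send(self, seq):
        """Record one (re)transmission of segment `seq`."""
        self.xmit_count[seq] = self.xmit_count.get(seq, 0) + 1

    def on_loss_detected(self, seq):
        """Hypothetical hook called when the loss detector marks `seq` lost.

        Returns True if the lost copy was a retransmission.
        """
        level = self.xmit_count[seq]
        if level > self.reacted_level:
            # Assumption: halve the current cwnd (floor of 1 segment) rather
            # than FlightSize, which may be inflated during recovery.
            self.cwnd = max(self.cwnd / 2, 1.0)
            self.reductions += 1
            self.reacted_level = level
        return level > 1


s = Sender(cwnd=20)
for seq in range(1, 21):
    s.send(seq)                  # original transmissions P1..P20
s.on_loss_detected(1)            # first loss of P1: cwnd 20 -> 10
s.on_loss_detected(2)            # another original lost: no extra reduction
s.send(1)                        # retransmit P1
s.on_loss_detected(1)            # retransmission lost: cwnd 10 -> 5
print(s.cwnd, s.reductions)      # 5.0 2
```

Keying the reaction on the transmission count is only one way to avoid reducing more than once per window of data; an implementation could equally track recovery episodes by sequence number.
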
>       >       > What am I missing?
>       >
>       >       Hope the above helps.
>       >
>       >       /Markku
>       >
>       >
>       > <snipping the rest>
>       >
>       >
>
>
>
