Re: [tcpm] Last Call: <draft-ietf-tcpm-rack-13.txt> (The RACK-TLP loss detection algorithm for TCP) to Proposed Standard

Markku Kojo <kojo@cs.helsinki.fi> Thu, 17 December 2020 08:56 UTC

Date: Thu, 17 Dec 2020 10:56:38 +0200
From: Markku Kojo <kojo@cs.helsinki.fi>
To: Martin Duke <martin.h.duke@gmail.com>
cc: Neal Cardwell <ncardwell@google.com>, Yuchung Cheng <ycheng@google.com>, Last Call <last-call@ietf.org>, "tcpm@ietf.org Extensions" <tcpm@ietf.org>, draft-ietf-tcpm-rack@ietf.org, Michael Tuexen <tuexen@fh-muenster.de>, draft-ietf-tcpm-rack.all@ietf.org, tcpm-chairs <tcpm-chairs@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/SjLJyiC6OLUi37VQ0DCuAOcKA0U>
Subject: Re: [tcpm] Last Call: <draft-ietf-tcpm-rack-13.txt> (The RACK-TLP loss detection algorithm for TCP) to Proposed Standard

Hi,

Just one clarification inline.

On Thu, 17 Dec 2020, Markku Kojo wrote:

> Hi Martin,
>
> On Wed, 16 Dec 2020, Martin Duke wrote:
>
>> (1) Flightsize: in RFC 6675. Section 5, Step 4.2:
>>
>>        (4.2) ssthresh = cwnd = (FlightSize / 2)
>>
>>              The congestion window (cwnd) and slow start threshold
>>              (ssthresh) are reduced to half of FlightSize per [RFC5681].
>>              Additionally, note that [RFC5681] requires that any
>>              segments sent as part of the Limited Transmit mechanism not
>>              be counted in FlightSize for the purpose of the above
>>              equation.
>> 
>> IIUC the segments P21..P29 in your example were sent because of Limited 
>> Transmit,
>> and so don't count. The flightsize for the purposes of (4.2) is therefore 
>> 20 after
>> both losses, and the cwnd does not go up on the second loss.
>
> Incorrect. (And you corrected this in your later reply)
>
>> (2)
>> " Even a single shot burst every time there is significant loss
>> event is not acceptable, not to mention continuous aggressiveness, and
>> this is exactly what RFC 2914 and RFC 5033 explicitly address and warn
>> about."
>> 
>> "Significant loss event" is the key phrase here. The intent of TLP/PTO is 
>> to
>> equalize the treatment of a small packet loss whether it happened in the 
>> middle of
>> a burst or the end. Why should an isolated loss be treated differently 
>> based on
>> its position in the burst? This is just a logical extension of fast 
>> retransmit,
>> which also modified the RTO paradigm. The working group consensus is that 
>> this is
>> a feature, not a bug; you're welcome to feel otherwise but I suspect you're 
>> in the
>> rough here.
>
> An isolated loss at the end of a flight, and repairing it, is handled as a 
> separate case in the draft. It is easy to have a less conservative congestion 
> control response for it, and I would be fine with that.
>
> The problem we are discussing is the "significant loss event" where even a 
> huge number of packets may be lost, potentially indicating severe congestion. 
> The draft accidentally handles it the same way as a small packet loss, and that 
> is wrong.
>
> Why the draft accidentally handles it wrong: because RACK-TLP has never been 
> implemented with the standard congestion control (RFC 6675), only with PRR, and 
> the experimental evidence presented to the working group was based on the 
> implementation with PRR. One may say that the WG was *accidentally* 
> misguided.
>
> And as I have stated many times: PRR modifies RFC 6675 congestion control and 
> includes the necessary actions to handle a significant loss event properly. But 
> the current standards-track congestion control does not.

To clarify: the current standards-track congestion control does not 
handle such a significant loss event properly *if* it is detected by TLP 
and fast recovery is initiated as specified in this draft.

/Markku

> And this draft 
> explicitly modifies standards track congestion control but conceals this 
> significant modification in Sec. 9.3 when discussing the interaction with 
> congestion control.
>
> Best regards,
>
> /Markku
>
>> Regards
>> Martin
>> 
>> 
>> On Wed, Dec 16, 2020 at 4:11 PM Markku Kojo <kojo@cs.helsinki.fi> wrote:
>>       Hi Martin,
>>
>>       See inline.
>>
>>       On Wed, 16 Dec 2020, Martin Duke wrote:
>>
>>       > Hi Markku,
>>       >
>>       > There is a ton here, but I'll try to address the top points.
>>       Hopefully
>>       > they obviate the rest.
>>
>>       Sorry for being verbose. I tried to be clear but you actually removed
>>       my
>>       key issues/questions ;)
>>
>>       > 1.
>>       > [Markku]
>>       > "Hmm, not sure what you mean by "this is a new loss detection after
>>       > acknowledgment of new data"?
>>       > But anyway, RFC 5681 gives the general principle to reduce cwnd and
>>       > ssthresh twice if a retransmission is lost but IMHO (and I believe
>>       many
>>       > who have designed new loss recovery and CC algorithms or 
>> implemented
>>       > them
>>       > agree) that it is hard to get things right if only congestion
>>       control
>>       > principles are available and no algorithm."
>>       >
>>       > [Martin]
>>       > So 6675 Sec 5 is quite explicit that there is only one cwnd
>>       reduction
>>       > per fast recovery episode, which ends once new data has been
>>       > acknowledged.
>>
>>       To be more precise: fast recovery ends when the current window 
>> becomes
>>       cumulatively acknowledged, that is,
>>
>>       (4.1) RecoveryPoint (= HighData at the beginning) becomes 
>> acknowledged
>>
>>       I believe we agree and you meant this, although new data below
>>       RecoveryPoint may become cumulatively acknowledged earlier
>>       during the fast recovery. Reno loss recovery in RFC 5681 ends when
>>       (any) new data has been acknowledged.
>>
>>       > By definition, if a retransmission is lost it is because
>>       > newer data has been acknowledged, so it's a new recovery episode.
>>
>>       Not sure where you have this definition? Newer than what are you
>>       referring to?
>>
>>       But, yes, if a retransmission is lost with the RFC 6675 algorithm,
>>       an RTO is required to detect it and it definitely starts a new recovery
>>       episode. That is, a new recovery episode is enforced by step (1.a) of
>>       NextSeg(), which prevents retransmission of a segment that has already
>>       been retransmitted. If RACK-TLP is used for detecting loss with RFC
>>       6675, things get different in many ways, because it may detect loss of
>>       a retransmission. It would pretty much require an entire redesign
>>       of the algorithm. For example, the calculation of pipe does not consider
>>       segments that have been retransmitted more than once.
>>
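>>       (A rough illustration, not text from the draft or RFC 6675: a
>>       Python-style sketch of rule (1) of NextSeg(); the function and
>>       variable names are mine.)
>>
>>           def next_seg_rule_1(s2, high_rxt, highest_sacked, is_lost):
>>               # RFC 6675 NextSeg(), rule (1): choose the smallest unSACKed
>>               # sequence number S2 such that
>>               #   (1.a) S2 > HighRxt  -- not yet retransmitted in this recovery
>>               #   (1.b) S2 < the highest SACKed sequence number
>>               #   (1.c) IsLost(S2) is true
>>               return s2 > high_rxt and s2 < highest_sacked and is_lost(s2)
>>
>>           # A segment that was already retransmitted sits at or below HighRxt,
>>           # so rule (1.a) never selects it again, even if it is marked lost:
>>           print(next_seg_rule_1(1000, 1000, 30000, lambda s: True))  # False
>>
>>           # Hence with plain RFC 6675 only the RTO can repair a lost rexmit.
>>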
>>       > Meanwhile, during the Fast Recovery period the incoming acks
>>       implicitly
>>       > remove data from the network and therefore keep flightsize low.
>>
>>       Incorrect. FlightSize != pipe. Only cumulative acks remove data from
>>       FlightSize, and new data transmitted during fast recovery inflates
>>       FlightSize. How FlightSize evolves depends on the loss pattern, as I said.
>>       It is also possible that FlightSize ends up low; it may err in both
>>       directions. A simple example can be used as a proof for the case where
>>       cwnd increases when a loss of a retransmission is detected and repaired:
>>
>>       RFC 6675 recovery with RACK-TLP loss detection:
>>       (contains some inaccuracies because it has not been defined how
>>       lost rexmits are calculated into pipe)
>>
>>       cwnd=20; packets P1,...,P20 in flight = current window of data
>>       [P1 dropped and rexmit of P1 will also be dropped]
>>
>>       DupAck w/SACK for P2 arrives
>>       [loss of P1 detected after one RTT from original xmit of P1]
>>       [cwnd=ssthresh=10]
>>       P1 is rexmitted (and it logically starts next window of data)
>>
>>       DupAcks w/ SACK for original P3..11 arrive
>>       DupAck w/ SACK for original P12 arrives
>>       [cwnd-pipe = 10-9 >=1]
>>       send P21
>>       DupAck w/SACK for P13 arrives
>>       send P22
>>       ...
>>       DupAck w/SACK for P20 arrives
>>       send P29
>>       [FlightSize=29]
>>
>>       (Ack for rexmit of P1 would arrive here unless it got dropped)
>>
>>       DupAck w/SACK for P21 arrives
>>       [loss of rexmit P1 detected after one RTT from rexmit of P1]
>>
>>       SET cwnd = ssthresh = FlightSize/2 = 29/2 = 14.5
>>
>>       CWND INCREASES when it should be at most 5 after halving it twice!!!
>>
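>>       (To spell out the arithmetic above, a minimal Python-style sketch;
>>       the helper name is mine, the equation is RFC 5681's (4) with both
>>       cwnd and ssthresh set from it, as RFC 6675 step (4.2) does.)
>>
>>           def halve_on_loss(flightsize):
>>               # ssthresh = cwnd = FlightSize / 2
>>               return flightsize / 2.0
>>
>>           cwnd = 20                    # P1..P20 in flight
>>           cwnd = halve_on_loss(20)     # loss of P1 detected: cwnd = 10
>>
>>           # During fast recovery P21..P29 are sent and nothing is
>>           # cumulatively acked, so FlightSize grows from 20 to 29.
>>           cwnd = halve_on_loss(29)     # loss of rexmitted P1 detected
>>           print(cwnd)                  # 14.5 -- larger than 10, not ~5
>>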
>>       > We can continue to go around on our interpretation of these
>>       documents,
>>       > but fundamentally if there is ambiguity in 5681/6675 we should bis
>>       > those RFCs rather than expand the scope of RACK.
>>
>>       As I said earlier, I am not opposing a bis, though a 5681bis would not
>>       be needed, I think.
>>
>>       But let me repeat: if we publish RACK-TLP now without the necessary
>>       warnings or a correct congestion control algorithm, someone will try to
>>       implement RACK-TLP with RFC 6675 and it will be a total mess. The
>>       behavior will be unpredictable and quite likely unsafe from a
>>       congestion control point of view.
>>
>>       > 2.
>>       > [Markku]
>>       > " In short:
>>       > When with a non-RACK-TLP implementation timer (RTO) expires: cwnd=1
>>       > MSS,
>>       > and slow start is entered.
>>       > When with a RACK-TLP implementation timer (PTO) expires,
>>       > normal fast recovery is entered (unless implementing
>>       > also PRR). So no RTO recovery as explicitly stated in Sec. 7.4.1."
>>       >
>>       > [Martin]
>>       > There may be a misunderstanding here. PTO is not the same as RTO,
>>       and
>>       > both mechanisms exist! The loss response to a PTO is to send a
>>       probe;
>>       > the RTO response is as with conventional TCP. In Section 7.3:
>>
>>       No, I don't think I misunderstood. If you call a timeout by
>>       another name, it is still a timeout. And congestion control does not
>>       consider which segments to send (SND.UNA vs. a probe w/ a higher sequence
>>       number), only how much is sent.
>>
>>       You ignored my major point where I decoupled congestion control from
>>       loss detection and loss recovery and compared RFC 5681 behavior to
>>       RACK-TLP behavior in exactly the same scenario, where an entire flight
>>       is lost and the timer expires.
>>
>>       Please comment on why the congestion control behavior is allowed to be
>>       radically different in these two implementations.
>>
>>       RFC 5681 & RFC 6298 timeout:
>>
>>               RTO=SRTT+4*RTTVAR (RTO used for arming the timer)
>>              1. RTO timer expires
>>              2. cwnd=1 MSS; ssthresh=FlightSize/2; rexmit one segment
>>              3. Ack of rexmit sent in step 2 arrives
>>              4. cwnd = cwnd+1 MSS; send two segments
>>              ...
>>
>>       RACK-TLP timeout:
>>
>>               PTO=min(2*SRTT,RTO) (PTO used for arming the timer)
>>              1. PTO timer expires
>>              2. (cwnd=1 MSS); (re)xmit one segment
>>              3. Ack of (re)xmit sent in step 2 arrives
>>              4. cwnd = ssthresh = FlightSize/2; send N=cwnd segments
>>
>>       If FlightSize is 100 segments when timer expires, congestion control
>>       is
>>       the same in steps 1-3, but in step 4 the standard congestion control
>>       allows transmitting 2 segments, while RACK-TLP would allow
>>       blasting 50 segments.
>>
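>>       (A minimal Python-style sketch of step 4 in the two cases, assuming
>>       FlightSize is 100 segments at timer expiry; the variable names are
>>       mine, not from any RFC or the draft.)
>>
>>           FLIGHTSIZE = 100     # segments outstanding when the timer fires
>>
>>           # RFC 5681 + RFC 6298: RTO recovery, slow start
>>           cwnd = 1                 # step 2: cwnd = 1 MSS, rexmit one segment
>>           cwnd = cwnd + 1          # step 4: ack arrives, may send 2 segments
>>
>>           # RACK-TLP as specified: PTO probe, then fast recovery
>>           cwnd = 1                 # step 2: one probe (re)xmitted
>>           cwnd = FLIGHTSIZE // 2   # step 4: cwnd = ssthresh = 50 segments
>>
>>           # i.e. 2 segments vs. a burst of ~50 segments into the same
>>           # congested network.
>>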
>>       > After attempting to send a loss probe, regardless of whether a loss
>>       >    probe was sent, the sender MUST re-arm the RTO timer, not the 
>> PTO
>>       >    timer, if FlightSize is not zero.  This ensures RTO recovery
>>       remains
>>       >    the last resort if TLP fails.
>>       > "
>>
>>       This does not prevent the above RACK-TLP behavior from getting
>>       realized.
>>
>>       > So a pure RTO response exists in the case of persistent congestion
>>       that
>>       > causes losses of probes or their ACKs.
>>
>>       Yes, an RTO response exists BUT only after RACK-TLP at least once blasts
>>       the network. It may well be that with smaller windows RACK-TLP is
>>       successful during its TLP-initiated, overly aggressive "fast recovery"
>>       and never enters RTO recovery because it may also detect and repair
>>       losses of rexmits. That is, it continues at too high a rate even if lost
>>       rexmits indicate that congestion persists in successive windows of data.
>>       And worse, it is successful because it pushes away other compatible TCP
>>       flows by being too aggressive and unfair.
>>
>>       Even a single-shot burst every time there is a significant loss
>>       event is not acceptable, not to mention continuous aggressiveness, and
>>       this is exactly what RFC 2914 and RFC 5033 explicitly address and warn
>>       about.
>>
>>       Are we ignoring these BCPs that have IETF consensus?
>>
>>       And the other important question I'd like to have an answer to:
>>
>>       What is the justification to modify standard TCP congestion control to
>>       use fast recovery instead of slow start for a case where a timeout is
>>       needed to detect the packet losses because there is no feedback and the
>>       ack clock is lost? RACK-TLP explicitly instructs to do so in Sec. 7.4.1.
>>
>>       As I noted: based on what is written in the draft it does not intend
>>       to
>>       change congestion control but effectively it does.
>>
>>       /Markku
>>
>>       > Martin
>>       >
>>       >
>>       > On Wed, Dec 16, 2020 at 11:39 AM Markku Kojo <kojo@cs.helsinki.fi>
>>       > wrote:
>>       >       Hi Martin,
>>       >
>>       >       On Tue, 15 Dec 2020, Martin Duke wrote:
>>       >
>>       >       > Hi Markku,
>>       >       >
>>       >       > Thanks for the comments. The authors will incorporate
>>       >       many of your
>>       >       > suggestions after the IESG review.
>>       >       >
>>       >       > There's one thing I don't understand in your comments:
>>       >       >
>>       >       > " That is,
>>       >       > where can an implementer find advice for correct
>>       >       congestion control
>>       >       > actions with RACK-TLP, when:
>>       >       >
>>       >       > (1) a loss of rexmitted segment is detected
>>       >       > (2) an entire flight of data gets dropped (and detected),
>>       >       >      that is, when there is no feedback available and a
>>       >       timeout
>>       >       >      is needed to detect the loss "
>>       >       >
>>       >       > Section 9.3 is the discussion about CC, and is clear that
>>       >       the
>>       >       > implementer should use either 5681 or 6937.
>>       >
>>       >       Just a cite nit: RFC 5681 provides basic CC concepts and
>>       >       some useful CC
>>       >       guidelines but given that RACK-TLP MUST implement SACK the
>>       >       algorithm in
>>       >       RFC 5681 is not that useful and an implementer quite likely
>>       >       follows
>>       >       mainly the algorithm in RFC 6675 (and not RFC 6937 at all
>>       >       if not
>>       >       implementing PRR).
>>       >       And RFC 6675 is not mentioned in Sec. 9.3, though it is
>>       >       listed in Sec. 4 (Requirements).
>>       >
>>       >       > You went through the 6937 case in detail.
>>       >
>>       >       Yes, but without correct CC actions.
>>       >
>>       >       > If 5681, it's pretty clear to me that in (1) this is a
>>       >       new loss
>>       >       > detection after acknowledgment of new data, and therefore
>>       >       requires a
>>       >       > second halving of cwnd.
>>       >
>>       >       Hmm, not sure what you mean by "this is a new loss
>>       >       detection after
>>       >       acknowledgment of new data"?
>>       >       But anyway, RFC 5681 gives the general principle to reduce
>>       >       cwnd and
>>       >       ssthresh twice if a retransmission is lost but IMHO (and I
>>       >       believe many
>>       >       who have designed new loss recovery and CC algorithms or
>>       >       implemented them
>>       >       agree) that it is hard to get things right if only
>>       >       congestion control
>>       >       principles are available and no algorithm.
>>       >       That's why ALL mechanisms that we have include a quite
>>       >       detailed algorithm
>>       >       with all necessary variables and actions for loss recovery
>>       >       and/or CC
>>       >       purposes (and often also pseudocode). Like this document
>>       >       does for loss
>>       >       detection.
>>       >
>>       >       So the problem is that we do not have a detailed enough
>>       >       algorithm or
>>       >       rule that tells exactly what to do when a loss of rexmit is
>>       >       detected.
>>       >       Even worse, the algorithms in RFC 5681 and RFC 6675 refer
>>       >       to
>>       >       equation (4) of RFC 5681 to reduce ssthresh and cwnd when a
>>       >       loss
>>       >       requiring a congestion control action is detected:
>>       >
>>       >         (cwnd =) ssthresh = FlightSize / 2
>>       >
>>       >       And RFC 5681 gives a warning that it is FlightSize, not cwnd,
>>       >       that is halved in the equation.
>>       >
>>       >       That is, this equation is what an implementer intuitively
>>       >       would use
>>       >       when reading the relevant RFCs but it gives a wrong result
>>       >       for
>>       >       outstanding data when in fast recovery (when the sender is
>>       >       in
>>       >       congestion avoidance and the equation (4) is used to halve
>>       >       cwnd, it
>>       >       gives a correct result).
>>       >       More precisely, during fast recovery FlightSize is inflated
>>       >       when new
>>       >       data is sent and reduced when segments are cumulatively
>>       >       Acked.
>>       >       What the outcome is depends on the loss pattern. In the
>>       >       worst case,
>>       >       FlightSize is significantly larger than in the beginning of
>>       >       the fast
>>       >       recovery when FlightSize was (correctly) used to determine
>>       >       the halved
>>       >       value for cwnd and ssthresh, i.e., equation (4) may result
>>       >       in
>>       >       *increasing* cwnd upon detecting a loss of a rexmitted
>>       >       segment, instead
>>       >       of further halving it.
>>       >
>>       >       A clever implementer might have no problem getting it right
>>       >       with some thinking, but I am afraid that there will be
>>       >       incorrect implementations with what is currently specified.
>>       >       Not all implementers have spent a significant fraction of
>>       >       their career solving TCP peculiarities.
>>       >
>>       >       > For (2), the RTO timer is still operative so
>>       >       > the RTO recovery rules would still follow.
>>       >
>>       >       In short:
>>       >       When with a non-RACK-TLP implementation timer (RTO)
>>       >       expires: cwnd=1 MSS,
>>       >       and slow start is entered.
>>       >       When with a RACK-TLP implementation timer (PTO) expires,
>>       >       normal fast recovery is entered (unless implementing
>>       >       also PRR). So no RTO recovery as explicitly stated in Sec.
>>       >       7.4.1.
>>       >
>>       >       This means that this document explicitly modifies standard
>>       >       TCP congestion
>>       >       control when there are no acks coming and the
>>       >       retransmission timer
>>       >       expires
>>       >
>>       >       from: RTO=SRTT+4*RTTVAR (RTO used for arming the timer)
>>       >              1. RTO timer expires
>>       >              2. cwnd=1 MSS; ssthresh=FlightSize/2; rexmit one
>>       >       segment
>>       >              3. Ack of rexmit sent in step 2 arrives
>>       >              4. cwnd = cwnd+1 MSS; send two segments
>>       >              ...
>>       >
>>       >       to:   PTO=min(2*SRTT,RTO) (PTO used for arming the timer)
>>       >              1. PTO timer expires
>>       >              2. (cwnd=1 MSS); (re)xmit one segment
>>       >              3. Ack of (re)xmit sent in step 2 arrives
>>       >              4. cwnd = ssthresh = FlightSize/2; send N=cwnd
>>       >       segments
>>       >
>>       >       For example, if FlightSize is 100 segments when timer
>>       >       expires,
>>       >       congestion control is the same in steps 1-3, but in step 4
>>       >       the
>>       >       current standard congestion control allows transmitting 2
>>       >       segments,
>>       >       while RACK-TLP would allow blasting 50 segments.
>>       >
>>       >       Question is: what is the justification to modify standard
>>       >       TCP
>>       >       congestion control to use fast recovery instead of slow
>>       >       start for a
>>       >       case where timeout is needed to detect loss because there
>>       >       is no
>>       >       feedback and ack clock is lost? The draft does not give any
>>       >       justification. This clearly is in conflict with items (0)
>>       >       and (1)
>>       >       in BCP 133 (RFC 5033).
>>       >
>>       >       Furthermore, there is no implementation nor experimental
>>       >       experience
>>       >       evaluating this change. The implementation with
>>       >       experimental experience
>>       >       uses PRR (RFC 6937) which is an Experimental specification
>>       >       including a
>>       >       novel "trick" that directs PRR fast recovery to effectively
>>       >       use slow
>>       >       start in the case at hand.
>>       >
>>       >
>>       >       > In other words, I am not seeing a case that requires new
>>       >       congestion
>>       >       > control concepts except as discussed in 9.3.
>>       >
>>       >       See above. The change in standard congestion control for
>>       >       (2).
>>       >       The draft intends not to change congestion control but
>>       >       effectively it
>>       >       does without any operational evidence.
>>       >
>>       >       What's also missing and would be very useful:
>>       >
>>       >       - For (1), a hint for an implementer saying that because
>>       >       RACK-TLP is
>>       >          able to detect a loss of a rexmit unlike any other loss
>>       >       detection
>>       >          algorithm, the sender MUST react twice to congestion
>>       >       (and cite
>>       >          RFC 5681). And cite a document where necessary correct
>>       >       actions
>>       >          are described.
>>       >
>>       >       - For (1), advise that an implementer needs to keep track of
>>       >          when it detects a loss of a retransmitted segment (see the
>>       >          sketch after this list). Current
>>       >       algorithms
>>       >          in the draft detect a loss of retransmitted segment
>>       >       exactly in
>>       >          the same way as loss of any other segment. There seems
>>       >       to be
>>       >          nothing to track when a retransmission of a
>>       >       retransmitted segment
>>       >          takes place. Therefore, the algorithms should have
>>       >       additional
>>       >          actions to correctly track when such a loss is detected.
>>       >
>>       >       - For (1), discussion on how many times a loss of a
>>       >       retransmission
>>       >          of the same segment may occur and be detected. Seems
>>       >       that it
>>       >          may be possible to drop a rexmitted segment more than
>>       >       once and
>>       >          detect it also several times?  What are the
>>       >       implications?
>>       >
>>       >       - If previous is possible, then the algorithm possibly also
>>       >          may detect a loss of a new segment that was sent during
>>       >       fast
>>       >          recovery? This is also loss in two successive windows of
>>       >       data,
>>       >          and cwnd MUST be lowered twice. This discussion and
>>       >       necessary
>>       >          actions to track it are missing, if such a scenario is
>>       >       possible.
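>>       >
>>       >       (A very rough Python-style sketch of the bookkeeping the
>>       >       first two items above ask for; all names here are
>>       >       hypothetical and not taken from the draft.)
>>       >
>>       >           class DummyCC:
>>       >               # Stand-in for the congestion control module.
>>       >               def reduce_cwnd_again(self):
>>       >                   print("second cwnd/ssthresh reduction")
>>       >
>>       >           class Segment:
>>       >               def __init__(self, seq):
>>       >                   self.seq = seq
>>       >                   self.retransmitted = False  # set on first rexmit
>>       >
>>       >           def on_rack_loss_detected(seg, cc):
>>       >               # Called when RACK declares seg lost (original or rexmit).
>>       >               if seg.retransmitted:
>>       >                   # The lost copy was itself a retransmission: losses
>>       >                   # in two successive windows of data, so congestion
>>       >                   # must be reacted to a second time.
>>       >                   cc.reduce_cwnd_again()
>>       >               seg.retransmitted = True        # the rexmit we now send
>>       >
>>       >           seg = Segment(1)
>>       >           cc = DummyCC()
>>       >           on_rack_loss_detected(seg, cc)  # original lost: no extra cut
>>       >           on_rack_loss_detected(seg, cc)  # rexmit lost: second reduction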
>>       >
>>       >       > What am I missing?
>>       >
>>       >       Hope the above helps.
>>       >
>>       >       /Markku
>>       >
>>       >
>>       > <snipping the rest>
>>       >
>>       >
>> 
>> 
>