Re: [tcpm] A question about PTO/RTO rescheduling in RACK draft

hiren panchasara <hiren@strugglingcoder.info> Wed, 17 April 2019 18:13 UTC

Date: Wed, 17 Apr 2019 11:13:44 -0700
From: hiren panchasara <hiren@strugglingcoder.info>
To: Yi Huang <huanyi=40microsoft.com@dmarc.ietf.org>
Cc: "tcpm@ietf.org" <tcpm@ietf.org>, Matt Olson <maolson@microsoft.com>
Message-ID: <20190417181344.GM31257@strugglingcoder.info>
References: <BL0PR2101MB104347EF7FC7CD5C86DF08B7C3250@BL0PR2101MB1043.namprd21.prod.outlook.com>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg="pgp-sha512"; protocol="application/pgp-signature"; boundary="m46qSNjkc66Ye11q"
Content-Disposition: inline
In-Reply-To: <BL0PR2101MB104347EF7FC7CD5C86DF08B7C3250@BL0PR2101MB1043.namprd21.prod.outlook.com>
User-Agent: Mutt/1.5.23 (2014-03-12)
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/GGcjfueF8kc_8VY4JkrShalxRzI>
Subject: Re: [tcpm] A question about PTO/RTO rescheduling in RACK draft

On 04/17/19 at 01:48P, Yi Huang wrote:
> Hi,

Hi!
I don't see anything inherently wrong in how this works, unless I am
missing something.
> 
> I have a question about PTO/RTO rescheduling logic in RACK draft. The draft states, in Section 6.5.1, "If there was a previously scheduled PTO or RTO pending, then that pending PTO or RTO should first be cancelled, and then the new PTO should be scheduled". So if an app keeps posting data in an interval less than PTO and RTO (say one byte each time and Cwnd is large enough) and no acks are coming back due to network failures or a receiver that just does not send anything back, the app will never fire RTO since the timer is pushed out each time we send this one byte data. Is this a problem?
By growing CWND, TCP has decided that it is okay to send that much
without needing an ACK.
I think something along the lines of detecting that the connection is
app-limited, or doing newcwv-like validation (RFC 7661), may help to
keep CWND in check?

> What is the intention of pushing out the timer for each send?
Section 6.5.1 (Step 1) lists a few conditions for scheduling a PTO (i.e.
pushing out the timer), so it's not blindly doing that for each send. But
it would do that in the case you described, as the app *is* handing new
data to TCP to be sent.
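
To make the scenario concrete, here is a minimal sketch (not taken from
the draft) comparing the two re-arming policies under discussion. It
assumes no ACK ever arrives, treats the timer as a single deadline, and
ignores the Step 1 conditions entirely. The function name, the 100 ms
send interval and the 300 ms timeout are hypothetical values chosen for
illustration.

```python
def first_timer_expiry(send_times, timeout, rearm_on_every_send):
    """Return when the timer first fires, assuming no ACK ever arrives.

    rearm_on_every_send=True  models "cancel the pending timer, then
                              schedule a new one" on each send.
    rearm_on_every_send=False models "start the timer only if it is not
                              already running" (RFC 6298, rule 5.1).
    All times are in milliseconds.
    """
    deadline = None
    for t in send_times:
        if deadline is not None and t >= deadline:
            return deadline  # the timer expired before this send happened
        if deadline is None or rearm_on_every_send:
            deadline = t + timeout
    return deadline


sends = range(0, 1000, 100)  # hypothetical: one small write every 100 ms
print(first_timer_expiry(sends, 300, rearm_on_every_send=True))
print(first_timer_expiry(sends, 300, rearm_on_every_send=False))
```

With re-arming on every send, the first expiry lands one timeout after
the last write, so it keeps moving for as long as the app keeps writing.
With the start-if-not-running rule, it lands one timeout after the first
write.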

>I think in this case the sender should disconnect the connection because, without TLP, the sender would have had several RTOs and closed the connection.
I am probably not understanding something. I think you are saying TLP is
causing RTO to be pushed out unnecessarily.

One cannot schedule a PTO while a TLP is outstanding. So on the first
timeout, i.e. the PTO since we prefer that over RTO, you send a TLP out.
And if you don't get an ACK back for that, the next timeout would be an
RTO, as 6.5.1 Phase 1 condition 1 suggests.

But if there is new data to be sent, the RTO would also get pushed out,
no?
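
The PTO-then-RTO sequence described above can be sketched as follows.
This is a simplification under stated assumptions: it models only the
one rule mentioned here (no PTO while a TLP is outstanding), assumes no
ACKs arrive and no new data is sent between timeouts, and uses
hypothetical function names.

```python
def next_timer(tlp_outstanding):
    """Pick which timer to arm: a PTO is not allowed while a TLP is out."""
    return "RTO" if tlp_outstanding else "PTO"


def timeouts_without_acks(count):
    """List the kinds of the first `count` timeouts when nothing is ACKed."""
    tlp_outstanding = False
    fired = []
    for _ in range(count):
        kind = next_timer(tlp_outstanding)
        fired.append(kind)
        if kind == "PTO":
            # A PTO expiry sends a TLP probe, which is now outstanding.
            tlp_outstanding = True
    return fired


print(timeouts_without_acks(3))
```

Only the first timeout is a PTO in this model. Every later one falls
back to an RTO because the probe is never acknowledged.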
> 
> Also, RFC 6298 Section 5 (5.1) states "Every time a packet containing data is sent (including a retransmission), if the timer is not running, start it running so that it will expire after RTO seconds (for the current value of RTO).", which says TCP should schedule timer only when it is not scheduled yet and this kind of conflicts with (or is overridden by) what is in RACK/TLP draft.

I am not sure I follow. For the connection, you need a timeout
mechanism. And the RACK draft suggests keeping the PTO aligned with the
RTO, as the former is less expensive. Try PTO and if it doesn't work, do
RTO.

I believe the wording around "first cancel the pending one and then
only schedule the new one" is for clarity around implementation details.
The gist is to not keep two sets of timers running for the same timeout
mechanism.
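
One way to read "not keep two sets of timers" is a single timer slot
that holds either a PTO or an RTO, never both. The sketch below is an
illustration only. The class and its fields are hypothetical and do not
come from the draft or from any particular TCP stack.

```python
class RetransmitTimer:
    """A single timer slot shared by the PTO and the RTO."""

    def __init__(self):
        self.kind = None      # "PTO", "RTO", or None when idle
        self.deadline = None  # absolute expiry time in milliseconds

    def cancel(self):
        self.kind = None
        self.deadline = None

    def arm(self, kind, now, timeout):
        # Cancel whatever is pending first, then schedule the new timer,
        # so at most one deadline exists at any time.
        self.cancel()
        self.kind = kind
        self.deadline = now + timeout


timer = RetransmitTimer()
timer.arm("RTO", now=0, timeout=1000)
timer.arm("PTO", now=50, timeout=200)  # replaces the pending RTO
print(timer.kind, timer.deadline)
```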

Someone with more clue on the list would probably correct me. :-)
Cheers,
Hiren