Re: [tcpm] Questions about TLP

Yuchung Cheng <ycheng@google.com> Thu, 09 May 2019 01:29 UTC

From: Yuchung Cheng <ycheng@google.com>
Date: Wed, 08 May 2019 18:28:43 -0700
Message-ID: <CAK6E8=ej=XSX5Fazzru79SqcrsR_g=5xE9qzELdzZ9TTVahRDQ@mail.gmail.com>
To: Yi Huang <huanyi@microsoft.com>
Cc: Yuchung Cheng <ycheng=40google.com@dmarc.ietf.org>, Matt Olson <maolson@microsoft.com>, "tcpm@ietf.org" <tcpm@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/79V6P3WE3auhF572eNoZzGX5YY0>

*From: *Yi Huang <huanyi@microsoft.com>
*Date: *Wed, May 8, 2019 at 5:19 PM
*To: *Yuchung Cheng
*Cc: *Yuchung Cheng, Matt Olson, tcpm@ietf.org

> Hi Yuchung,
>
>
>
> I have one more question regarding TLP and the proposed rev. During
> a post-TLP RTO, if the application writes something before this RTO fires,
> the sender will still try to schedule a PTO, which should cancel the already
> running RTO. When this new PTO fires, since TLPRxtOut is true due to the
> previous TLP sent, the sender will not send another TLP but will rearm the
> RTO timer. Is this true?
>
yes
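
The behavior confirmed here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Sender` class, its fields, and its method names are hypothetical, they model nothing except which timer is armed, and they are not taken from the draft's pseudocode.

```python
from dataclasses import dataclass, field


@dataclass
class Sender:
    """Hypothetical, simplified sender state (all names are illustrative)."""
    tlp_rxt_out: bool = False   # mirrors TLPRxtOut: a loss probe is outstanding
    timer: str = "none"         # which timer is armed: "PTO", "RTO" or "none"
    log: list = field(default_factory=list)

    def on_app_write(self):
        # New data is sent; a PTO is scheduled and replaces any running timer.
        self.timer = "PTO"
        self.log.append("arm PTO")

    def on_pto_fire(self):
        # TLPRxtOut is checked only here, right before a probe would be sent.
        if not self.tlp_rxt_out:
            self.tlp_rxt_out = True
            self.log.append("TLP")
        # Whether or not a probe went out, the sender falls back to the RTO timer.
        self.timer = "RTO"
        self.log.append("arm RTO")


s = Sender()
s.on_app_write()
s.on_pto_fire()   # first PTO: probe is sent, then RTO is armed
s.on_app_write()
s.on_pto_fire()   # second PTO: TLPRxtOut is set, so only the RTO is rearmed
print(s.log)      # ['arm PTO', 'TLP', 'arm RTO', 'arm PTO', 'arm RTO']
```

No ACK is modeled, so `tlp_rxt_out` is never cleared, which matches the no-ACK scenario in the question.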


> If so, and an app always writes data in such a pattern while no ACKs are
> coming back due to a network failure, will the RTO be postponed indefinitely
> until the window limit is reached? My main concern is that we could have
> detected the network failure much earlier. It seems the current algorithm
> might delay tearing down the connection, depending on app send behavior.
>
>
>
> Here is a sequence to illustrate the problem, assuming no ACKs are
> received throughout because of the network failure.
>
> App writes->arm PTO->PTO fires->TLP->*arm RTO*->app writes->arm PTO->PTO
> fires->*arm RTO*->app writes->arm PTO->PTO fires->*arm RTO*…
>
Good question. In the second "arm PTO" of your example, the last two lines
of the TLP_timeout() pseudocode may set the PTO timer to expire at the
original RTO (TCP_RTO_expire()). The timer expiration won't be extended
forever but rather set to expire at max(PTO, original_RTO) of the first app
write.
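
A small simulation may help show why the expiration is bounded. It is a sketch, not the draft's algorithm: it assumes that TCP_RTO_expire() keeps returning the expiry time of the RTO armed at the first unacknowledged write, and that the probe timer is clamped to that time, following the draft sentence quoted later in this thread ("if the time at which an RTO would fire ... is sooner than the computed time for the PTO, then a probe is scheduled to be sent at that earlier time"). The function names and the millisecond values are made up for illustration.

```python
def schedule_pto(now, pto, rto_expire):
    """Return the absolute time at which the probe timer should fire.

    The timer is clamped to rto_expire, the (assumed fixed) time at which
    the original RTO would fire.
    """
    return min(now + pto, rto_expire)


def simulate(write_times, pto, rto):
    """Replay app writes with no ACKs; return the timer expiry after each write."""
    rto_expire = write_times[0] + rto  # original RTO, armed at the first write
    return [schedule_pto(t, pto, rto_expire) for t in write_times]


# The app writes every 150 ms, which is less than the 200 ms PTO, and no ACK arrives.
expiries = simulate(list(range(0, 1000, 150)), pto=200, rto=1000)
print(expiries)  # [200, 350, 500, 650, 800, 950, 1000]
```

Each write pushes the timer out, but under these assumptions it never moves past the original RTO time (1000 ms here).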


>
> Also, the proposed rev says “*Also checking TLPRxtOut prior to sending
> the loss probe* is important to avoid TLP loops if an application writes
> periodically at an interval less than PTO.”. If an app writes periodically
> at an interval less than PTO, the sender will just periodically postpone
> PTO until window limit is reached since each time the app writes data, a
> new PTO will be scheduled and cancel the existing one. I don’t quite
> understand how this app behavior is related to repeated back-to-back TLPs
> problem caused by arming PTO after sending TLP. Btw, “TLP loops” looks like
> a brand new term used only in this paragraph.
>
>
>
> Thanks,
>
>
>
> Yi
>
>
>
>
>
> *From:* Yuchung Cheng <ycheng=40google.com@dmarc.ietf.org>
> *Sent:* Tuesday, May 7, 2019 9:52 AM
> *To:* Yi Huang <huanyi@microsoft.com>
> *Cc:* Yuchung Cheng <ycheng@google.com>; Matt Olson <maolson@microsoft.com>;
> tcpm@ietf.org
> *Subject:* Re: [tcpm] Questions about TLP
>
>
>
> Thanks for the review as well!
>
>
>
> *From: *Yi Huang <huanyi=40microsoft.com@dmarc.ietf.org>
> *Date: *Thu, May 2, 2019, 11:47 AM
> *To: *Yuchung Cheng, Matt Olson
> *Cc: *tcpm@ietf.org
>
> If TLPRxtOut is checked in TLP_send_probe(), a new PTO can cancel an
> already running RTO timer armed by sending TLP previously. If
> TLP_send_probe() decides not to send the probe for the new PTO, what should
> the sender do next? Rearm RTO or treat it as RTO timeout? The draft states
> we must arm an RTO after transmitting TLP but in this case, no probe will
> be sent.
>
> Also, on page 17, the draft says "This is important to avoid TLP loops if
> an application writes periodically at an interval less than PTO." But if
> the app writes periodically at an interval less than PTO, the PTO will just
> be pushed out and will not fire until the sender cannot send anything due to
> the window limit. In this case, TLP loops do not seem to exist, if "TLP
> loops" means back-to-back TLP probes. Am I missing anything here?
>
> Thanks,
>
> Yi
>
> -----Original Message-----
> From: Yuchung Cheng <ycheng=40google.com@dmarc.ietf.org>
> Sent: Thursday, May 2, 2019 10:25 AM
> To: Matt Olson <maolson@microsoft.com>
> Cc: Yi Huang <huanyi@microsoft.com>; tcpm@ietf.org
> Subject: Re: [tcpm] Questions about TLP
>
> On Thu, May 2, 2019 at 10:09 AM Matt Olson <maolson@microsoft.com> wrote:
> >
> > Also, section 6.5.1 Step 1 is presented as a complete set of conditions
> > for scheduling a PTO, but it is missing a bullet point for TLPRxtOut.
> Thanks for checking, but checking TLPRxtOut is not necessary and is not
> preferred when scheduling a probe. The goal is to prevent more than one
> probe in flight, so checking TLPRxtOut right before sending the next one
> allows the best coverage of application-limited writes
> (burst-idle-burst-....).
>
>
>
>
> >
> > -----Original Message-----
> > From: Yuchung Cheng <ycheng@google.com>
> > Sent: Wednesday, May 1, 2019 11:33 PM
> > To: Yi Huang <huanyi=40microsoft.com@dmarc.ietf.org>
> > Cc: tcpm@ietf.org; Matt Olson <maolson@microsoft.com>
> > Subject: Re: [tcpm] Questions about TLP
> >
> > On Wed, May 1, 2019 at 10:48 PM Yi Huang <huanyi=40microsoft.com@dmarc.ietf.org> wrote:
> > >
> > > Hi RACK/TLP authors,
> > >
> > >
> > >
> > > I have the following questions (page numbers refer to
> > > draft-ietf-tcpm-rack-05):
> > >
> > >
> > >
> > > 1. On page 16, it says “Finally, if the time at which an RTO would fire
> > > (here denoted "TCP_RTO_expire") is sooner than the computed time for the
> > > PTO, then a probe is scheduled to be sent at that earlier time.” What does
> > > this TCP_RTO_expire actually mean? Does it mean there is another RTO timer
> > > (possibly started some time in the past) running along with the PTO, or
> > > just RTT + 4*RTTVar + Now()? Also, why would another probe be sent at RTO
> > > expiration time instead of treating it like an RTO and collapsing the cwnd?
> >
> > TCP_RTO_expire() should return a timeout value for regular RTO, i.e.
> > SRTT + 4*RTTVAR + Now()
> > We use TCP_RTO_expire() because many implementations, including Linux,
> > do not use the exact RFC formula.
> >
> > The reason it's not treated as a regular RTO is to avoid resetting the
> > congestion window to 1. The rationale is that if the probe was sent within
> > min(PTO, RTO) and was delivered successfully, there's no need to reset the
> > congestion window and restart slow start, similar to the rationale of
> > Reno's fast recovery reducing the window to ssthresh instead of 1. We have
> > found this strategy benefits wireless connections, as the chance of
> > spurious RTO is high due to the delay variation.
> >
> > >
> > >
> > >
> > > 2. On page 18, section 6.6, a new variable TLPRxtOut is defined, and it
> > > is stated that TLPRxtOut is used to guarantee that there is only one
> > > outstanding TLP retransmission. However, it is not clear to me when
> > > TLPRxtOut should actually be checked. Should it be checked in 6.5.1 Step 1
> > > (checking whether we should and can schedule a TLP) or in TLP_send_probe()?
> >
> > This is in
> >
> > 6.6.2.  Recording loss probe states
> >
> >    Senders MUST only send a TLP loss probe retransmission if TLPRxtOut
> >    is false.  This ensures that at any given time a connection has at
> >    most one outstanding TLP retransmission.  This allows the sender to
> >
> > but I agree it'd be clearer to include this in the TLP_send_probe()
> > pseudocode. I'll incorporate it in the next rev.
> > >
> > >
> > >
> > > Thanks,
> > >
> > >
> > >
> > > Yi
> > >
> > >
> > >
> > > _______________________________________________
> > > tcpm mailing list
> > > tcpm@ietf.org
> > > https://www.ietf.org/mailman/listinfo/tcpm
>
>