Re: [tcpm] Thank you for the QUIC session in tcpm

Neal Cardwell <ncardwell@google.com> Thu, 15 November 2018 17:15 UTC

From: Neal Cardwell <ncardwell@google.com>
Date: Thu, 15 Nov 2018 12:15:38 -0500
Message-ID: <CADVnQy==Ev2Kiqk6qx5X3H+d7=CEhuTpD_9j6Rpy1X+CgMr0-g@mail.gmail.com>
To: Jana Iyengar <jri.ietf@gmail.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>, "tcpm@ietf.org" <tcpm@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/VkRuelK2YKDGi8ikgyOjE0iDtQU>
Subject: Re: [tcpm] Thank you for the QUIC session in tcpm

On Wed, Nov 14, 2018 at 11:18 PM Jana Iyengar <jri.ietf@gmail.com> wrote:

> Thanks for the clarification, Eric. I'm a bit surprised to see srtt be
> used by both Cubic and BBR, but that discussion's not for this thread.
>

Just to add context (sorry to dilute the thread!): the spot in Linux TCP
BBR that Eric pointed to, which uses srtt, is the initialization code
(bbr_init_pacing_rate_from_rtt()). The code uses srtt here to compute a
pacing rate (via pacing_rate = gain * cwnd / srtt) largely because Linux TCP
can hot-swap congestion control modules. Because of that possibility, the
cwnd and srtt that BBR inherits at initialization may have come from a
buffer-bloated CUBIC/Reno connection, where srtt may have inflated in
proportion to cwnd, so that cwnd / srtt is generally a better approximation
of the available bandwidth than cwnd / min_rtt. As you would imagine,
after that initialization, Linux TCP BBR does not use srtt (or min_rtt) to
update the pacing rate; it uses a function of the estimated bandwidth.
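
To make the cwnd / srtt vs. cwnd / min_rtt point concrete, here is a rough
sketch in Python (not the kernel code). All the numbers are made up for
illustration, the function name initial_pacing_rate is hypothetical, and the
gain is left at 1.0 only so the two estimates are easy to compare. It assumes
an idealized bloated connection whose cwnd is 4x the BDP, with the excess
sitting in the bottleneck queue so that srtt has grown in proportion to cwnd.

```python
import math


def initial_pacing_rate(gain: float, cwnd_bytes: float, rtt_s: float) -> float:
    """Return gain * cwnd / rtt, in bytes per second."""
    if rtt_s <= 0:
        raise ValueError("rtt must be positive")
    return gain * cwnd_bytes / rtt_s


# Hypothetical path: 10 Mbit/s bottleneck, 20 ms propagation RTT.
BOTTLENECK_BW = 1_250_000.0   # bytes per second
MIN_RTT = 0.020               # seconds

# Hypothetical inherited state from a buffer-bloated CUBIC/Reno connection:
# cwnd is 4x the BDP, and the queue has stretched srtt to cwnd / bandwidth.
bdp = BOTTLENECK_BW * MIN_RTT
cwnd = 4 * bdp
srtt = cwnd / BOTTLENECK_BW

# Estimate using the inflated srtt: the bloat in cwnd and srtt cancels out.
rate_from_srtt = initial_pacing_rate(1.0, cwnd, srtt)

# Estimate using min_rtt: the bloat in cwnd is not cancelled, so the
# estimate overshoots the bottleneck by the bloat factor.
rate_from_min_rtt = initial_pacing_rate(1.0, cwnd, MIN_RTT)

if __name__ == "__main__":
    print(f"bottleneck bandwidth: {BOTTLENECK_BW:.0f} B/s")
    print(f"cwnd / srtt:          {rate_from_srtt:.0f} B/s")
    print(f"cwnd / min_rtt:       {rate_from_min_rtt:.0f} B/s")
```

Under these assumptions the srtt-based estimate lands on the bottleneck
rate, while the min_rtt-based one is off by the same factor that cwnd is
bloated.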

best,
neal



> On Wed, Nov 14, 2018 at 9:44 PM Eric Dumazet <eric.dumazet@gmail.com>
> wrote:
>
>>
>>
>> On 11/12/2018 11:13 PM, Jana Iyengar wrote:
>> > On what to do with an unconfirmed RTO: So, my thought is that if you've
>> > hit an RTO, it's been a REALLY LONG time, because we've gone through TLP(s)
>> > by now and failed. (This might change if we decide to combine TLP and RTO,
>> > but that's for a later discussion.) It's not fully clear to me what the
>> > right responses here are, but some thoughts follow.
>> >
>> > 1/ If this was an actual increase in the RTT, it would naturally roll
>> > into the SRTT and RTTVAR. We might even consider resetting the RTT
>> > estimator, but we already know that the EWMA estimator we love takes just a
>> > while to catch on, and I don't want to litigate changing the estimator
>> > fundamentally here.
>> >
>> > 2/ Linux fq uses min_rtt IIRC, which shouldn't change as a result of an
>> > increase in RTT. This *might* be a good time to kick the min_rtt estimator
>> > to pace more conservatively.
>>
>> Linux fq does not use min_rtt; it uses the pacing rate set by TCP.
>>
>> The pacing rate for Cubic uses SRTT
>> (https://elixir.bootlin.com/linux/latest/source/net/ipv4/tcp_input.c#L806).
>>
>> For BBR, the pacing rate depends on various factors and phases, but SRTT is
>> used, not min_rtt:
>>
>> https://elixir.bootlin.com/linux/latest/source/net/ipv4/tcp_bbr.c#L231
>>
>>
>> >
>> > 3/ I'm not convinced that it makes sense to change the cwnd. It seems
>> > clear that an RTT increase with the same BW should increase the cwnd if
>> > anything, so it seems reasonable to continue with the same cwnd at least.
>> > If the BW changed, hopefully that'll be reflected in a congestion signal,
>> > but in the absence of a congestion signal, I think we should keep the cwnd
>> > as is.
>> >
>> >
>> > On Tue, Nov 13, 2018 at 12:35 AM Praveen Balasubramanian
>> > <pravb@microsoft.com> wrote:
>> >
>> >     Yuchung this might be worth experimenting with and getting more
>> >     data on. Maybe there should be some reaction to an unconfirmed RTO even if
>> >     it is not a reduction all the way to LW? I like Jana's suggestion that the
>> >     implementation SHOULD drop the cwnd even for unconfirmed RTO if not pacing.
>> >
>> >     I checked the latest draft and pacing is RECOMMENDED. I am
>> >     noticing that transport algorithms in Linux and in QUIC are evolving
>> >     assuming pacing, which is a function usually implemented and configured
>> >     outside of the transport. So the transport is potentially making an
>> >     assumption which may result in safety issues. For example, BBR congestion
>> >     control may not work very well if pacing was disabled or misconfigured? I
>> >     don’t know of any robust solution to this other than to build pacing into
>> >     the transport itself. Pacing is also very challenging for low latency flows
>> >     because of the lack of support for fine-grained timers.
>> >
>> >     -----Original Message-----
>> >     From: Yuchung Cheng <ycheng@google.com>
>> >     Sent: Monday, November 12, 2018 8:37 AM
>> >     To: Jana Iyengar <jri.ietf@gmail.com>
>> >     Cc: Praveen Balasubramanian <pravb@microsoft.com>; Bob Briscoe
>> >     <ietf@bobbriscoe.net>; Ian Swett <ianswett@google.com>;
>> >     tcpm@ietf.org Extensions <tcpm@ietf.org>
>> >     Subject: Re: [tcpm] Thank you for the QUIC session in tcpm
>> >
>> >     On Mon, Nov 12, 2018 at 4:13 AM, Jana Iyengar <jri.ietf@gmail.com> wrote:
>> >     > Praveen,
>> >     >
>> >     > The point you're raising -- that we've lost the ack clock after an RTO
>> >     > -- is a reasonable point. My argument is that with pacing, cwnd
>> >     > reduction is unwarranted because in the extreme case, this collapses
>> >     > to restart after idle, where sending at cwnd with pacing is reasonable
>> >
>> >     I agree w/ Jana as well - the most generic way to treat burst due
>> >     to prior_inflight << cwnd is pacing. With QUIC's approach, when RTO fires,
>> >     cwnd remains unchanged until the ACK of the first retransmission (i.e. a
>> >     probe packet) comes back. Therefore the delay is one RTT and the potential
>> >     damage is an additional cwnd-worth of burst.
>> >
>> >     Yes the worst case is more aggressive, and it can be too much for
>> >     the DC incast case if pacing isn't supported - one idea is to selectively
>> >     enable that if RTT variance is very large vs RTT.
>> >
>> >     >
>> >     > The draft does not say that a sender should reduce the cwnd if it is
>> >     > not pacing, which we should add. Does that make sense?
>> >     >
>> >     > - jana
>> >     >
>> >     > On Fri, Nov 9, 2018 at 8:34 AM Praveen Balasubramanian
>> >     > <pravb=40microsoft.com@dmarc.ietf.org> wrote:
>> >     >>
>> >     >> Yuchung I brought that difference up in an email to the quic wg
>> >     >> earlier this week.
>> >     >>
>> >     >> In app send limited case, inflight could be very small compared to cwnd.
>> >     >> So in QUIC there is potential to send a burst out after a long idle
>> >     >> period (with outstanding data) where TCP wouldn't. The draft claims
>> >     >> this is okay to do because RTO may have been a result of RTT increase
>> >     >> instead of loss. Is there data to suggest on which side we should
>> >     >> err? i.e. data on what are the chances that an RTO is due to an
>> >     >> RTT increase versus loss.
>> >     >>
>> >     >> Do you see any safety concerns with delayed reduction of cwnd in case
>> >     >> where the RTO is not spurious?
>> >     >>
>> >     >> -----Original Message-----
>> >     >> From: tcpm <tcpm-bounces@ietf.org> On Behalf Of Yuchung Cheng
>> >     >> Sent: Thursday, November 8, 2018 4:38 PM
>> >     >> To: Bob Briscoe <ietf@bobbriscoe.net>
>> >     >> Cc: Ian Swett <ianswett@google.com>; tcpm IETF list <tcpm@ietf.org>
>> >     >> Subject: Re: [tcpm] Thank you for the QUIC session in tcpm
>> >     >>
>> >     >> On Thu, Nov 8, 2018 at 3:14 AM, Bob Briscoe <ietf@bobbriscoe.net> wrote:
>> >     >> >
>> >     >> > I just wanted to thank Jana for explaining QUIC loss recovery to us
>> >     >> > (and QUIC CC as far as it goes).
>> >     >> > And thank you Jana, Ian, the chairs of both WGs (and anyone else
>> >     >> > involved) for setting it up.
>> >     >> >
>> >     >> > If one is not full-time on QUIC, it's very difficult to keep up
>> >     >> > with all the changes. But now we have a checkpoint to start from, I
>> >     >> > feel I will not be wasting people's time if I try to get involved -
>> >     >> > at least I only might say something un-QUIC occasionally, rather
>> >     >> > than nearly always. This has allowed people who understand how TCP
>> >     >> > could be improved to help with QUIC, when working on QUIC isn't
>> >     >> > their day job.
>> >     >> >
>> >     >> > Again, Thank you.
>> >     >>
>> >     >> I like particularly that QUIC only reduces cwnd to 1 after the loss
>> >     >> is confirmed, not when the RTO fires. It should be very feasible for TCP
>> >     >> (at least Linux) w/ TCP timestamps. It'll save a lot of spurious cwnd
>> >     >> reductions!
>> >     >>
>> >     >> Also IMHO TCP w/ quality timestamps are almost as good as QUIC pkt-ids.
>> >     >> Google internally uses usec. We wish we could upstream it but RFC
>> >     >> needs to be updated.
>> >     >>
>> >     >> >
>> >     >> >
>> >     >> > Bob
>> >     >> >
>> >     >> >
>> >     >> > --
>> >     >> > ________________________________________________________________
>> >     >> > Bob Briscoe                               http://bobbriscoe.net/
>> >     >> >
>> >     >> > _______________________________________________
>> >     >> > tcpm mailing list
>> >     >> > tcpm@ietf.org
>> >     >> > https://www.ietf.org/mailman/listinfo/tcpm