Re: [tcpm] A question about PTO/RTO rescheduling in RACK draft

Neal Cardwell <> Thu, 18 April 2019 13:19 UTC

From: Neal Cardwell <>
Date: Thu, 18 Apr 2019 09:18:57 -0400
To: hiren panchasara <>,
Cc: Matt Olson <>, "" <>, Yi Huang <>, Yuchung Cheng <>, Nandita Dukkipati <>, Priyaranjan Jha <>
List-Id: TCP Maintenance and Minor Extensions Working Group <>

On Wed, Apr 17, 2019 at 5:15 PM hiren panchasara
<> wrote:
> I'll keep this to the point I thought got missed.
> Is where we can see `Scheduling a loss probe` getting changed.
> Specifically, following condition has been removed from -04:
>    3.  The connection is either limited by congestion window (the data
>        in flight matches or exceeds the cwnd) or application-limited
>        (there is no unsent data that the receiver window allows to be
>        sent).
> And afaict, Linux code still has this condition in. If you can provide
> some rationale behind this change, it'd be great.

We removed the code for that condition from Linux around 2017-12-13,
in this commit (GPLv2 patch in this link):

The commit description gives the full rationale. Here is a paraphrase
of why we removed that code, and the corresponding condition in the
draft:


Disallowing TLP when there is unused cwnd mainly had the effect of
disallowing TLP when there is TSO deferral, Nagle deferral, or we hit
the receiver window limit. Why? Because basically every application
write() or incoming ACK causes us to run the TCP transmit loop to see
if we can send more, and if we sent something we then check whether we
should schedule a TLP. At that point, there are a few common reasons
why some cwnd budget could still be unused:

    (a) receiver window limits
    (b) Nagle deferral
    (c) TSO deferral (deferring with the hope of sending a bigger TSO
        offload burst later)
    (d) intra-send-host flow control (in Linux, TCP small queues, aka TSQ)
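
To make the four cases concrete, here is a toy Python model of a
transmit loop that stops early and reports why some cwnd budget was
left unused. This is only a sketch: the names (SendState, StopReason,
transmit_loop), the boolean flags, and the order of the checks are
illustrative assumptions, not the Linux implementation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class StopReason(Enum):
    RWND_LIMIT = "a"      # (a) receiver window limits
    NAGLE_DEFERRAL = "b"  # (b) Nagle deferral
    TSO_DEFERRAL = "c"    # (c) TSO deferral
    TSQ_LIMIT = "d"       # (d) intra-send-host flow control (TSQ in Linux)


@dataclass
class SendState:
    """Hypothetical, heavily simplified sender state (units: packets)."""
    cwnd: int                    # congestion window
    in_flight: int               # sent but not yet ACKed
    unsent: int                  # queued by the application, not yet sent
    rwnd_room: int               # what the receiver window still allows
    nagle_defer: bool = False    # assumed flag: Nagle is holding data back
    tso_defer: bool = False      # assumed flag: waiting for a bigger burst
    tsq_throttled: bool = False  # assumed flag: local queue limit reached


def transmit_loop(s: SendState) -> Tuple[int, Optional[StopReason]]:
    """Send what we can.

    Returns (packets sent, reason we stopped while cwnd was still unused).
    The reason is None when we stopped because cwnd was full or there was
    no more data to send.
    """
    sent = 0
    while s.unsent > 0 and s.in_flight < s.cwnd:
        if s.rwnd_room == 0:
            return sent, StopReason.RWND_LIMIT
        if s.nagle_defer:
            return sent, StopReason.NAGLE_DEFERRAL
        if s.tso_defer:
            return sent, StopReason.TSO_DEFERRAL
        if s.tsq_throttled:
            return sent, StopReason.TSQ_LIMIT
        s.unsent -= 1
        s.in_flight += 1
        s.rwnd_room -= 1
        sent += 1
    return sent, None
```

For example, SendState(cwnd=10, in_flight=2, unsent=5, rwnd_room=3)
sends three packets and then stops with cwnd budget left over because
of the receiver window.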

For (d), the TSQ mechanism will allow us to send more packets after
the next packet tx completion, so we don't really need a TLP (in
practice it shouldn't matter whether we schedule one or not). For (a),
(b), or (c), however, the sender won't send any more packets until it
receives another ACK. If the whole flight of data was lost, or all the
ACKs for the flight were lost, then the sender won't get any more
ACKs, so in this case we should ideally schedule and send a TLP to get
more feedback.
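
As a rough illustration of the argument above, the following Python
sketch checks, for each of the cases (a) through (d), whether the
removed condition would have blocked a TLP that the sender needs. The
function names and the one-line model of the old condition ("no TLP
while cwnd budget is unused and unsent data is waiting") are my
simplifying assumptions, not the actual kernel code or draft text.

```python
# Cases (a)-(d) from the list above. True means the sender can resume
# sending without receiving another ACK.
RESUMES_WITHOUT_ACK = {
    "a": False,  # receiver window limit: sender waits for another ACK
    "b": False,  # Nagle deferral: sender waits for another ACK
    "c": False,  # TSO deferral: sender waits for another ACK
    "d": True,   # TSQ: the next tx completion lets the sender continue
}


def old_rule_allows_tlp(cwnd: int, in_flight: int, unsent: int) -> bool:
    """Assumed model of the removed condition.

    A TLP is allowed only if the connection is cwnd-limited or has
    nothing left to send.
    """
    return in_flight >= cwnd or unsent == 0


def tlp_wanted(reason: str) -> bool:
    """A TLP is useful when only an ACK could restart transmission."""
    return not RESUMES_WITHOUT_ACK[reason]


def old_rule_blocks_needed_tlp(reason: str, cwnd: int,
                               in_flight: int, unsent: int) -> bool:
    """True when the old condition suppresses a TLP that is wanted."""
    return tlp_wanted(reason) and not old_rule_allows_tlp(
        cwnd, in_flight, unsent)


if __name__ == "__main__":
    # Unused cwnd (4 of 10 in flight) with 3 packets still unsent.
    for reason in "abcd":
        print(reason, old_rule_blocks_needed_tlp(reason, 10, 4, 3))
```

In this model the old condition blocks a wanted TLP for (a), (b), and
(c), but not for (d), which matches the reasoning for dropping it.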