Re: [tcpm] A question about PTO/RTO rescheduling in RACK draft

Neal Cardwell <ncardwell@google.com> Wed, 17 April 2019 20:30 UTC

To: hiren panchasara <hiren@strugglingcoder.info>
Cc: Matt Olson <maolson@microsoft.com>, draft-cheng-tcpm-rack@ietf.org, "tcpm@ietf.org" <tcpm@ietf.org>, Yi Huang <huanyi=40microsoft.com@dmarc.ietf.org>, Yuchung Cheng <ycheng@google.com>, Nandita Dukkipati <nanditad@google.com>, Priyaranjan Jha <priyarjha@google.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/MKhg6k5wtms9MaA0lhX3IIm2lxE>

Hi Yi and Hiren,

Thank you for raising this question and the continued discussion.

I think there are a few concrete mechanisms that kick into play in the kind
of scenario Yi lays out.

(1) The TLP is scheduled at the minimum of the "native" TLP time and the
"native" RTO time, as laid out in section 6.5.1
<https://tools.ietf.org/html/draft-ietf-tcpm-rack-04#section-6.5.1>:

   TLP_timeout():
       ...

       If Now() + PTO > TCP_RTO_expire():
           PTO = TCP_RTO_expire() - Now()


   This keeps the TLP (PTO) from being pushed past the time at which an RTO
previously would have fired.

(2) After the TLP is sent, an RTO timer is armed, and it fires if no ACK
arrives first. The intent is to avoid getting stuck in cycles of repeated
TLPs.
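
As a rough illustration of how (1) and (2) combine, here is a minimal Python
sketch. It is not the draft's pseudocode or the Linux code. The Sender class,
the probe_sent flag, next_timer(), and the 600 ms re-armed deadline are
hypothetical names and values chosen for the example, and the native PTO is
simplified to 2*SRTT (the draft's full computation has further terms that are
omitted here). The SRTT of 100 ms and the RTO deadline of 300 ms are loosely
modelled on the packetdrill script further down.

```python
from dataclasses import dataclass


@dataclass
class Sender:
    srtt_ms: int              # smoothed RTT
    rto_expire_ms: int        # absolute time at which the RTO would fire
    probe_sent: bool = False  # hypothetical flag: a TLP is already outstanding


def pto_ms(now_ms: int, s: Sender) -> int:
    """Mechanism (1): never let the probe land after the RTO deadline."""
    pto = 2 * s.srtt_ms  # simplified native PTO (assumption, see lead-in)
    if now_ms + pto > s.rto_expire_ms:
        pto = s.rto_expire_ms - now_ms
    return pto


def next_timer(now_ms: int, s: Sender) -> tuple:
    """Mechanism (2): once a probe is out, arm the RTO, not another TLP."""
    if s.probe_sent:
        return ("RTO", s.rto_expire_ms)
    return ("TLP", now_ms + pto_ms(now_ms, s))


s = Sender(srtt_ms=100, rto_expire_ms=300)

first = next_timer(0, s)     # write at t=0: native PTO fits before the RTO
second = next_timer(150, s)  # write at t=150: native PTO would overshoot

# The probe goes out at t=300. Mark it outstanding and re-arm the RTO.
# The 600 ms deadline is purely illustrative; the real value depends on
# the stack's RTO computation.
s.probe_sent = True
s.rto_expire_ms = 600
third = next_timer(300, s)

print(first, second, third)
```

With these assumed numbers, the second write would push the native probe time
to 350 ms, so the cap pulls it back to the 300 ms RTO deadline, and after the
probe the only timer left to arm is the RTO.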

To make this concrete, below is a (passing) packetdrill script documenting
the Linux TCP behavior in this kind of scenario, showing the points at which
a loss probe and a timeout happen.

We will need to update the draft to make these aspects clearer. Thank
you for raising this!

neal

------------------
// Test for TLP with an app that makes periodic small writes.
`nstat > /dev/null`

// Establish a connection.
    0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
   +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
   +0 bind(3, ..., ...) = 0
   +0 listen(3, 1) = 0

   +0 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
   +0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 8>
  +.1 < . 1:1(0) ack 1 win 257
   +0 accept(3, ..., ...) = 4

// Small write of 2*MSS.
   +0 write(4, ..., 2000) = 2000
   +0 > P. 1:2001(2000) ack 1
   +0 %{ assert tcpi_ca_state == TCP_CA_Open }%
// TLP is scheduled 200ms later, but app writes again before that.

// Small write of 2*MSS, which pushes back TLP timer.
+.150 write(4, ..., 2000) = 2000
   +0 > P. 2001:4001(2000) ack 1
   +0 %{ assert tcpi_ca_state == TCP_CA_Open }%

// Loss probe is a retransmission, scheduled at a time
// calculated by the RTO formula, 300ms after the
// first packet we sent.
+.150 > P. 3001:4001(1000) ack 1
   +0 %{ assert tcpi_ca_state == TCP_CA_Open }%
   +0 `nstat -a -z | grep LossProbes`

// Small write of 2*MSS.
+.150 write(4, ..., 2000) = 2000
   +0 > P. 4001:6001(2000) ack 1
   +0 %{ assert tcpi_ca_state == TCP_CA_Open }%

// RTO timer fires.
+.210 > . 1:1001(1000) ack 1
   +0 %{ assert tcpi_ca_state == TCP_CA_Loss }%
   +0 `nstat -a -z | grep TCPTimeouts`