Re: [tcpm] TLP questions

Neal Cardwell <ncardwell@google.com> Mon, 14 May 2018 21:21 UTC

References: <CY4PR21MB063011EB9ABCD23BABC2EDC0B6990@CY4PR21MB0630.namprd21.prod.outlook.com> <CY4PR21MB0630AF5B03B8C260AD72E366B6990@CY4PR21MB0630.namprd21.prod.outlook.com> <CADVnQyk04js7VaFdUKFYg6h8yE2ZzoDMG_EPeS_hKYb_tnesww@mail.gmail.com> <CAO249ydhwbnGvJNdHwJGBO6h_++mHKY6Xe+n+vFX4vg9rsvuhQ@mail.gmail.com>
In-Reply-To: <CAO249ydhwbnGvJNdHwJGBO6h_++mHKY6Xe+n+vFX4vg9rsvuhQ@mail.gmail.com>
From: Neal Cardwell <ncardwell@google.com>
Date: Mon, 14 May 2018 17:21:36 -0400
Message-ID: <CADVnQykaMgWWbgTYD_8rdh61wUqBmAhcSObxRa4-g8NcQtoqPw@mail.gmail.com>
To: Yoshifumi Nishida <nishida@sfc.wide.ad.jp>
Cc: Praveen Balasubramanian <pravb@microsoft.com>, Priyaranjan Jha <priyarjha@google.com>, "tcpm@ietf.org" <tcpm@ietf.org>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/kLHR9chGbUr3GzkLDr3YDdUXUBU>
Subject: Re: [tcpm] TLP questions

On Thu, May 10, 2018 at 7:08 PM Yoshifumi Nishida <nishida@sfc.wide.ad.jp> wrote:

> Hi Neal,

> On Wed, May 9, 2018 at 1:27 PM, Neal Cardwell <ncardwell@google.com> wrote:
> > On Wed, May 9, 2018 at 2:29 PM Praveen Balasubramanian <pravb@microsoft.com>
...
> >> “If a previously unsent segment exists AND
> >>
> >>          the receive window allows new data to be sent:
> >>
> >>            Transmit that new segment
> >>
> >>            FlightSize += SMSS
> >>
> >>        Else:
> >>
> >>            Retransmit the last segment”
> >>
> >> This needs to be crisper about what is meant by “previously unsent” and
> >> “last segment”. For example the sender could have sent a large amount of
> >> data and then taken a full RTO. In this case if PTO fires, do “previously
> >> unsent” and “last segment” refer to MSS size segments straddling just before
> >> and after SND.NXT? OR do they straddle around the largest sent sequence
> >> number ever in the connection lifetime?
> >
> >
> > Very good questions. I like your suggestion to be crisper in this section.
> > The Linux TCP stack does not "rewind" SND.NXT upon RTO, so in the Linux TCP
> > stack the SND.NXT point and "largest sent sequence number ever in the
> > connection lifetime" are basically the same point. That is the framework in
> > which we were thinking for those lines.
> >
> > In my mind...
> >
> > By "a previously unsent segment" we mean basically "the next segment (of MSS
> > or fewer bytes) that the sender would normally send if it had available cwnd
> > at this time." That is something that presumably every production-quality
> > TCP stack has very quick access to.
> >
> > By "the last segment" we mean "the highest-sequence segment (of MSS or fewer
> > bytes) that has already been transmitted and not ACKed or SACKed." I imagine
> > this should also be generally very quick to access (at least it can be
> > quickly accessed in two generations of the Linux TCP write queue). Let us
> > know if not, and we can discuss.
> >
> > Does that help clarify those parts? If so, we can update the text to
> > incorporate something like that (suggestions?).

> I am thinking that we can just say "send one (SMSS or smaller) segment
> that contains up to highest sequence number it has sent."
> I am a bit wondering if not ACKed or SACKed requirement should be
> mandatory, although it will be a bit redundant.

Yes, good point; the "not ACKed or SACKed" I added in my email message was
redundant.

So we could replace:
   "Retransmit the last segment"
with:
   "Retransmit the highest-sequence segment sent so far"

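As a rough illustration of that selection rule, here is a minimal Python
sketch (not the Linux code). All names (SendState, write_end, choose_probe)
are hypothetical. It assumes SND.NXT is never rewound, trims a new segment to
fit the receive window, and approximates "the highest-sequence segment sent
so far" as the last SMSS-or-fewer bytes below SND.NXT; a real stack would use
the boundaries of the segment it actually sent.

```python
from dataclasses import dataclass


@dataclass
class SendState:
    snd_una: int    # oldest unacknowledged sequence number
    snd_nxt: int    # next sequence number to send (assumed never rewound)
    write_end: int  # end of data queued by the application (hypothetical)
    rcv_wnd: int    # peer's advertised receive window, in bytes
    smss: int       # sender maximum segment size


def choose_probe(s):
    """Return (kind, start_seq, length) for the loss probe, or None."""
    window_room = s.snd_una + s.rcv_wnd - s.snd_nxt
    unsent = s.write_end - s.snd_nxt
    if unsent > 0 and window_room > 0:
        # A previously unsent segment exists and the receive window
        # allows new data: send the next segment of SMSS or fewer bytes.
        return ("new", s.snd_nxt, min(s.smss, unsent, window_room))
    if s.snd_nxt > s.snd_una:
        # Otherwise retransmit the highest-sequence data sent so far.
        length = min(s.smss, s.snd_nxt - s.snd_una)
        return ("retransmit", s.snd_nxt - length, length)
    return None  # nothing outstanding, so there is nothing to probe with
```

With unsent data and an open window the sketch picks new data starting at
SND.NXT; with nothing unsent, or with a closed window, it falls back to the
bytes just below SND.NXT.
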
> > A sender should schedule a PTO only if all of the following conditions are
> > met:
> >
> > The connection supports SACK [RFC2018]

> Do we need to check this? I mean RACK already presumes SACK nodes. I
> thought TLP does as well.
> Or, do we want to use TLP with non-SACK nodes?

Yes, both RACK and TLP require SACK. I agree that it could be considered
redundant to specify the SACK dependency in the TLP section, given that we
have already mentioned in section "3. Requirements" that RACK depends on
SACK. But IMHO it is worthwhile to keep this dependency separately listed
here for TLP, to try to make it clear/explicit that the TLP algorithm has
its own dependency on SACK.

> > The connection has no SACKed sequences in the SACK scoreboard
> >
> > The connection is not in loss recovery
> >
> > The most recently transmitted data was not itself a TLP probe (i.e. a sender
> > MUST NOT send consecutive or back-to-back TLP probes).

> I think the third condition might be a bit ambiguous and may require
> another variable. If we check TLPRxtOut and TLPHighRxt, might it be
> sufficient?

Yes, I agree this condition is not as clear as it should be. There are
really several considerations:

(a) we don't want an unending chain of TLPs, where after sending a TLP
probe we rearm the PTO timer and send another TLP

(b) we can't allow multiple outstanding TLP retransmissions

(Checking TLPRxtOut would cover (b) but not (a). And checking TLPRxtOut is
discussed elsewhere in the draft.)

The Linux TCP TLP code handles this as follows:

For (a), we ensure this structurally in the call sequence: after sending a
TLP probe, we rearm the RTO timer, not the TLP timer. This is already
discussed in the section "Phase 2: Sending a loss probe".

For (b), before sending a TLP retransmission we effectively check TLPRxtOut
and do not send a TLP at all if there is an outstanding TLP retransmission.
This is already discussed in "Recording loss probe states".

Given that those issues are already discussed elsewhere, I'd propose to
remove that line "The most recently transmitted data was not itself a TLP
probe (i.e. a sender MUST NOT send consecutive or back-to-back TLP
probes).". Instead in the -04 draft I'd propose we just add a brief note at
the end of the "Phase 2: Sending a loss probe" section:

   Note that after transmitting a TLP, the sender MUST arm an RTO timer,
   and not the PTO timer. This ensures that the sender does not send
   repeated, back-to-back TLP probes.
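
To make the interaction of (a) and (b) concrete, here is a minimal Python
sketch of the timer logic, under stated assumptions. It is not the Linux
implementation: Conn, may_schedule_pto, on_data_sent, and on_pto_fired are
hypothetical names, the timer is modeled as a plain string, and arming the
RTO timer even when the probe is skipped is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Conn:
    sack_enabled: bool = True
    sacked_ranges: int = 0         # SACKed ranges in the scoreboard
    in_loss_recovery: bool = False
    tlp_rxt_out: bool = False      # a TLP retransmission is outstanding
    timer: Optional[str] = None    # "PTO", "RTO", or None


def may_schedule_pto(c):
    """The three scheduling conditions that remain after the proposed edit."""
    return c.sack_enabled and c.sacked_ranges == 0 and not c.in_loss_recovery


def on_data_sent(c):
    # Ordinary transmission: arm the PTO if allowed, otherwise the RTO.
    c.timer = "PTO" if may_schedule_pto(c) else "RTO"


def on_pto_fired(c, have_new_data):
    """Handle PTO expiry. Returns True if a probe was sent."""
    sent = False
    if have_new_data:
        sent = True                # probe carries new data
    elif not c.tlp_rxt_out:
        sent = True                # probe is a retransmission
        c.tlp_rxt_out = True       # (b): at most one outstanding
    # (a): always arm the RTO timer here, never the PTO timer, so
    # probes cannot chain back-to-back.
    c.timer = "RTO"
    return sent
```

Because on_pto_fired never leaves the timer in the "PTO" state, a second
probe can only follow after some other event (such as a new transmission)
arms the PTO timer again.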

Thank you, Yoshifumi!

neal