[tcpm] Re: using SACK info for RTTM?
Neal Cardwell <ncardwell@google.com> Wed, 05 June 2024 18:57 UTC
From: Neal Cardwell <ncardwell@google.com>
Date: Wed, 05 Jun 2024 14:56:51 -0400
Message-ID: <CADVnQymnGAiBeOTu0O_Bm-H0MpTJTcmWyJb_qRi8Lx8srBXNGA@mail.gmail.com>
To: Yoshifumi Nishida <nsd.ietf@gmail.com>
CC: "tcpm@ietf.org Extensions" <tcpm@ietf.org>
Subject: [tcpm] Re: using SACK info for RTTM?
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/D4ISSYFq0J3Z-5oG06LCk5QEiPs>
On Wed, Jun 5, 2024 at 2:21 PM Yoshifumi Nishida <nsd.ietf@gmail.com> wrote:

> Hi Yuchung, Neal, thank you so much. It’s interesting.
>
> So, it might be a good time to revive
> https://datatracker.ietf.org/doc/draft-yang-tcpm-ets/ ?
>

Yes, reviving the work for some kind of standardization of more precise TCP timestamps is something we would like to do when time permits.

> OTOH, I’m thinking why DSACK is not sufficient here.
>

DSACK can work sometimes, however it is not as good as timestamp undo, for at least a few reasons:

(1) DSACK undo is slower: DSACK undo takes about two round trips longer than timestamp undo. With DSACK undo the data sender has one extra round trip to make a full set of spurious retransmissions for apparent holes in the sequence space, and then a second round trip to receive all the DSACKs for the spurious retransmissions. Only then can the data sender undo the congestion control response. With timestamp-based undo, usually a data sender will receive an ACK very shortly (say, O(1ms)) after its spurious retransmit that covers the spuriously retransmitted sequence and has a TS ECR from before the retransmit, allowing the undo to happen immediately.

(2) DSACK undo is unreliable: if even a single ACK packet containing a DSACK is lost, then the data sender cannot undo the congestion control response. For example, if a flow is fully utilizing a 10Gbps * 100ms path, and thus (1500B MTU) its cwnd is at least 82,562, then the loss rate in the direction of returning ACKs needs to be zero in that round trip, or on average less than 1/82,562, or < 0.0012%. Not impossible, but a stringent requirement. With timestamp-based undo every incoming ACK has a TS ECR value that allows undo, so ACK loss is not a problem.
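
[Editor's sketch] A minimal Python illustration of points (1) and (2) above, under stated assumptions. The timestamp check follows the Eifel detection rule of RFC 3522: an ACK whose echoed timestamp is older than the TSval carried by the first retransmission must have been triggered by the original transmission. The second function redoes the cwnd arithmetic. All function and parameter names are hypothetical, timestamp wraparound is ignored, and the 1514-byte packet size (1500B MTU plus a 14-byte Ethernet header) is an assumption chosen because it reproduces the 82,562 figure; the message itself does not say which size was used.

```python
def ack_allows_timestamp_undo(ack_tsecr: int, first_retransmit_tsval: int) -> bool:
    """Eifel-style check (RFC 3522), ignoring timestamp wraparound.

    True if the ACK echoes a timestamp strictly older than the TSval sent
    with the first retransmission, i.e. the ACK was triggered by the
    original transmission and the retransmission was spurious.
    """
    return ack_tsecr < first_retransmit_tsval


def min_cwnd_packets(rate_bps: int, rtt_ms: int, packet_bytes: int) -> int:
    """Whole packets needed to fill a path of the given rate and RTT."""
    bdp_bytes = rate_bps * rtt_ms // 1000 // 8  # bandwidth-delay product
    return bdp_bytes // packet_bytes


# Point (2): 10 Gbps * 100 ms, assuming 1514 bytes per packet on the wire.
cwnd = min_cwnd_packets(10_000_000_000, 100, 1514)
max_ack_loss_rate = 1 / cwnd  # at most one lost ACK per cwnd, on average

print(cwnd)                        # 82562
print(f"{max_ack_loss_rate:.4%}")  # 0.0012%

# Point (1): the retransmission carried TSval=5000; an ACK arriving ~1 ms
# later still echoes TSval=4900 from the original transmission.
print(ack_allows_timestamp_undo(ack_tsecr=4900, first_retransmit_tsval=5000))  # True
```

With a plain 1500-byte packet size the same calculation gives 83,333 packets, so the conclusion in (2) does not depend on the exact framing assumption.
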
neal

> Thanks,
> --
> Yoshi
>
> On Wed, Jun 5, 2024 at 10:06 AM Yuchung Cheng <ycheng@google.com> wrote:
>
>> Also TCP timestamp needs to really move to usec level for today's
>> data-center networks, which Eric Dumazet finally upstreamed that feature
>> (to opt-in). anything beyond 10us can't be used in Eifel
>>
>> On Wed, Jun 5, 2024 at 6:41 AM Neal Cardwell <ncardwell@google.com>
>> wrote:
>>
>>> IMHO by far the biggest benefit of TCP timestamps is not in RTT
>>> measurement or PAWS, but in using them for "Eifel" undo (a la RFC 3522, RFC
>>> 4015): quickly detecting spurious loss detection events due to reordering,
>>> and quickly undoing the spurious congestion control slow-down response.
>>> This is important since reordering is increasingly common due to many
>>> increasingly common network mechanisms: link-layer retransmissions for
>>> wifi/cellular links, traffic engineering, multipathing and ECMP/WCMP
>>> load-balancing, protective load balancing (SIGCOMM 2022), protective
>>> reroute (SIGCOMM 2023), multi-queue NICs, etc. Those factors make the 12
>>> bytes of TCP option space overwhelmingly worth it.
>>>
>>> best regards,
>>> neal
>>>
>>>
>>> On Wed, Jun 5, 2024 at 3:03 AM Yoshifumi Nishida <nsd.ietf@gmail.com>
>>> wrote:
>>>
>>>> Hi Yuchung,
>>>>
>>>> Thanks for the explanation.
>>>> I thought a bit about the trade-off between using 12 bytes options
>>>> space and giving up measuring RTTs for retransmitted packets.
>>>> But, I am inclined to prefer measuring RTTs for now.
>>>>
>>>> --
>>>> Yoshi
>>>>
>>>> On Mon, Jun 3, 2024 at 1:57 PM Yuchung Cheng <ycheng@google.com> wrote:
>>>>
>>>>> hi Yoshifumi,
>>>>>
>>>>> Linux only uses TS-opts if needed to disambiguate on RTT samples
>>>>> covering sequences that have been retransmitted. This applies to SACK or
>>>>> non-SACK. In other words, if an S/ACK covers a sequence range that has
>>>>> never been retransmitted, Linux does not use timestamp options.
>>>>>
>>>>> On Mon, Jun 3, 2024 at 1:29 PM Yoshifumi Nishida <nsd.ietf@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Neal, thank you so much for the comments.
>>>>>>
>>>>>> The linux algorithm you've described makes sense to me and it seems
>>>>>> the scheme doesn't require timestamp options.
>>>>>> However, as far as I've read linux code, it seems that linux still
>>>>>> uses timestamp options for RTT measurement to some extent.
>>>>>> I'm curious why linux is mixing two schemes for RTTM.
>>>>>> --
>>>>>> Yoshi
>>>>>>
>>>>>> On Mon, Jun 3, 2024 at 8:57 AM Neal Cardwell <ncardwell@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Jun 3, 2024 at 11:02 AM Yoshifumi Nishida <
>>>>>>> nsd.ietf@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi folks,
>>>>>>>>
>>>>>>>> While I was checking RFC7323, I found the following sentence.
>>>>>>>>
>>>>>>>> RTTM update processing explicitly excludes segments not updating
>>>>>>>> SND.UNA. The original text could be interpreted to allow taking
>>>>>>>> RTT samples when SACK acknowledges some new, non-continuous
>>>>>>>> data.
>>>>>>>>
>>>>>>>> I am a bit curious about the rationale of this sentence.
>>>>>>>> It seems to me that we cannot measure RTT when we have a gap in packet sequence with this rule.
>>>>>>>>
>>>>>>>>
>>>>>>> Yes, that rule forbids using RFC7323 timestamps for calculating RTT
>>>>>>> samples for SACKed sequence ranges.
>>>>>>>
>>>>>>> The rationale: AFAIK this rule is a necessary consequence of the
>>>>>>> conditions under which TS.Recent is updated.
>>>>>>>
>>>>>>> The rules for updating TS.Recent are in sec 4.3, "Which Timestamp to
>>>>>>> Echo":
>>>>>>> https://datatracker.ietf.org/doc/html/rfc7323#section-4.3
>>>>>>> Rule (2) in sec 4.3 says:
>>>>>>> If:
>>>>>>> SEG.TSval >= TS.Recent and SEG.SEQ <= Last.ACK.sent
>>>>>>> then SEG.TSval is copied to TS.Recent; otherwise, it is ignored.
>>>>>>>
>>>>>>> Since out-of-order sequence ranges that are SACKed will fail the
>>>>>>> SEG.SEQ <= Last.ACK.sent check, SACKed sequence ranges will not update
>>>>>>> TS.Recent. So using TS.Recent to calculate an RTT sample for a SACKed
>>>>>>> sequence range could, in general, give a vastly overestimated RTT sample.
>>>>>>> So that's why it's forbidden by the RFC.
>>>>>>>
>>>>>>> However, in practice usually this does not need to be a big deal.
>>>>>>> For example, Linux TCP still obtains an RTT sample for every
>>>>>>> non-retransmitted SACKed sequence range, by:
>>>>>>>
>>>>>>> (a) recording the transmit time of every sequence range
>>>>>>> (b) recording whether that sequence range was retransmitted, and then
>>>>>>> (c) using those two pieces of information when that sequence range
>>>>>>> is cumulatively or selectively ACKed, to calculate an RTT sample
>>>>>>> (rtt_sample = now - transmit_timestamp) if the sequence range was never
>>>>>>> retransmitted.
>>>>>>>
>>>>>>> So, in Linux TCP, SACKed sequence ranges fail to generate an RTT
>>>>>>> sample only when they were previously retransmitted.
>>>>>>>
>>>>>>> best regards,
>>>>>>> neal
>>>>>>>
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> --
>>>>>>>> Yoshi
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> tcpm mailing list -- tcpm@ietf.org
>>>>>>>> To unsubscribe send an email to tcpm-leave@ietf.org
>>>>>>>>
>>>>>>> _______________________________________________
>>>>>> tcpm mailing list -- tcpm@ietf.org
>>>>>> To unsubscribe send an email to tcpm-leave@ietf.org
>>>>>>
>>>>>
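
[Editor's sketch] Steps (a) to (c) in the quoted message can be illustrated roughly as follows. This is not Linux code: the class, its methods, and the keying of ranges by starting sequence number are all hypothetical simplifications, and times are plain floats in seconds. It only shows the bookkeeping idea, which yields an RTT sample for any cumulatively or selectively ACKed range that was never retransmitted.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class SentRange:
    transmit_time: float         # (a) time of the first transmission
    retransmitted: bool = False  # (b) set once the range is sent again


class RttSampler:
    """Hypothetical per-connection bookkeeping, keyed by start sequence."""

    def __init__(self) -> None:
        self._sent: Dict[int, SentRange] = {}

    def on_transmit(self, start_seq: int, now: float) -> None:
        rng = self._sent.get(start_seq)
        if rng is None:
            self._sent[start_seq] = SentRange(transmit_time=now)
        else:
            rng.retransmitted = True  # any repeat transmission taints the range

    def on_acked(self, start_seq: int, now: float) -> Optional[float]:
        """(c) Called when the range is cumulatively or selectively ACKed."""
        rng = self._sent.pop(start_seq, None)
        if rng is None or rng.retransmitted:
            return None  # ambiguous: the ACK may be for either transmission
        return now - rng.transmit_time


sampler = RttSampler()
sampler.on_transmit(1000, now=0.000)
sampler.on_transmit(2000, now=0.001)
sample = sampler.on_acked(2000, now=0.051)  # SACKed, never retransmitted
print(sample is not None)                   # True: a sample of about 50 ms
sampler.on_transmit(1000, now=0.200)        # retransmission of the hole
print(sampler.on_acked(1000, now=0.250))    # None
```

The retransmitted case is where, per the thread, timestamp options are used to disambiguate which transmission an ACK corresponds to.
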