Re: [tcpm] ACK aggregation and congestion window growth

Neal Cardwell <ncardwell@google.com> Tue, 30 April 2019 19:39 UTC

From: Neal Cardwell <ncardwell@google.com>
Date: Tue, 30 Apr 2019 15:38:54 -0400
Message-ID: <CADVnQym7qDBEffR71a04=MDjRf6HrsR8C=XiEqbU2FVRyPTWBA@mail.gmail.com>
To: Ingemar Johansson S <ingemar.s.johansson@ericsson.com>
Cc: "tcpm@ietf.org" <tcpm@ietf.org>, Yuchung Cheng <ycheng@google.com>, Eric Dumazet <edumazet@google.com>, Rick Jones <perfgeek=40mac.com@dmarc.ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/aWNuT08u5IkJOL3wXTf8GSQGN0c>
Subject: Re: [tcpm] ACK aggregation and congestion window growth

Hi Ingemar,

Thanks. We'll see if we can reproduce this and fix it based on our working
theory. We'll follow up on this thread if we end up posting a patch for
Linux TCP. If you do happen to collect a trace that shows this problem, we
would love to take a look.

thanks,
neal


On Tue, Apr 30, 2019 at 3:31 AM Ingemar Johansson S <
ingemar.s.johansson@ericsson.com> wrote:

> Hi
>
>
>
> Thanks for the hints. I will try to capture some Wireshark logs. It seems,
> though, that this issue is a bit transient, and admittedly it works more
> often than it does not. Perhaps this is a corner case?
>
>
>
> /Ingemar
>
>
>
> *From:* Neal Cardwell <ncardwell@google.com>
> *Sent:* 26 April 2019 16:27
> *To:* Ingemar Johansson S <ingemar.s.johansson@ericsson.com>
> *Cc:* tcpm@ietf.org; Yuchung Cheng <ycheng@google.com>; Eric Dumazet <
> edumazet@google.com>; Rick Jones <perfgeek=40mac.com@dmarc.ietf.org>
> *Subject:* Re: [tcpm] ACK aggregation and congestion window growth
>
>
>
> Hi Ingemar,
>
>
>
> Thanks for the report. From the description of the issue, I suspect there
> is some combination of the following:
>
>
>
> (1) This may be related to some buggy behavior in the Linux TCP logic for
> assessing whether a connection is cwnd-limited or application-limited. In
> the scenario you describe, with those high speeds and long gaps between
> ACKs, it sounds like the flow may have cwnd that is unused due to TSO
> deferral, and this may be causing the logic to erroneously mark the flow as
> not being cwnd-limited, which would prevent cwnd growth by CUBIC.
>
>
>
> (2) If there are entire flights of data that are ACKed with a single ACK,
> then the CUBIC code may assess the long gaps between transmits and ACKs as
> an idle period, and erroneously push its epoch_start forward to skip any
> cwnd growth that would have been slated for that period.
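The hypothesis in (1) can be illustrated with a toy model. This is only a sketch under stated assumptions, not the Linux implementation: the names `Flow`, `send_burst`, `on_ack`, and `run` are hypothetical, the `deferred` parameter stands in for packets held back by TSO deferral, and the "+1 packet per round" growth is a placeholder for whatever the congestion control would actually do.

```python
from dataclasses import dataclass


@dataclass
class Flow:
    cwnd: int            # congestion window, in packets
    in_flight: int = 0   # packets sent but not yet ACKed


def send_burst(flow: Flow, deferred: int) -> bool:
    """Send what cwnd allows, minus `deferred` packets held back.

    Returns True if the sender considers itself cwnd-limited,
    i.e. the data in flight actually fills the window.
    """
    can_send = max(flow.cwnd - flow.in_flight - deferred, 0)
    flow.in_flight += can_send
    return flow.in_flight >= flow.cwnd


def on_ack(flow: Flow, acked: int, cwnd_limited: bool) -> None:
    """Process one ACK; grow cwnd only if the flow was cwnd-limited."""
    flow.in_flight -= acked
    if cwnd_limited:
        flow.cwnd += 1   # placeholder growth step (assumption)


def run(rounds: int, deferred: int, cwnd: int = 10) -> int:
    """Simulate rounds in which a single ACK covers the whole flight."""
    flow = Flow(cwnd=cwnd)
    for _ in range(rounds):
        limited = send_burst(flow, deferred)
        on_ack(flow, flow.in_flight, limited)
    return flow.cwnd


print(run(5, deferred=0))  # window is filled each round, so cwnd grows
print(run(5, deferred=2))  # part of cwnd stays unused, so cwnd never grows
```

In this model, any round in which part of the window is left unused is classified as not cwnd-limited, so growth stops entirely even though the sender always has data to send.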
>
>
>
> (I would guess (1) is more likely, since the Reno emulation code path in
> CUBIC should not be affected by (2), so CUBIC should eventually grow cwnd
> in a Reno-style fashion whether or not it hits (2).)
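The effect described in (2) can be sketched with the CUBIC window function from RFC 8312, W_cubic(t) = C*(t - K)^3 + W_max. This is a simplified floating-point illustration, not the kernel code: the function names and the `shift_epoch_on_gap` flag are hypothetical, and it assumes that every RTT-long gap between a transmit and the single aggregated ACK is treated as idle time.

```python
C = 0.4      # CUBIC scaling constant (RFC 8312)
BETA = 0.7   # multiplicative decrease factor (RFC 8312)


def cubic_window(t: float, w_max: float) -> float:
    """W_cubic(t) in packets, with t in seconds since the epoch started."""
    k = (w_max * (1 - BETA) / C) ** (1 / 3)
    return C * (t - k) ** 3 + w_max


def simulate(rounds: int, rtt: float, w_max: float,
             shift_epoch_on_gap: bool) -> float:
    """Return the cubic target window after `rounds` round trips."""
    now = 0.0
    epoch_start = 0.0
    for _ in range(rounds):
        now += rtt                 # one ACK covers the flight, one RTT later
        if shift_epoch_on_gap:
            epoch_start += rtt     # gap misread as idle: epoch pushed forward
    return cubic_window(now - epoch_start, w_max)


stalled = simulate(500, rtt=0.018, w_max=100.0, shift_epoch_on_gap=True)
grown = simulate(500, rtt=0.018, w_max=100.0, shift_epoch_on_gap=False)
print(stalled, grown)
```

With the epoch pushed forward every round, the elapsed time on the cubic curve stays at zero, so the target never rises above BETA * W_max. The sketch models only the cubic curve; as noted above, the Reno emulation path is expected to keep growing cwnd regardless.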
>
>
>
> We can provide some proposed patches for those issues. Would you be able
> to apply the patches and test them in your workload?  If so, what exact
> kernel version would the patches need to be generated for?
>
>
>
> Also, would you be able to post (suitably anonymized, heads-only) tcpdump
> packet traces so that we can see what the exact scenario is? This would be
> particularly useful if you are unable to apply the patches to verify the
> fix.
>
>
>
> thanks,
>
> neal
>
> On Fri, Apr 26, 2019 at 10:03 AM Rick Jones <perfgeek=
> 40mac.com@dmarc.ietf.org> wrote:
>
> Does your ACK aggregator also delay the ACKs of SYNchronize segments? If
> not, perhaps the congestion control sees the increase (?) in round trip
> time after connection establishment as a signal of congestion and behaves
> accordingly.
>
>
> On Apr 26, 2019, at 06:46, Ingemar Johansson S <
> ingemar.s.johansson@ericsson.com> wrote:
>
> Hi
>
>
>
> I am experimenting with a simple test setup with 2 Ubuntu 18.04 PCs (Cubic
> CC).
>
> Between these two I have a simple setup that aggregates ACKs so that they
> are forwarded to the server only every 10ms. The min RTT is 18ms.
>
> The problem I see is that even though I should reach 1Gbps throughput, I
> only get around 200Mbps.
>
> I would expect the congestion window to eventually increase enough to
> handle the burstiness caused by the ACK aggregation, but this does not
> seem to be the case.
>
> Is there a limitation in the Linux stack that prevents congestion window
> growth when ACKs arrive in bursts like this?
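For scale, the window sizes implied by the numbers above can be worked out as follows. This is a back-of-the-envelope sketch: it assumes the aggregator adds up to 10 ms on top of the 18 ms min RTT, which the report does not state explicitly, and `window_bytes` is a hypothetical helper.

```python
def window_bytes(rate_mbps: int, rtt_ms: int) -> int:
    """Bytes that must be in flight to sustain rate_mbps over rtt_ms."""
    # Mbit/s * ms = kbit; * 1000 gives bits; // 8 gives bytes.
    return rate_mbps * rtt_ms * 1000 // 8


MIN_RTT_MS = 18        # from the report
ACK_INTERVAL_MS = 10   # from the report

bdp = window_bytes(1000, MIN_RTT_MS)
with_aggregation = window_bytes(1000, MIN_RTT_MS + ACK_INTERVAL_MS)  # assumed worst case
observed = window_bytes(200, MIN_RTT_MS)

print(bdp, with_aggregation, observed)
```

Under these assumptions, 1 Gbps needs about 2.25 MB in flight at the min RTT and about 3.5 MB if ACKs are held for the full 10 ms, whereas 200 Mbps corresponds to roughly 450 KB at the min RTT.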
>
>
>
> /Ingemar
>
> ==================================
>
> Ingemar Johansson  M.Sc.
>
> Master Researcher
>
>
>
> Ericsson Research
>
> Network Protocols & E2E Performance
>
> Labratoriegränd 11
>
> 971 28, Luleå, Sweden
>
> Phone +46-1071 43042
>
> SMS/MMS +46-73 078 3289
>
> ingemar.s.johansson@ericsson.com
>
> www.ericsson.com
>
>
>
>                 Nothing to stop this
>
>              being the best day ever
>
>                             U2
>
> ==================================
>
>
>
> _______________________________________________
> tcpm mailing list
> tcpm@ietf.org
> https://www.ietf.org/mailman/listinfo/tcpm
>
>