Re: [ippm] [tsvwg] [iccrg] New Internet Draft: Congestion Signaling (CSIG)

Neal Cardwell <ncardwell@google.com> Wed, 13 September 2023 01:51 UTC

References: <92a6a6b54105447db6998d15961b1f8e@huawei.com> <2cc3f954aa2447dcb475f2a630841859@huawei.com> <2F15B386-EFF2-4637-8A3D-AF3CDD61114D@apple.com>
In-Reply-To: <2F15B386-EFF2-4637-8A3D-AF3CDD61114D@apple.com>
From: Neal Cardwell <ncardwell@google.com>
Date: Tue, 12 Sep 2023 21:50:38 -0400
Message-ID: <CADVnQynjcK-eZFBj_RNnnNRm0MgA4rpvL-e9W5idHBe=VXjzHA@mail.gmail.com>
To: Vidhi Goel <vidhi_goel=40apple.com@dmarc.ietf.org>
Cc: "Shihang(Vincent)" <shihang9=40huawei.com@dmarc.ietf.org>, "Huangyihong (Rachel)" <rachel.huang=40huawei.com@dmarc.ietf.org>, Tom Herbert <tom=40herbertland.com@dmarc.ietf.org>, Abhiram Ravi <abhiramr=40google.com@dmarc.ietf.org>, IETF IPPM WG <ippm@ietf.org>, tsvwg <tsvwg@ietf.org>, "ccwg@ietf.org" <ccwg@ietf.org>, "iccrg@irtf.org" <iccrg@irtf.org>, Nandita Dukkipati <nanditad@google.com>, Naoshad Mehta <naoshad@google.com>, Jai Kumar <jai.kumar@broadcom.com>
Content-Type: multipart/alternative; boundary="0000000000004a1fb6060533ca38"
Archived-At: <https://mailarchive.ietf.org/arch/msg/ippm/Y-YjayBgbR5_sbu9VWY9L8itOmE>

On Tue, Sep 12, 2023 at 7:59 PM Vidhi Goel <vidhi_goel=40apple.com@dmarc.ietf.org> wrote:

> Not sure why we are coming up with so many new techniques when ECN just
> works fine.
> ECN is a 2 bit field (not 1 bit) and seems to be sufficient to indicate
> extent of congestion by marking it per packet. Adding more complexity to
> any layer whether it is L2 or L3 doesn’t work well in deployments. Our goal
> should be to simplify things and only add new headers if absolutely
> necessary.
>

ECN is great. But it's only Explicit Congestion Notification. It only
notifies you when there's congestion. When the congestion goes away, the
signal goes away, and flows don't know whether the bottleneck link is 0.1%
utilized or 99.9% utilized.

So with only ECN and a queue-free bottleneck, flows must either (a) be
willing to increase cwnd quickly and run the risk of overshooting and
causing massive delay/loss (as with CUBIC and its ceiling of 1.5x per round
trip cwnd growth), or (b) increase very slowly to avoid significant
overshoot (as with Reno and its 1 MSS per round trip cwnd growth).
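
As a rough illustration of the trade-off above, the following sketch counts the round trips each growth rule needs to reach a given cwnd, and how far past it the sender lands. It is not taken from any real stack: the function names and the start and capacity values are hypothetical, and real CUBIC follows its cubic window function and only grows at the 1.5x cap at its fastest.

```python
def rounds_to_reach(cwnd, target, grow):
    """Count round trips until cwnd (in MSS) first reaches target.

    Returns (rounds, final_cwnd); final_cwnd - target is the overshoot.
    """
    rounds = 0
    while cwnd < target:
        cwnd = grow(cwnd)
        rounds += 1
    return rounds, cwnd


def additive(cwnd):
    # Reno-style congestion avoidance: +1 MSS per round trip.
    return cwnd + 1


def capped_multiplicative(cwnd):
    # Growth at the 1.5x-per-round-trip ceiling mentioned for CUBIC.
    # This is the fastest case, not CUBIC's typical behavior.
    return cwnd * 1.5


if __name__ == "__main__":
    start, capacity = 10, 1000  # hypothetical values, in MSS
    rules = [("additive", additive), ("1.5x ceiling", capped_multiplicative)]
    for name, grow in rules:
        rounds, final = rounds_to_reach(start, capacity, grow)
        print(f"{name}: {rounds} round trips, "
              f"overshoot {final - capacity:.1f} MSS")
```

With these hypothetical numbers, additive growth takes 990 round trips and lands exactly on the target, while growth at the 1.5x ceiling takes 12 round trips and ends roughly 297 MSS past it.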

With a signal like CSIG, which provides bandwidth utilization information,
a CC can increase its sending rate as a function of the available
bandwidth, increasing quickly only when the bottleneck has low
utilization. This allows flows to fill underutilized links more quickly,
with lower risk of queuing and loss damage from overshoot.
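
One way to picture "increase as a function of the available bandwidth" is the sketch below. It is only an assumption-laden illustration, not the algorithm from the CSIG draft or from any deployed CC: the function name, the target utilization of 0.9, and the 2x growth cap are all hypothetical, and it covers only the increase side (backing off on congestion is left to ECN or loss).

```python
def next_cwnd(cwnd, utilization, target=0.9, max_factor=2.0):
    """Hypothetical per-round-trip cwnd increase driven by link utilization.

    cwnd        -- current congestion window, in MSS
    utilization -- reported bottleneck utilization, 0.0 (idle) to 1.0 (full)
    target      -- utilization the sender aims for (assumed value)
    max_factor  -- cap on multiplicative growth per round trip (assumed value)
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    # Scale toward the target utilization, bounded by max_factor.
    # max() guards against division by zero on an idle link.
    factor = min(max_factor, target / max(utilization, 1e-9))
    # Never grow by less than 1 MSS per round trip, so that a nearly
    # full link is still probed cautiously, Reno-style.
    return max(cwnd + 1, cwnd * factor)


if __name__ == "__main__":
    for u in (0.001, 0.45, 0.6, 0.95):
        print(f"utilization {u}: cwnd 100 -> {next_cwnd(100, u):.1f}")
```

The design choice here is that the growth factor shrinks as the reported utilization approaches the target, so fast growth happens only when the link reports plenty of headroom.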

best regards,
neal




> Vidhi
>
> On Sep 12, 2023, at 3:12 AM, Shihang(Vincent) <shihang9=40huawei.com@dmarc.ietf.org> wrote:
>
> Hi,
> I agree that L2 may not be the best choice for carrying congestion
> signaling end-to-end, and that more bits are needed. We have submitted a
> draft that carries multi-bit congestion signaling in L3. We call it
> Advanced ECN. See
> https://datatracker.ietf.org/doc/draft-shi-ccwg-advanced-ecn/.
>
> Thanks,
> Hang
>
> *From:* CCWG <ccwg-bounces@ietf.org> *On Behalf Of *Huangyihong (Rachel)
> *Sent:* Tuesday, September 12, 2023 5:41 PM
> *To:* Tom Herbert <tom=40herbertland.com@dmarc.ietf.org>; Abhiram Ravi
> <abhiramr=40google.com@dmarc.ietf.org>
> *Cc:* IETF IPPM WG <ippm@ietf.org>; tsvwg <tsvwg@ietf.org>; ccwg@ietf.org;
> iccrg@irtf.org; Nandita Dukkipati <nanditad@google.com>; Naoshad Mehta <
> naoshad@google.com>; Jai Kumar <jai.kumar@broadcom.com>
> *Subject:* Re: [CCWG] [iccrg] [tsvwg] New Internet Draft: Congestion
> Signaling (CSIG)
>
> Hi,
>
> I have the same feeling. An L2 implementation may be difficult to use for
> e2e transport. Of course, it can work well in a limited domain, such as DC
> or HPC clusters. However, I am also looking for solutions that can
> traverse the Internet. We have submitted a draft describing the transport
> challenges:
> https://datatracker.ietf.org/doc/html/draft-huang-tsvwg-transport-challenges
>
> I share the opinion that the congestion signal is useful and that the
> current 1-bit ECN solution is not fully sufficient. But I also feel that
> the more straightforward way is to extend L3 or L4, for example by
> updating IOAM, to carry the information. An L2 solution should be
> developed together with IEEE 802.1.
>
> BR,
> Rachel
>
> *From:* iccrg <iccrg-bounces@irtf.org> *On Behalf Of *Tom Herbert
> *Sent:* September 10, 2023 0:10
> *To:* Abhiram Ravi <abhiramr=40google.com@dmarc.ietf.org>
> *Cc:* IETF IPPM WG <ippm@ietf.org>; tsvwg <tsvwg@ietf.org>; ccwg@ietf.org;
> iccrg@irtf.org; Nandita Dukkipati <nanditad@google.com>; Naoshad Mehta <
> naoshad@google.com>; Jai Kumar <jai.kumar@broadcom.com>
> *Subject:* Re: [iccrg] [tsvwg] New Internet Draft: Congestion Signaling (CSIG)
>
> Hi, thanks for the draft!
>
> The first thing that stands out to me is the carrier of the new packet
> headers. In the forward path it would be in L2 and in reflection it would
> be L4. As the draft describes, this would entail having to support the
> protocol in multiple L2 and multiple L4 protocols-- that's going to be a
> pretty big lift! Also, L2 is not really an end-to-end protocol (would
> legacy switches in the path also forward the header?).
>
> The signaling being described in the draft is network layer information,
> and hence IMO should be conveyed in network layer headers. That is L3,
> which conveniently is the average of L2+L4 :-)
>
> IMO, the proper carrier of the signal data is Hop-by-Hop Options. This is
> end-to-end and allows modification of data in-flight. The typical concern
> with Hop-by-Hop Options is high drop rates on the Internet, however in this
> case the protocol is explicitly confined to a limited domain so I don't see
> that as a blocking issue for this use case.
>
> The information being carried seems very similar to that of IOAM (IOAM
> uses Hop-by-Hop Options and supports reflection). I suppose the differences
> are that this protocol is meant to be consumed by the transport Layer and
> the data is a condensed summary of path characteristics. IOAM seems pretty
> extensible, so maybe it could be adapted to carry the signals of this draft?
>
> A related proposal might be FAST (draft-herbert-fast). Where CSIG is
> network-to-host signaling, FAST is host-to-network signaling for the
> purpose of requesting network services. These might be complementary, and
> options for both may be in the same packet. FAST also uses reflection, so
> we might be able to leverage some common implementation at a destination.
>
> Tom
>
> On Fri, Sep 8, 2023, 7:43 PM Abhiram Ravi <abhiramr=40google.com@dmarc.ietf.org> wrote:
>
> Hi IPPM folks,
>
> I am pleased to announce the publication of a new internet draft,
> Congestion Signaling (CSIG):
> https://datatracker.ietf.org/doc/draft-ravi-ippm-csig/
>
> CSIG is a new end-to-end packet header mechanism for in-band signaling
> that is simple, efficient, deployable, and grounded in concrete use cases
> of congestion control, traffic management, and network debuggability. We
> believe that CSIG is an important new protocol that builds on top of
> existing in-band network telemetry protocols.
>
> We encourage you to read the CSIG draft and provide your feedback and
> comments. We have also cc'd the TSVWG, CCWG, and ICCRG mailing lists, as we
> believe that this work may be of interest to their members as well.
>
> Thank you for your time and consideration.
>
> Sincerely,
> Abhiram Ravi
> On behalf of the CSIG authors
>
>
>