Re: [iccrg] New draft submitted for draft-pan-tsvwg-hpccplus-02.txt

Yuchung Cheng <ycheng@google.com> Wed, 16 December 2020 04:44 UTC

From: Yuchung Cheng <ycheng@google.com>
Date: Tue, 15 Dec 2020 20:44:13 -0800
Message-ID: <CAK6E8=fZhM9bFBXGpgUKj67ksCc3O-VfSuvUeWgafTCdPYSHmA@mail.gmail.com>
To: "Pan, Rong" <rong.pan@intel.com>
Cc: Barak Gafni <gbarak@nvidia.com>, NBU-Contact-Rui Miao <miao.rui@alibaba-inc.com>, iccrg <iccrg@irtf.org>, NBU-Contact-Harry Liu <hongqiang.liu@alibaba-inc.com>, "jri.ietf" <jri.ietf@gmail.com>, "Lee, Jeongkeun" <jk.lee@intel.com>, Barak Gafni <gbarak@mellanox.com>, Yuval Shpigelman <yuvals@mellanox.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/iccrg/-65FIhX1cIxj7N-K3SSwREtX_Pk>
Subject: Re: [iccrg] New draft submitted for draft-pan-tsvwg-hpccplus-02.txt

Hi Rong,

I actually think ingress vs. egress won't matter as much as qlen-based vs.
sojourn-time-based telemetry. I have not read the HPCC paper yet (apologies
if this is covered there), but I am curious about its performance when
multiple QoS classes are present:

Let's say a port has two queues for high (H) and low (L) priority packets
separately.

A low-prio packet arrives at an (empty) L. It waits a long time because H
is extremely busy. So the qlen, whether metered at ingress or egress, is 0
for this packet, but its sojourn (actual queuing) time is large. The degree
depends on the switch's priority scheduling policy, of course. So the packet
does experience congestion, but that congestion is not visible in the
qlen-based INT metric.
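
To make the H/L scenario concrete, here is a minimal sketch in Python. It is
a toy model I made up for illustration, not anything taken from the draft or
from a real switch: one port, two queues, strict priority, and a fixed
service time of one time unit per packet. The names (Packet, simulate,
qlen_at_enqueue, qlen_at_dequeue, sojourn) are all hypothetical. qlen is
counted per queue, and sojourn is the time from arrival until the packet
starts transmission.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Packet:
    prio: str                 # "H" or "L"
    arrival: int              # arrival time, in service-time units
    qlen_at_enqueue: int = 0  # packets already in this packet's own queue on arrival
    qlen_at_dequeue: int = 0  # packets left in its own queue when it departs
    sojourn: int = 0          # time spent waiting before transmission starts


def simulate(arrivals, service_time=1):
    """Toy strict-priority port. `arrivals` is a list of (time, prio) tuples."""
    queues = {"H": deque(), "L": deque()}
    pending = deque(sorted(arrivals))
    done = []
    now = 0
    while pending or queues["H"] or queues["L"]:
        # Admit every packet that has arrived by `now`.
        while pending and pending[0][0] <= now:
            t, prio = pending.popleft()
            q = queues[prio]
            q.append(Packet(prio, t, qlen_at_enqueue=len(q)))
        # Strict priority: serve H whenever it is non-empty, otherwise L.
        q = queues["H"] or queues["L"]
        if not q:
            now = pending[0][0]  # port is idle, jump to the next arrival
            continue
        pkt = q.popleft()
        pkt.qlen_at_dequeue = len(q)
        pkt.sojourn = now - pkt.arrival
        done.append(pkt)
        now += service_time
    return done


# H receives one packet per time unit for 10 units; one L packet arrives at t=0.
arrivals = [(t, "H") for t in range(10)] + [(0, "L")]
done = simulate(arrivals)
low = next(p for p in done if p.prio == "L")
print(low.qlen_at_enqueue, low.qlen_at_dequeue, low.sojourn)
```

In this run the L packet sees a per-queue qlen of 0 both when it is enqueued
and when it is dequeued, yet it waits 10 service times behind H traffic. A
different scheduling policy (e.g. weighted rather than strict priority)
would change the size of the gap.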


On Tue, Dec 15, 2020 at 6:21 PM Pan, Rong <rong.pan@intel.com> wrote:

> Yuchung,
>
>
>
> Thanks for reviewing the draft. Good point about where/how to measure the
> queue length.  We will look into adding a paragraph to describe the
> difference between ingress or egress-based qlen or whether they can be
> mixed. From your experience, what issues do you think we need to look out
> for?
>
>
>
> Best,
>
>
>
> Rong
>
>
>
> *From: *Yuchung Cheng <ycheng@google.com>
> *Date: *Tuesday, December 15, 2020 at 3:17 PM
> *To: *Barak Gafni <gbarak@nvidia.com>
> *Cc: *NBU-Contact-Rui Miao <miao.rui@alibaba-inc.com>, iccrg <
> iccrg@irtf.org>, "Pan, Rong" <rong.pan@intel.com>, NBU-Contact-Harry Liu <
> hongqiang.liu@alibaba-inc.com>, "jri.ietf" <jri.ietf@gmail.com>, "Lee,
> Jeongkeun" <jk.lee@intel.com>, Barak Gafni <gbarak@mellanox.com>, Yuval
> Shpigelman <yuvals@mellanox.com>
> *Subject: *Re: [iccrg] New draft submitted for
> draft-pan-tsvwg-hpccplus-02.txt
>
>
>
>
>
>
>
> On Tue, Dec 15, 2020 at 3:03 PM Barak Gafni <gbarak@nvidia.com> wrote:
>
> Hi,
>
> Thanks for the interest in this work. To answer your question: at this point
> the draft has been kept open as to which exact inband telemetry technology
> is used to implement the algorithm. The idea was to enable a variety of
> implementations. With that, one option we are focusing on is IOAM, which is
> being worked on in the IPPM WG and has a data draft specifying formats for
> the communication of these metrics. Alongside this main data draft, there is
> another draft, in its initial stages, that adds a few more fields, which may
> be used for HPCC++.
>
> You are welcome to look here:
>
> https://tools.ietf.org/html/draft-ietf-ippm-ioam-data-11
>
> https://tools.ietf.org/html/draft-gafni-ippm-ioam-additional-data-fields-00
>
> Actually the qlen is very specifically defined:)
>
> https://tools.ietf.org/html/draft-ietf-ippm-ioam-data-11#section-5.4.2.7
>
>
>
> I understand and agree with the intention to keep telemetry options more
> flexible (to get wider HW support). A paragraph explaining the key
> properties or requirements these metrics must meet to achieve a precise link
> load estimate would provide more guidance. For example, the qlen defined in
> the ippm draft is the "queue length at departure time". Will the algorithm
> work the same if qlen is metered at ingress (say some HW can't do egress for
> some reason)? What if there is a hybrid mix of different qlen measurements
> on the path?
>
>
>
>
>
>       includes link load (txBytes, qlen, ts) and link spec (switch_ID,
>       port_ID, B) at the egress port.  Note, each switch should record
>       all those information at the single snapshot to achieve a precise
>       link load estimate."
>
>
>
>
>
>
>
> Any further feedback is welcome.
>
>
>
> Thanks,
>
> Barak
>
>
>
> *From:* Yuchung Cheng <ycheng@google.com>
> *Sent:* Tuesday, December 15, 2020 2:35 PM
> *To:* NBU-Contact-Rui Miao <miao.rui@alibaba-inc.com>
> *Cc:* iccrg <iccrg@irtf.org>; Pan, Rong <rong.pan@intel.com>;
> NBU-Contact-Harry Liu <hongqiang.liu@alibaba-inc.com>; jri.ietf <
> jri.ietf@gmail.com>; Lee, Jeongkeun <jk.lee@intel.com>; Barak Gafni <
> gbarak@mellanox.com>; Yuval Shpigelman <yuvals@mellanox.com>
> *Subject:* Re: [iccrg] New draft submitted for
> draft-pan-tsvwg-hpccplus-02.txt
>
>
>
> Interesting work!
>
>
>
> It'd be good to know more precise requirements on INT, to help both vendor
> support (besides MLX) and CC evaluation.
>
>
>
> For example
>
> qlen         | Telemetry info: link j queue length
>
>
>
> qlen == instantaneous qlen snapshot at packet ingress or egress? On a per-port-per-queue basis? Or some windowed avg / aggregate, etc.?
>
>
>
>
>
> On Mon, Dec 14, 2020 at 4:11 PM Rui, Miao <miao.rui@alibaba-inc.com>
> wrote:
>
> Hello ICCRG members,
>
>
>
> Alibaba, Intel, and Mellanox have worked on an INT-based High Precision
> Congestion Control algorithm: HPCC++. We have posted an initial draft that
> can be found at
> https://www.ietf.org/id/draft-pan-tsvwg-hpccplus-02.txt
>
>
>
> The key design choice of HPCC++ is to use inband telemetry to provide
> fine-grained load information, such as queue size and accumulated tx
> traffic to compute precise flow rates. This has two major benefits:
>
> 1. HPCC++ can quickly converge to proper flow rates to highly utilize
> bandwidth while avoiding congestion;
>
> 2. HPCC++ can consistently maintain a close-to-zero queue for low latency.
>
>
>
> We would love to hear your comments and feedback.
>
> Best regards,
>
> Rui Miao
>
> _______________________________________________
> iccrg mailing list
> iccrg@irtf.org
> https://www.irtf.org/mailman/listinfo/iccrg
>
>