Re: [iccrg] New draft submitted for draft-pan-tsvwg-hpccplus-02.txt

Yuchung Cheng <ycheng@google.com> Tue, 15 December 2020 23:16 UTC

From: Yuchung Cheng <ycheng@google.com>
Date: Tue, 15 Dec 2020 15:16:18 -0800
Message-ID: <CAK6E8=cwwxg4zkHD2EJcCki5ydmhZEg8r5N=Jf6twH9OZ0qwJg@mail.gmail.com>
To: Barak Gafni <gbarak@nvidia.com>
Cc: NBU-Contact-Rui Miao <miao.rui@alibaba-inc.com>, iccrg <iccrg@irtf.org>, "Pan, Rong" <rong.pan@intel.com>, NBU-Contact-Harry Liu <hongqiang.liu@alibaba-inc.com>, "jri.ietf" <jri.ietf@gmail.com>, "Lee, Jeongkeun" <jk.lee@intel.com>, Barak Gafni <gbarak@mellanox.com>, Yuval Shpigelman <yuvals@mellanox.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/iccrg/7iC-uPIlxPNVHfBukTRj4hY2zvY>
Subject: Re: [iccrg] New draft submitted for draft-pan-tsvwg-hpccplus-02.txt

On Tue, Dec 15, 2020 at 3:03 PM Barak Gafni <gbarak@nvidia.com> wrote:

> Hi,
>
> Thanks for the interest in this work. To answer your question: at this
> point the draft has been kept open as to the exact inband telemetry
> technology used to implement the algorithm. The idea was to enable a
> variety of implementations. That said, one option we are focusing on is
> IOAM, which is being worked on in the IPPM WG and has a data draft
> specifying formats for the communication of these metrics. Alongside this
> main data draft, there is another draft in its initial stages that adds a
> few more fields, which may be used for HPCC++.
>
> You are welcome to look here:
>
> https://tools.ietf.org/html/draft-ietf-ippm-ioam-data-11
>
> https://tools.ietf.org/html/draft-gafni-ippm-ioam-additional-data-fields-00
>
Actually the qlen is very specifically defined:)
https://tools.ietf.org/html/draft-ietf-ippm-ioam-data-11#section-5.4.2.7

I understand and agree with the intention to keep the telemetry options
flexible (to get wider HW support). A paragraph explaining the key
properties or requirements these metrics must meet to achieve a precise link
load estimate would provide more guidance. For example, the qlen defined in
the ippm draft is the "queue length at departure time". Will the algorithm
work the same if qlen is metered at ingress (say some HW can't do egress for
some reason)? What if there is a hybrid mix of different qlen measurements
on the path?


      includes link load (txBytes, qlen, ts) and link spec (switch_ID,
      port_ID, B) at the egress port.  Note, each switch should record
      all those information at the single snapshot to achieve a precise
      link load estimate."
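
To make the concern concrete, here is a rough sketch of how the fields in
that excerpt (txBytes, qlen, ts, B) could combine into a per-link load
estimate. It is not taken from the draft: the formula follows the general
shape of the utilization estimate in the published HPCC work as I recall it,
and the names HopSample, link_utilization and base_rtt are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HopSample:
    """One INT record from a switch egress port (field names are assumptions)."""
    tx_bytes: int      # cumulative bytes transmitted on the port
    qlen: int          # queue length in bytes at the snapshot
    ts: float          # switch timestamp in seconds
    bandwidth: float   # link capacity B in bytes per second


def link_utilization(prev: HopSample, cur: HopSample, base_rtt: float) -> float:
    """Estimate normalized link load from two consecutive samples of one hop.

    Assumed form: U = min(qlen_prev, qlen_cur) / (B * T) + txRate / B,
    where T is the base RTT and txRate is derived from the tx_bytes delta.
    """
    dt = cur.ts - prev.ts
    if dt <= 0:
        raise ValueError("samples must have increasing timestamps")
    tx_rate = (cur.tx_bytes - prev.tx_bytes) / dt
    queue = min(prev.qlen, cur.qlen)
    return queue / (cur.bandwidth * base_rtt) + tx_rate / cur.bandwidth


# Same traffic on the same hop, but two different qlen readings, as might
# happen if one device meters the queue at egress and another at ingress.
B = 1e9          # bytes per second
T = 1e-5         # base RTT in seconds
first = HopSample(tx_bytes=0, qlen=0, ts=0.0, bandwidth=B)
empty_queue = HopSample(tx_bytes=500_000, qlen=0, ts=0.001, bandwidth=B)

first_q = HopSample(tx_bytes=0, qlen=10_000, ts=0.0, bandwidth=B)
standing_queue = HopSample(tx_bytes=500_000, qlen=10_000, ts=0.001, bandwidth=B)

u_empty = link_utilization(first, empty_queue, T)
u_queue = link_utilization(first_q, standing_queue, T)
```

With identical txBytes, the two qlen readings give very different load
estimates, so the sender would react differently depending on where and when
each hop samples its queue. That is why the metering point seems worth
specifying.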




>
>
> Any further feedback is welcome.
>
>
>
> Thanks,
>
> Barak
>
>
>
> *From:* Yuchung Cheng <ycheng@google.com>
> *Sent:* Tuesday, December 15, 2020 2:35 PM
> *To:* NBU-Contact-Rui Miao <miao.rui@alibaba-inc.com>
> *Cc:* iccrg <iccrg@irtf.org>; Pan, Rong <rong.pan@intel.com>;
> NBU-Contact-Harry Liu <hongqiang.liu@alibaba-inc.com>; jri.ietf <
> jri.ietf@gmail.com>; Lee, Jeongkeun <jk.lee@intel.com>; Barak Gafni <
> gbarak@mellanox.com>; Yuval Shpigelman <yuvals@mellanox.com>
> *Subject:* Re: [iccrg] New draft submitted for
> draft-pan-tsvwg-hpccplus-02.txt
>
>
>
> Interesting work!
>
>
>
> It'd be good to know more precise requirements on INT, to help both vendor
> support (besides MLX) and CC evaluation.
>
>
>
> For example
>
> qlen         | Telemetry info: link j queue length
>
>
>
> qlen == instant qlen snapshot at packet ingress or egress, on a per-port-per-queue basis, or some windowed-avg / aggregate etc.
>
>
>
>
>
> On Mon, Dec 14, 2020 at 4:11 PM Rui, Miao <miao.rui@alibaba-inc.com>
> wrote:
>
> Hello ICCRG members,
>
>
>
> Alibaba, Intel, and Mellanox have worked on an INT-based High Precision
> Congestion Control algorithm: HPCC++. We have posted an initial draft that
> can be found at
> https://www.ietf.org/id/draft-pan-tsvwg-hpccplus-02.txt
>
>
>
> The key design choice of HPCC++ is to use inband telemetry to provide
> fine-grained load information, such as queue size and accumulated tx
> traffic to compute precise flow rates. This has two major benefits:
>
> 1. HPCC++ can quickly converge to proper flow rates to highly utilize
> bandwidth while avoiding congestion;
>
> 2. HPCC++ can consistently maintain a close-to-zero queue for low latency.
>
>
>
> We would love to hear your comments and feedback.
>
> Best regards,
>
> Rui Miao
>
> _______________________________________________
> iccrg mailing list
> iccrg@irtf.org
> https://www.irtf.org/mailman/listinfo/iccrg
>
>