Re: [Rdma-cc-interest] 答复: Side meeting plans at IETF-106

"Black, David" <David.Black@dell.com> Wed, 13 November 2019 00:40 UTC

From: "Black, David" <David.Black@dell.com>
To: "Zhuangyan (Yan)" <zhuangyan.zhuang@huawei.com>, Lars Eggert <lars@eggert.org>, Paul Congdon <paul.congdon@tallac.com>
CC: "rdma-cc-interest@ietf.org" <rdma-cc-interest@ietf.org>, "Black, David" <David.Black@dell.com>
Thread-Topic: [Rdma-cc-interest] 答复: Side meeting plans at IETF-106
Thread-Index: AQHVmbDfvRgfDVjBrkO6ahCRXZA7yaeIOSpA
Date: Wed, 13 Nov 2019 00:39:59 +0000
Message-ID: <MN2PR19MB4045F5565E6F210A0130FB8583760@MN2PR19MB4045.namprd19.prod.outlook.com>
References: <CAAMqZPu6g56PotHQJcn6vvoex3=EPomCTgrmMm8jo3ozehG-WQ@mail.gmail.com>, <1605A4E1-7C7C-4BBD-BE35-960730A678D0@eggert.org> <326e89210c104ec6856152d9a76553fb@huawei.com>
In-Reply-To: <326e89210c104ec6856152d9a76553fb@huawei.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/rdma-cc-interest/fvthL7a8x3BGe7QMx_hiOX5Scts>
Subject: Re: [Rdma-cc-interest] 答复: Side meeting plans at IETF-106
X-BeenThere: rdma-cc-interest@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Congestion Control for Large Scale HPC/RDMA Data Centers <rdma-cc-interest.ietf.org>

A few follow-up comments.

>>> 2.1 AI ECN
>>>
>>> Discuss feedback on https://datatracker.ietf.org/doc/draft-zhuang-tsvwg-ai-ecn-for-dcn/.
>> I don't see a proposal here, I don't even see a concrete problem statement? This is yet another "let's throw AI at it" three-pager.

[… snip …]

> The problem is stated in the background section, though it might not be obvious: “As stated in [RFC7567], with proper parameters,
> RED can be an effective algorithm.  However, dynamically predicting the set of parameters (minimum threshold and maximum threshold)
> is difficult.” Dynamic configuration of the thresholds is therefore a problem for network configuration.

That sounds like a straw-person demolition exercise, as RFC 7567 also states:

   This memo also explicitly obsoletes the recommendation that Random
   Early Detection (RED) be used as the default AQM mechanism for the
   Internet.  This is replaced by a detailed set of recommendations for
   selecting an appropriate AQM algorithm.

The upshot is that RED is not a good AQM algorithm to use as a comparison baseline - I hope that the results to be presented will make comparisons to more recent AQM algorithms, e.g., FQ-CoDel [RFC8290].
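
For readers less familiar with the two thresholds named in the quoted text, here is a minimal sketch of the classic RED marking curve. It is not taken from either draft or from RFC 7567. The function name and the max_p default are illustrative assumptions, and a real implementation would also maintain a moving average of the queue length, which is omitted here.

```python
def red_mark_probability(avg_queue, min_th, max_th, max_p=0.1):
    """Classic (non-gentle) RED marking/drop probability.

    avg_queue: averaged queue length (same unit as the thresholds)
    min_th:    minimum threshold, below which nothing is marked
    max_th:    maximum threshold, at or above which everything is marked
    max_p:     probability reached just below max_th (assumed default)
    """
    if min_th >= max_th:
        raise ValueError("min_th must be below max_th")
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    # Linear ramp from 0 at min_th up to max_p at max_th.
    return max_p * (avg_queue - min_th) / (max_th - min_th)


if __name__ == "__main__":
    for q in (5, 20, 30):
        print(q, red_mark_probability(q, min_th=10, max_th=30))
```

The curve itself is simple; the difficulty described in the quoted text lies in choosing min_th and max_th as traffic changes.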

>>> 2.3 Mixing RDMA and TCP traffic
>>>
>>> These two traffic types with their differing congestion controllers are known to not play well with one another in the same traffic class.

Using protocol names to denote congestion control classes does not work well, even though it’s common (and I’ve done it myself).

We are dealing with two classes of congestion controls.  For lack of better terms, the class names below are based on what the transport protocol throughput is proportional to, where ‘p’ is the loss and/or congestion marking probability:
  - 1/sqrt(p)-class congestion controls: Includes most existing TCP congestion control algorithms, e.g., NewReno, CUBIC.
  - 1/p-class congestion controls: Includes DCTCP congestion control.
Keep in mind that p is a probability that is usually << 1 when expressed as a decimal, e.g., p=0.01 represents a 1% loss/marking rate.
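
To make the difference between the two classes concrete, here is a small sketch that compares how relative throughput scales with p. The constants of proportionality are set to 1 as a simplifying assumption, and the function names are illustrative, so only the ratios are meaningful, not the absolute values.

```python
import math


def rate_inv_sqrt_p(p, c=1.0):
    """Relative throughput of a 1/sqrt(p)-class control (c is an assumed constant)."""
    return c / math.sqrt(p)


def rate_inv_p(p, c=1.0):
    """Relative throughput of a 1/p-class control (c is an assumed constant)."""
    return c / p


def equal_rate_p(p_linear):
    """Marking probability a 1/sqrt(p)-class flow would need in order to match
    a 1/p-class flow seeing p_linear, assuming unit constants for both."""
    return p_linear ** 2


if __name__ == "__main__":
    for p in (0.1, 0.01, 0.001):
        print(f"p={p}: 1/sqrt(p) -> {rate_inv_sqrt_p(p):.1f}, 1/p -> {rate_inv_p(p):.1f}")
```

Under these assumptions, at the same p the 1/p-class flow gets 1/sqrt(p) times the throughput of the other class, so the gap widens as p shrinks. This is one way to see why the two classes do not share a queue evenly when both receive the same marking signal.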
>> When you say RDMA, you mean RoCE? Separate RoCE into a slice and move on. It's pointless to try and optimize for coexistence with a protocol that can change willy-nilly.
> [Y] Yes, it means RoCE. If the network does not differentiate RoCE and TCP traffic, then they will compete anyway… L4S may be doing similar work on classic TCP vs. DCTCP.

DCTCP would be a better protocol to focus on than RoCE, as both DCTCP congestion control and the DCQCN congestion control commonly used for RoCE are 1/p-class congestion controls.

TSVWG will be discussing L4S and SCE next week – both of those proposals are intended to enable coexistence of 1/sqrt(p)-class and 1/p-class congestion controls, among other goals.

Thanks, --David

From: Rdma-cc-interest <rdma-cc-interest-bounces@ietf.org> On Behalf Of Zhuangyan (Yan)
Sent: Tuesday, November 12, 2019 6:28 PM
To: Lars Eggert; Paul Congdon
Cc: rdma-cc-interest@ietf.org
Subject: [Rdma-cc-interest] 答复: Side meeting plans at IETF-106


Hi Lars,

Thank you for the review and detailed comments. Some responses inline at [Y].

Best Regards,
Yan

________________________________
From: Rdma-cc-interest <rdma-cc-interest-bounces@ietf.org> On Behalf Of Lars Eggert <lars@eggert.org>
Sent: 12 November 2019 19:35
To: Paul Congdon
Cc: rdma-cc-interest@ietf.org
Subject: Re: [Rdma-cc-interest] Side meeting plans at IETF-106

Hi,

On 2019-11-11, at 21:11, Paul Congdon <paul.congdon@tallac.com> wrote:
> Tuesday, November 19
> 8:30AM - 9:45AM
> Room: VIP A

I'll try and make the meeting, modulo NomCom duties. (Please all, send your feedback about candidates to NomCom now!)

Because I am not fully sure I can make this, some written feedback on the various agenda items:

> 1. How NICs can be designed for better CC in the HPC/RDMA/AI DCN
>
> Discuss feedback on a draft under development on OpenCC: https://datatracker.ietf.org/doc/draft-zhh-tsvwg-open-architecture/; a framework for flexible establishment of congestion control algorithms implemented by NICs and the network.  The expectation is there will be some experiment results.  The goal is to discuss the ideas with stakeholders (customers, NIC vendors, switch vendors) and explore what could/should be standardized.

There seem to be three things here:

A. A modular NIC offload interface. Unclear if the IETF is the right home for this? Also, see "Restructuring Endpoint Congestion Control" (Narayan et al.) for another direction.
[Y] Yes, papers from SIGCOMM (and related conferences) have discussed decoupling congestion control on NICs to some extent, e.g., CCP and AC/DC TCP, and we think a modular CC design would be worth discussing in the industry :). However, we might focus more on the interfaces between layers, leaving the implementation of the modules open to vendors. This could be discussed further.

B. Mixing and matching different CCs in one network over time. Given that at datacenter latencies, you really want to prevent even small-scale hiccups due to interactions between different CCs, I wonder if it would be sufficient to slice the NIC and have different slices where all traffic is handled by one CC? Seems more tractable.
[Y] Slicing the NIC rather than mixing different CC algorithms would also be an option. We also plan experiments with different configurations of one CC for different traffic types, which might appear in a later version. Based on the results in this version, two CC algorithms perform better than one CC for two types of traffic. This might be due to the different feedback and reactions in the network.

C. The architectural idea to move away from in-band CC signaling from the network to the endpoints. There isn't much in the document that motivates this, and nothing about potential issues (e.g., loss of fate-sharing).
[Y] The architecture is intended to support both signaling directly from the network (rather than from receivers; is that what you mean by in-band CC signaling?) and signaling from receivers (current practice). Each CC chooses its own way of signaling, and different CCs with different signaling will certainly affect each other to some extent when used together. More details will be discussed in a later version, and input on issues that we missed is very welcome.

A comment on this document that is really about this entire effort: we should just give up on RoCE. Mellanox has no interest in opening it, and I am therefore unwilling to spend cycles thinking about it.
[Y] :) Actually, the architecture is not bound to RoCE. The idea is to support several transport protocols, including TCP. The current experiment results for different CCs are based on TCP (Reno, CUBIC, BBR, DCTCP); however, we don’t want to exclude RoCE at this point either :( ...
Besides, Mellanox provides a really good SmartNIC design, and very recently they also announced programmable congestion control on their NICs. This might be a good time to discuss this with the broader industry…

> 2. How does the network participate in CC for HPC/RDMA/AI DCN?
>
> There are a few items for discussion.
>
> 2.1 AI ECN
>
> Discuss feedback on https://datatracker.ietf.org/doc/draft-zhuang-tsvwg-ai-ecn-for-dcn/.  The idea is to use AI for adaptive configuration of the network - a hard problem.  How is necessary information collected from the devices to form models and what could/should be standardized here as well?

I don't see a proposal here, I don't even see a concrete problem statement? This is yet another "let's throw AI at it" three-pager.
[Y] “AI” may no longer be a precise technical term, given that AI is everywhere. However, we do provide scene-based ECN reconfiguration that adapts to traffic changes in data center networks; AI technologies are applied in scene training and dynamic scene induction. We also do not want to limit this to any specific AI technology, such as deep forest.
We will share some test results in the side meeting to show that this is not just another AI paper exercise. Hope it helps.
The problem is stated in the background section, though it might not be obvious: “As stated in [RFC7567], with proper parameters, RED can be an effective algorithm.  However, dynamically predicting the set of parameters (minimum threshold and maximum threshold) is difficult.” Dynamic configuration of the thresholds is therefore a problem for network configuration.

> 2.2 Network Fast Feedback
>
> Discuss follow-on feedback on https://tools.ietf.org/html/draft-even-iccrg-dc-fast-congestion-00 which is expected to be introduced in ICCRG on Monday.  The draft discusses the state-of-the-art congestion controllers in use and from research, and poses a number of questions for discussion. What is to be researched and what could/should be standardized going forward?

This is the beginnings of a survey. It misses a ton of related work esp. from academia though. HOMA, pFabric, HULL, D3, PDQ, pHost, NDP, etc., etc.

> 2.3 Mixing RDMA and TCP traffic
>
> These two traffic types with their differing congestion controllers are known to not play well with one another in the same traffic class.  There may be some analysis data to share on this topic.  A goal would be to discuss network approaches for mitigating the impact of the two on each other.

When you say RDMA, you mean RoCE? Separate RoCE into a slice and move on. It's pointless to try and optimize for coexistence with a protocol that can change willy-nilly.
[Y] Yes, it means RoCE. If the network does not differentiate RoCE and TCP traffic, then they will compete anyway… L4S may be doing similar work on classic TCP vs. DCTCP.

> 3. Metrics for HPC/RDMA/AI networks
>
> Are the current metrics and scales appropriate for HPC/RDMA/AI networks?  HPC and Storage networks tend to use IOPS as a key measure and the latency requirements can be on the order of 10us; much different than Internet latency and throughput measures.  Should there be a draft on metric requirements for DCN networks?   Can we work with real customers to define some well-known scenarios and metrics for HPC/RDMA/AI DCNs.

Whose "current metrics and scales"? Papers on DC mechanisms certainly define appropriate metrics. Obviously Internet scales don't work, but who is using those?
[Y] To my understanding, the metrics currently discussed in the IETF are mostly about the Internet. The question is whether it is worth discussing and reaching consensus on some common metrics for applications in DCNs, especially HPC. It is an open discussion to seek feedback :)

Lars
--
Rdma-cc-interest mailing list
Rdma-cc-interest@ietf.org
https://www.ietf.org/mailman/listinfo/rdma-cc-interest