Re: [Rdma-cc-interest] Reply: Side meeting plans at IETF-106

"Black, David" <David.Black@dell.com> Wed, 13 November 2019 06:06 UTC

From: "Black, David" <David.Black@dell.com>
To: "Zhuangyan (Yan)" <zhuangyan.zhuang@huawei.com>, Lars Eggert <lars@eggert.org>, Paul Congdon <paul.congdon@tallac.com>
CC: "rdma-cc-interest@ietf.org" <rdma-cc-interest@ietf.org>, "Black, David" <David.Black@dell.com>
Date: Wed, 13 Nov 2019 06:05:45 +0000

See Lars’s comments about paying less attention to RoCE – work on that non-IETF protocol belongs in the IBTA (InfiniBand Trade Association), which specifies that protocol.

Thanks, --David

From: Zhuangyan (Yan) <zhuangyan.zhuang@huawei.com>
Sent: Tuesday, November 12, 2019 11:53 PM
To: Black, David; Lars Eggert; Paul Congdon
Cc: rdma-cc-interest@ietf.org
Subject: Reply: [Rdma-cc-interest] Reply: Side meeting plans at IETF-106



Hi David,



The current practice is based on RoCE traffic :). So we chose RED for ECN marking to avoid packet loss. I am not sure how to use FQ-CoDel in this case and would be happy to discuss. Alternatively, we can try TCP traffic as well.



Best Regards,



Yan



________________________________
From: Black, David <David.Black@dell.com>
Sent: November 13, 2019 11:40
To: Zhuangyan (Yan); Lars Eggert; Paul Congdon
Cc: rdma-cc-interest@ietf.org; Black, David
Subject: RE: [Rdma-cc-interest] Reply: Side meeting plans at IETF-106


> [Y] Since it is a starting point to see how adaptive/dynamic configuration can help, we just use RED as

> the one to see whether it can be improved. Other algorithms can also be tested but would wait for

> the next meeting cycle... the results to be presented are RED only...



Doing better than RED is of limited interest, IMHO.



Thanks, --David



From: Zhuangyan (Yan) <zhuangyan.zhuang@huawei.com>
Sent: Tuesday, November 12, 2019 8:03 PM
To: Black, David; Lars Eggert; Paul Congdon
Cc: rdma-cc-interest@ietf.org
Subject: Reply: [Rdma-cc-interest] Reply: Side meeting plans at IETF-106




Hi David,



Thank you for the comments. Some responses are below.



Best Regards,



Yan

________________________________

From: Black, David <David.Black@dell.com>
Sent: November 13, 2019 8:39
To: Zhuangyan (Yan); Lars Eggert; Paul Congdon
Cc: rdma-cc-interest@ietf.org; Black, David
Subject: RE: [Rdma-cc-interest] Reply: Side meeting plans at IETF-106



A few follow-up comments.



>>> 2.1 AI ECN
>>>

>>> Discuss feedback on https://datatracker.ietf.org/doc/draft-zhuang-tsvwg-ai-ecn-for-dcn/.

>> I don't see a proposal here, I don't even see a concrete problem statement? This is yet another "let's throw AI at it" three-pager.



[… snip …]



> The problem is stated in background section, however it might not be so obvious… “As stated in [RFC7567], with proper parameters,

> RED can be an effective algorithm.  However, dynamically predicting the set of parameters (minimum threshold and maximum threshold)

> is difficult. ” Dynamic configuration of threshold is a problem for network configuration somehow…



That sounds like a straw-person demolition exercise, as RFC 7567 also states:



   This memo also explicitly obsoletes the recommendation that Random

   Early Detection (RED) be used as the default AQM mechanism for the

   Internet.  This is replaced by a detailed set of recommendations for

   selecting an appropriate AQM algorithm.



The upshot is that RED is not a good AQM algorithm to use as a comparison baseline - I hope that the results to be presented will make comparisons to more recent AQM algorithms, e.g., FQ-CoDel [RFC8290].

[Y] Since it is a starting point to see how adaptive/dynamic configuration can help, we just use RED as the one to see whether it can be improved. Other algorithms can also be tested but would wait for the next meeting cycle... the results to be presented are RED only...



>>> 2.3 Mixing RDMA and TCP traffic
>>>
>>> These two traffic types with their differing congestion controllers are known to not play well with one another in the same traffic class.



Using protocol names to denote congestion control classes does not work well, even though it’s common (and I’ve done it myself).



We are dealing with two classes of congestion controls.  For lack of better terms, the following class names are based on what the transport protocol throughput is proportional to, where ‘p’ is the loss and/or congestion marking probability:

               - 1/sqrt(p)-class congestion controls: Includes most existing TCP congestion control algorithms, e.g., NewReno, CUBIC.

               - 1/p-class congestion controls: Includes DCTCP congestion control.

Keep in mind that p is a probability that is usually << 1 when expressed as a decimal, e.g., p=0.01 represents a 1% loss/marking rate.
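
To make the difference between the two classes concrete, here is a rough sketch that compares how throughput scales with p under each model. The function names are hypothetical and the proportionality constants are omitted (set to 1), so only the relative scaling is meaningful; real algorithms also depend on RTT and segment size.

```python
import math


def sqrt_class_rate(p: float) -> float:
    """Relative throughput of a 1/sqrt(p)-class control (constant omitted)."""
    return 1.0 / math.sqrt(p)


def inverse_class_rate(p: float) -> float:
    """Relative throughput of a 1/p-class control (constant omitted)."""
    return 1.0 / p


def marking_for_equal_rate(p_sqrt_class: float) -> float:
    """Marking probability p' at which a 1/p-class flow scales the same as a
    1/sqrt(p)-class flow does at p_sqrt_class, in this simplified model:
    1/p' = 1/sqrt(p)  =>  p' = sqrt(p)."""
    return math.sqrt(p_sqrt_class)


if __name__ == "__main__":
    # Illustrative marking probabilities only (assumed values).
    for p in (0.0001, 0.01):
        print(p, sqrt_class_rate(p), inverse_class_rate(p))
```

Under these simplifying assumptions, at p=0.01 the 1/p-class scales as 100 while the 1/sqrt(p)-class scales as 10, and the 1/p-class flow would need to see p=0.1 to scale the same as the 1/sqrt(p)-class flow does at p=0.01.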

>> When you say RDMA, you mean RoCE? Separate RoCE into a slice and move on. It's pointless to try and optimize for coexistence with a protocol that can change willy-nilly.

> [Y] Yes, it means RoCE. If the network does not differentiate RoCE and TCP traffic, then they would compete anyhow… L4S might be doing similar work on classic TCP vs. DCTCP.



DCTCP would be a better protocol to focus on than RoCE, as both DCTCP congestion control and the DCQCN congestion control commonly used for RoCE are 1/p-class congestion controls.

[Y] Sure, we can see how DCTCP works, which might also be used in DCQCN.
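
For reference, a minimal sketch of the DCTCP sender reaction as described in RFC 8257: the sender keeps a running estimate (alpha) of the fraction of marked data and reduces its window in proportion to that estimate, rather than halving on any congestion signal as NewReno does. The function names are hypothetical, g = 1/16 is the gain suggested in RFC 8257, and details such as per-window bookkeeping and loss handling are left out.

```python
def update_alpha(alpha: float, marked_fraction: float, g: float = 1 / 16) -> float:
    """EWMA of the fraction of CE-marked data seen in the last observation window."""
    if not 0.0 <= marked_fraction <= 1.0:
        raise ValueError("marked_fraction must be in [0, 1]")
    return (1.0 - g) * alpha + g * marked_fraction


def reduce_cwnd(cwnd: float, alpha: float) -> float:
    """Window reduction proportional to the estimated marking fraction."""
    return cwnd * (1.0 - alpha / 2.0)


if __name__ == "__main__":
    alpha = 0.0
    cwnd = 100.0  # assumed starting window, in segments
    # Assumed per-window marking fractions, for illustration only.
    for fraction in (0.0, 0.1, 0.1, 0.5):
        alpha = update_alpha(alpha, fraction)
        if fraction > 0.0:
            cwnd = reduce_cwnd(cwnd, alpha)
        print(round(alpha, 4), round(cwnd, 2))
```

With alpha near 0 the window is barely reduced, and with alpha equal to 1 the reduction is the same halving that a classic TCP sender would apply.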



TSVWG will be discussing L4S and SCE next week – both of those proposals are intended to enable coexistence of 1/sqrt(p)-class and 1/p-class congestion controls, among other goals.



Thanks, --David



From: Rdma-cc-interest <rdma-cc-interest-bounces@ietf.org> On Behalf Of Zhuangyan (Yan)
Sent: Tuesday, November 12, 2019 6:28 PM
To: Lars Eggert; Paul Congdon
Cc: rdma-cc-interest@ietf.org
Subject: [Rdma-cc-interest] Reply: Side meeting plans at IETF-106




Hi Lars,



Thank you for the review and detailed comments. Some responses inline at [Y].



Best Regards,



Yan

________________________________

From: Rdma-cc-interest <rdma-cc-interest-bounces@ietf.org> On Behalf Of Lars Eggert <lars@eggert.org>
Sent: November 12, 2019 19:35
To: Paul Congdon
Cc: rdma-cc-interest@ietf.org
Subject: Re: [Rdma-cc-interest] Side meeting plans at IETF-106



Hi,

On 2019-11-11, at 21:11, Paul Congdon <paul.congdon@tallac.com> wrote:
> Tuesday, November 19
> 8:30AM - 9:45AM
> Room: VIP A

I'll try and make the meeting, modulo NomCom duties. (Please all, send your feedback about candidates to NomCom now!)

Because I am not fully sure I can make this, some written feedback on the various agenda items:

> 1. How NICs can be designed for better CC in the HPC/RDMA/AI DCN
>
> Discuss feedback on a draft under development on OpenCC: https://datatracker.ietf.org/doc/draft-zhh-tsvwg-open-architecture/; a framework for flexible establishment of congestion control algorithms implemented by NICs and the network.  The expectation is there will be some experiment results.  The goal is to discuss the ideas with stakeholders (customers, NIC vendors, switch vendors) and explore what could/should be standardized.

There seem to be three things here:

A. A modular NIC offload interface. Unclear if the IETF is the right home for this? Also, see "Restructuring Endpoint Congestion Control" (Narayan et al.) for another direction.
[Y] Yes, papers from SIGCOMM (and related conferences) have discussed decoupling congestion control on NICs a bit, e.g., CCP and AC/DC TCP, and we think a modular CC design would be a good topic for discussion in the industry :). However, we would focus more on the interfaces between layers, while the implementation of the modules would be left open to vendors. This could be discussed further.

B. Mixing and matching different CCs in one network over time. Given that at datacenter latencies, you really want to prevent even small-scale hiccups due to interactions between different CCs, I wonder if it would be sufficient to slice the NIC and have different slices where all traffic is handled by one CC? Seems more tractable.

[Y] Slicing the NIC rather than using different CC algorithms would also be an option. We also plan experiments with different configurations of one CC for different traffic types, which might appear in a later version. Based on the results in this version, two CC algorithms perform better than one CC for two types of traffic. This might be due to different feedback and reactions in the network.

C. The architectural idea to move away from in-band CC signaling from the network to the endpoints. There isn't much in the document that motivates this, and nothing about potential issues (e.g., loss of fate-sharing).
[Y] The architecture is intended to support both signaling directly from the network (rather than from receivers; is that what you mean by in-band CC signaling?) and signaling from receivers (current practice). Each CC chooses its own way of signaling, and they would certainly affect each other somehow if different CCs with different signaling are used together. More details will be discussed in a later version, and inputs on issues that we missed are greatly welcome.

A comment on this document that is really about this entire effort: we should just give up on RoCE. Mellanox has no interest in opening it, and I am therefore unwilling to spend cycles thinking about it.
[Y] :) Actually, the architecture is not bound to RoCE. The thought is to support several transport protocols including TCP. Current experiment results for different CCs are based on TCP (Reno, CUBIC, BBR, DCTCP); however, we don’t want to exclude RoCE either at this point :( ...
Besides, Mellanox provides a really good SmartNIC design. And very recently, they also announced programmable congestion control on their NICs. At this point, it might be good timing to discuss this with the broader industry…

> 2. How does the network participate in CC for HPC/RDMA/AI DCN?
>
> There are a few items for discussion.
>
> 2.1 AI ECN
>
> Discuss feedback on https://datatracker.ietf.org/doc/draft-zhuang-tsvwg-ai-ecn-for-dcn/.  The idea is to use AI for adaptive configuration of the network - a hard problem.  How is necessary information collected from the devices to form models and what could/should be standardized here as well?

I don't see a proposal here, I don't even see a concrete problem statement? This is yet another "let's throw AI at it" three-pager.
[Y] AI may not be a precise technical term nowadays, since AI is everywhere… However, we do provide a scene-based ECN reconfiguration that adapts to traffic changes in data center networks; scene training and dynamic scene inducing are where AI technologies are applied… and we don't want to limit this to any specific AI technology, such as deep forest...

We will share some test results in the side meeting... to show that it is not just another AI paper exercise... hope it helps.
The problem is stated in the background section; however, it might not be so obvious… “As stated in [RFC7567], with proper parameters, RED can be an effective algorithm.  However, dynamically predicting the set of parameters (minimum threshold and maximum threshold) is difficult.” Dynamic configuration of the thresholds is a problem for network configuration somehow…
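
For readers less familiar with the two thresholds mentioned above, the following is a minimal sketch of the classic RED marking rule: the average queue length is tracked with an EWMA, nothing is marked below the minimum threshold, everything is marked at or above the maximum threshold, and the marking probability rises linearly in between. The function names, parameter names, and example values are illustrative assumptions and are not taken from the draft; the count-based spreading of marks from the original RED design is left out.

```python
def ewma_queue(avg: float, sample: float, weight: float = 0.002) -> float:
    """Update the average queue length with a new instantaneous sample."""
    return (1.0 - weight) * avg + weight * sample


def red_mark_probability(avg_queue: float, min_th: float,
                         max_th: float, max_p: float) -> float:
    """Base marking probability of classic RED for a given average queue."""
    if not min_th < max_th:
        raise ValueError("min_th must be smaller than max_th")
    if avg_queue < min_th:
        return 0.0          # below the minimum threshold: never mark
    if avg_queue >= max_th:
        return 1.0          # at or above the maximum threshold: always mark
    # Linear ramp from 0 to max_p between the two thresholds.
    return max_p * (avg_queue - min_th) / (max_th - min_th)


if __name__ == "__main__":
    # Assumed thresholds in packets; choosing them well is the hard part.
    for avg in (5.0, 20.0, 30.0):
        print(avg, red_mark_probability(avg, min_th=10.0, max_th=30.0, max_p=0.1))
```

The sketch shows why the two thresholds matter: the same average queue length yields very different marking behavior depending on where min_th and max_th are placed.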

> 2.2 Network Fast Feedback
>
> Discuss follow-on feedback on https://tools.ietf.org/html/draft-even-iccrg-dc-fast-congestion-00 which is expected to be introduced in ICCRG on Monday.  The draft discusses the state-of-the-art congestion controllers in use and from research, and poses a number of questions for discussion. What is to be researched and what could/should be standardized going forward?

This is the beginnings of a survey. It misses a ton of related work esp. from academia though. HOMA, pFabric, HULL, D3, PDQ, pHost, NDP, etc., etc.

> 2.3 Mixing RDMA and TCP traffic
>
> These two traffic types with their differing congestion controllers are known to not play well with one another in the same traffic class.  There may be some analysis data to share on this topic.  A goal would be to discuss network approaches for mitigating the impact of the two on each other.

When you say RDMA, you mean RoCE? Separate RoCE into a slice and move on. It's pointless to try and optimize for coexistence with a protocol that can change willy-nilly.

[Y] Yes, it means RoCE. If the network does not differentiate RoCE and TCP traffic, then they would compete anyhow… L4S might be doing similar work on classic TCP vs. DCTCP.

> 3. Metrics for HPC/RDMA/AI networks
>
> Are the current metrics and scales appropriate for HPC/RDMA/AI networks?  HPC and Storage networks tend to use IOPS as a key measure and the latency requirements can be on the order of 10us; much different than Internet latency and throughput measures.  Should there be a draft on metric requirements for DCN networks?   Can we work with real customers to define some well-known scenarios and metrics for HPC/RDMA/AI DCNs.

Whose "current metrics and scales"? Papers on DC mechanisms certainly define appropriate metrics. Obviously Internet scales don't work, but who is using those?
[Y] To my understanding, current metrics in the IETF are mostly discussed with respect to the Internet. The question is whether it is worth discussing and getting consensus on some common metrics for applications in DCNs, especially HPC. It is an open discussion to seek feedback :)

Lars