[Rdma-cc-interest] Congestion Control for Large Scale Data Center Networks

"Roni Even (A)" <roni.even@huawei.com> Tue, 24 September 2019 07:07 UTC

From: "Roni Even (A)" <roni.even@huawei.com>
To: "rdma-cc-interest@ietf.org" <rdma-cc-interest@ietf.org>
Date: Tue, 24 Sep 2019 07:07:32 +0000
Message-ID: <6E58094ECC8D8344914996DAD28F1CCD23D6B64E@DGGEMM506-MBX.china.huawei.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/rdma-cc-interest/D9mStlflPkNEXDgpg9KpltR1Qdc>
Subject: [Rdma-cc-interest] Congestion Control for Large Scale Data Center Networks
List-Id: Congestion Control for Large Scale HPC/RDMA Data Centers <rdma-cc-interest.ietf.org>

Hi,
We had side meetings at the last two IETF meetings about better congestion control for data centers, with a good number of participants.

The high throughput of data center networks requires good congestion control, which should provide low latency, fast convergence and high link utilization. Since multiple applications with different requirements may run on the DC network, it is important to provide fairness between applications that may use different congestion control algorithms. An important issue from the user perspective is achieving a short Flow Completion Time (FCT).

It is clear that we are not going to make changes to RoCE, but we would still like to look at good e2e congestion control. Currently there are multiple published proposals presented in different papers (examples are DCQCN, HPCC, RPC), but it would be good if the IETF were able to recommend an e2e congestion control protocol that allows interoperability between vendors and leverages information from the network.

We would like to have a side meeting in Singapore and intend to submit a draft that will discuss the different options, in the hope that interested people can collaborate on a common direction.

Regards
Roni Even