[Rdma-cc-interest] Minutes of the side meeting at IETF-106

Paul Congdon <paul.congdon@tallac.com> Tue, 03 December 2019 19:55 UTC

From: Paul Congdon <paul.congdon@tallac.com>
Date: Tue, 03 Dec 2019 11:55:19 -0800
Message-ID: <CAAMqZPues-kUGd6Cn7Di3Ks2SHze+07JpBts9o-txMW=CV25-g@mail.gmail.com>
To: rdma-cc-interest@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/rdma-cc-interest/tj6j4kXvf0vvE1RaPyobKmaa7Qw>
Subject: [Rdma-cc-interest] Minutes of the side meeting at IETF-106
List-Id: Congestion Control for Large Scale HPC/RDMA Data Centers <rdma-cc-interest.ietf.org>

Here are the final minutes as amended by previous comments.  Sorry if this
gets duplicated, as some of you are on both the IETF mailing list and
the bcc list.  Thank you all for attending and continuing to progress this
work.

IETF-106 Side Meeting:

Data Center Congestion Control – Where’s the best fit in IETF/IRTF?

Tuesday, November 19, 2019

8:30AM – 9:45AM



Attendees (who signed the blue/white sheet or were recognized):

First        Last          Affiliation
-----        ----          -----------
Peter        Asai          Preferred Networks / Wide
David        Black         Dell EMC
Roland       Bless         KIT
Zhen         Cao           Huawei
Stuart       Cheshire      Apple
Tim          Costello      BT
Roni         Even          Huawei
Rodney       Grimes        SCE netDEF
Wim          Henderickx    Nokia
Jake         Holland       Akamai
Christian    Hopps         Lab IU Consulting
Sheng        Jiang         Huawei
Jiao         Kang          Huawei
Mengzhu      Liu           Huawei
David        Melman        Marvell
Tal          Mizrahi       Toga
Mark         Pearson       HPE
Roberto      Peon          Facebook
Jamal Hadi   Salim         Mojatatu
Marcus       Sun           Huawei
Sowmini      Varadhan      Microsoft
Aijun        Wang          China Telecom
Huaru        Yang          Huawei
Yolanda      Yu            Huawei
Yan          Zhuang        Huawei
Ning         Zong          Huawei



Details:

1.    The meeting organizer, Paul Congdon, presented the IETF Note Well
reminder and bashed the agenda.  No changes to the agenda.

2.    The slide material presented at the meeting is available at:
https://mentor.ieee.org/802.1/dcn/19/1-19-0087-00-ICne-ietf-106-sidemeeting.pdf

3.    The slides from IETF-106 HotRFC that announced this side meeting are
available at:
https://datatracker.ietf.org/meeting/106/materials/slides-106-hotrfc-data-center-congestion-control-wheres-the-best-fit-in-ietfirtf

4.    Paul Congdon reviewed the history of previous contributions since
IETF-101.  This is the third side meeting on HPC/RDMA/AI data center
network congestion control.  The last side meeting had mentioned a number
of potential research topics ranging from considering new UDP based
transports for RDMA to ways to improve congestion signal feedback by
providing more inline information or allowing the network to more actively
participate.  The growing number of vendor-proprietary solutions hints at
the need for open, interoperable standards work.

5.    Roni Even presented on Fast Congestion management for Data Centers
available at:
https://tools.ietf.org/html/draft-even-iccrg-dc-fast-congestion-00

a.    The experimental data was obtained from a simple single switch
topology.  David Black suggested that it will be important to understand
how the solution works in a more complex topology with more than one
bottleneck.  Beware of congestion collapse in this case.

b.    David Black pointed out that ICMP source quench has a bad reputation
in the IETF and it will be very important for this approach to distinguish
itself from that.

c.    David Black also pointed out that many of these environments involve
overlay networks and the solution should not assume that sending a message
back to the source will reliably get the message to the original sender.  A
backward message needs to contain enough information for it to have the
proper impact.

d.    A somewhat related problem is propagating ECN bits from the
encapsulating tunnel to the flows within.  RFC 6040 talks about this as
well as some drafts, but implementation of this has been leisurely.

e.    Jake Holland wanted to understand why ICMP source quench is a
problem.  Is it simply the presence of tunnels?  It wasn’t clear what the
other problems were at the time of the meeting, but this should be
investigated.

f.     Rod Grimes asked if there was any measurement of the number of
control packets being sent.  The message was modeled after an extended
IBTA CNP (Congestion Notification Packet).  It would be good to understand
the parameters that were used to cause the message to be sent and what is
included in the message.

g.    The summary of recommended actions for this draft includes:

      i.    Use a more complex topology with more bottlenecks

      ii.   Distinguish the solution from ICMP source quench

      iii.  Consider how this works with tunnels and encapsulations

      iv.   List the parameters that were used for the test, for example the
threshold time for sending a control message.

6.    Yan Zhuang presented An Open Congestion Control Architecture for high
performance fabrics available at:
https://datatracker.ietf.org/doc/draft-zhuang-tsvwg-open-cc-architecture/

a.    Jake Holland indicated the approach was interesting and felt it was
promising.  He asked if there was an existing API or set of operations
to support the experiments.  Yan indicated it was still under
development.  Operations such as pacing and providing an array of flags on
the GRO and more information about packets on the wire would be helpful.
The API will require lots of discussion.

b.    Jamal Hadi Salim pointed out that MIT has a similar approach that was
presented at SIGCOMM 2018 and at NetDev in 2018. The NetDev paper was an
engineering paper and discussed APIs. There was a criticism about the
scalability of the approach.

c.    Roberto Peon asked whether packet loss was evaluated when comparing
CCAs.  He indicated that experiments at 100GbE have shown that bottlenecks
exist in the end-system and without proper feedback they will back off too
aggressively.  Shifting the CC to the NIC could help, but this is not yet
clear.

d.    Roberto Peon stated that shifting decision making to places in the
network where measurements can be made on very short time scales will be
important in the data center.  The longer the delay for a control loop, the
smaller the impact that offloads will have.

e.    The summary of recommended actions for this draft includes:

      i.    Please provide the details of the API

7.    Yan Zhuang presented on Artificial Intelligence (AI) based ECN
adaptive reconfiguration for datacenter networks available at
https://datatracker.ietf.org/doc/draft-zhuang-tsvwg-ai-ecn-for-dcn

a.    Stuart Cheshire asked about the use of RED in the test cases and
whether there was an assumption that RED prevented packet loss and CoDel
does not.  If ECN is used correctly, you can prevent packet loss regardless
of using RED or CoDel.  The experiments used RED because that was available
in the commodity switches used in the test.

b.    David Black pointed out that it is important to use an FQ or a partial
FQ solution here because the experiments are likely showing the
interference between flows.  It was suggested that this analysis be done
using a more modern AQM.  It could be that the optimization is not as
effective if a modern AQM is used.

c.    It was pointed out that CoDel is more burst friendly.  RED is
obsolete.

d.    Jake Holland had a concern about how this works at larger scale.  He
agreed that CoDel would have a different outcome.  There was a question
about the AI model and what constitutes ‘normal’.

e.    The summary of recommended actions for this draft includes:

      i.    The analysis should be re-run using a modern AQM, not RED

      ii.   Provide details about the model’s definition of ‘normal’ behavior

8.    Yolanda Yu presented on The impact of mixing TCP and RoCEv2.  The
slides are in the complete set of slide material at
https://mentor.ieee.org/802.1/dcn/19/1-19-0087-00-ICne-ietf-106-sidemeeting.pdf

a.    David Black made a positive comment about the work, indicating that
more of this type is needed.  He requested that all the settings and
configuration in the test environment be documented in extreme detail.

b.    The summary of recommended actions for this work:

      i.    Provide excruciating detail about the configuration of the test
environment

9.    Paul Congdon used the final minutes of the meeting to ask people how to
proceed within the IETF.  David Black felt the side meeting approach appears
to be working and suggested continuing it.  Paul pointed out that, while it
is great that IETF is making the time available for side meetings, it is a
challenge to ensure that a meeting time is available that doesn’t conflict
and can draw critical attendees.  The meetings are not being recorded and
there isn’t a place to upload the contributions.  Addressing this would be
an improvement that IETF could make to the process.

10.   The side meeting was adjourned at approximately 9:45AM.