[nfsv4] NFS/RDMA next steps
Chuck Lever <chuck.lever@oracle.com> Mon, 31 July 2017 18:34 UTC
From: Chuck Lever <chuck.lever@oracle.com>
Message-Id: <53DF3636-D420-4FAA-B1B0-8824602CBB72@oracle.com>
Date: Mon, 31 Jul 2017 14:34:17 -0400
To: NFSv4 <nfsv4@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/Q9AEeUvqI_l_RWdRQ7ppzoHVLl8>
Subject: [nfsv4] NFS/RDMA next steps

Hi-

During the nfsv4 WG meeting at IETF 99, I presented some slides on possible next steps for NFS/RDMA and related protocols.

https://datatracker.ietf.org/meeting/99/materials/slides-99-nfsv4-nfsrdma-next-steps-chuck-lever

There are many areas that could use some attention, yet only a handful of engineering resources are available. Slides 11 - 18 describe the directions that could IMO be fruitful individual next steps.

I hoped my talk could frame a conversation about where we think the highest priority is, but it was decided that there were enough interested and informed people who were not present that such a conversation should be moved to this mailing list and continued after IETF 99.

Note that RFC 5667bis is now "Submitted to IESG for Publication". The work items in the slides assume that this document will be completed and published as planned.

The slides split the possibilities into three somewhat orthogonal groupings. A fourth grouping arose during discussion in the room, which I'll add as "Grouping Zero" below.

We can choose any or all of these approaches. Opinions are welcome as to what order, whether something is left out, or what might be removed from this list. Groupings Two and Three could introduce new Working Group documents, and thus have implications for our charter milestone count.

Grouping Zero: Focus on improving existing implementations of RPC-over-RDMA and NFS/RDMA. No IETF action needed, which is why I didn't include this on the slides. There are substantial improvements that can be made to existing base implementations, but these would be done by many of the same folks who would be working on new protocol.

Grouping One: Enable greater transport parallelism in NFS. This includes multipathing and use of pNFS with RDMA. No changes to RPC-over-RDMA or NFS/RDMA are necessary, and this would bring important performance capabilities to NFS, especially by enabling very low latency client access to Storage Class Memory.

Grouping Two: Incrementally improve RPC-over-RDMA version 1. The main idea here is to introduce a per-connection transport property negotiation mechanism to replace CCP. This would enable variable size (ie larger) inline thresholds and the use of Remote Invalidation in some instances with existing deployments.

Grouping Three: Pursue RPC-over-RDMA version 2. This would open a variety of avenues by which many of the perceived shortcomings of RPC-over-RDMA version 1 could be addressed.

IMO Zero and One are where we can get the greatest bang for the buck in the near term.

Latency to access Storage Class Memory is substantially shorter than the latency of traversing the NFS and RPC stack on just the client. Thus bypassing RPC entirely (eg by using an RDMA layout type) seems like the best strategy we have for tapping the potential of this new variety of durable storage.

The current proposal for Grouping Two (draft-cel-nfsv4-rpcrdma-cm-pvt-msg) is controversial. Grouping Three would be an immense amount of work to generalize some things for less gain than we might see with work in Grouping Zero or One.

--
Chuck Lever