Re: [nfsv4] Expert review of NFSv4 RDMA private data draft (draft-ietf-nfsv4-rpcrdma-cm-pvt-data-05)

Chuck Lever <chuck.lever@oracle.com> Fri, 13 December 2019 18:30 UTC

From: Chuck Lever <chuck.lever@oracle.com>
In-Reply-To: <50460cda-27d8-6124-b866-f0286e839db0@talpey.com>
Date: Fri, 13 Dec 2019 13:29:54 -0500
Cc: nfsv4@ietf.org, Dave Minturn <dave.b.minturn@intel.com>, Hefty Sean <sean.hefty@intel.com>
Message-Id: <62793D08-BA40-45E5-A36B-2004931D413D@oracle.com>
References: <MN2PR19MB4045E8323EF839730ED0AEDD83550@MN2PR19MB4045.namprd19.prod.outlook.com> <0A47A2D9-466E-4BA5-9CDC-00A624CB5FF5@oracle.com> <50460cda-27d8-6124-b866-f0286e839db0@talpey.com>
To: Tom Talpey <tom@talpey.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/SbI1cCoR44xfNMxDeeAcWPV8Voo>
Subject: Re: [nfsv4] Expert review of NFSv4 RDMA private data draft (draft-ietf-nfsv4-rpcrdma-cm-pvt-data-05)


> On Dec 12, 2019, at 8:07 PM, Tom Talpey <tom@talpey.com> wrote:
> 
> [forgot to cc the reviewers]
> 
> On 12/12/2019 5:30 PM, Chuck Lever wrote:
>>> On Dec 12, 2019, at 5:06 PM, Black, David <David.Black@dell.com> wrote:
>>> 
>>> With credit to Magnus (AD responsible for nfsv4 WG) for pinging me behind the scenes, I was able to obtain an expert review (see below) for the RDMA private data draft.
>>> Please keep Dave Minturn and Sean Hefty on cc: and take it easy on email volume, as they’re doing the WG a favor.
>> My thanks to the reviewers. A few comments inline.
>>> Thanks, --David
>>> *From:* Hefty, Sean <sean.hefty@intel.com>
>>> *Sent:* Monday, December 9, 2019 6:08 PM
>>> *To:* Minturn, Dave B <dave.b.minturn@intel.com>
>>> *Subject:* RE: Question regarding RDMA CM Private Data
>>> Just reading your comments, CM private data is the correct name.  The amount of data present is defined by the spec, with a portion of the data consumed by the RDMA CM protocol itself.  I **think** there’s something like 48 bytes to play with on connection requests.
>> If that 48-byte limit is documented somewhere, this draft can easily reference it. I was never able to find such a document.
> 
> It's not necessary, in my opinion. The limit is different for each
> transport protocol, and all that matters for this RFC is that there
> is "enough" to carry the payload it defines. Any such reference would
> be purely informative, IOW.
> 
>>> I agree that the format specifier is useless. There are many existing users which could be setting the private data to anything.  You really need to rely on the port numbers being correct, with checks on the other fields detecting mismatched protocols.
>> Other reviewers have been ambivalent about the format identifier, but we don't have a better solution at the moment. The problem is the port numbers are all arbitrary. AFAIK they are not used in a way that identifies the connection as RPC.
>> The 2049 destination port number means "NFS" but that's a layer /above/ RPC/RDMA.
> 
> This is not the purpose of the identifier. It is there to signal to the
> receiver that the payload is the one defined in the document. If the
> identifier is not present, the receiver will ignore the private data.
> This means the identifier must be defined, and present. It's not meant
> to be universal. Yes, there's a risk that another layer injects its own
> private data, with a payload somehow containing the identifier. That
> might warrant a sentence or two.
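
The receiver behavior described above can be sketched as follows. This is
only an illustration, not text from the draft: the function and constant
names are hypothetical, and the 8-bit width of each field after the Format
Identifier is an assumption read off the 32-bit layout diagram quoted
further down.

```python
import struct

# Format Identifier value quoted from the draft (sent in network byte order).
RPCRDMA_CMP_MAGIC = 0xF6AB0E18

# Assumed layout: 32-bit identifier, then four 8-bit fields
# (version, flags, send size, receive size), all big-endian.
_CMP_HEADER = struct.Struct("!IBBBB")


def parse_cm_private_data(buf: bytes):
    """Return (version, flags, send_size, recv_size), or None to ignore.

    None means "behave as if no RPC-over-RDMA private data was provided".
    """
    if len(buf) < _CMP_HEADER.size:
        return None  # too short to hold the message format
    ident, version, flags, send_size, recv_size = _CMP_HEADER.unpack_from(buf)
    if ident != RPCRDMA_CMP_MAGIC:
        return None  # some other ULP's private data: ignore it
    return version, flags, send_size, recv_size
```

Note that a payload from another upper layer that happens to begin with the
same four octets would still pass this check, which is the residual risk
mentioned above.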

Then I think the only modification needed for this document is in
Section 4.1.2. That is fairly young language anyway. We can take
that editorial work back to only nfsv4@ietf.org, and incorporate
the change into the document as the IESG review proceeds.

Everyone OK with that?


>> RPC/RDMA also does not define a ToS. That would be a sure way to distinguish the traffic the connection is to carry.
> 
> Nor do all the RDMA transports. Out of scope.
> 
> Tom.
> 
> 
>>> The private data is and has been exposed directly to user space applications for years, so anything there is fair play.
>>> In order to standardize the private data format, you would need to change the version of the RDMA CM header carried in the private data of the underlying transport.
>> Which is outside the purview of the IETF.
>>> The field sizes for the send/recv sizes look too small to me, unless those fields are scaled somehow.
>> The values in the inline threshold fields are scaled; see Section 5.2. The actual size range of these buffers is 1KB to 256KB.
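
To make the scaling concrete, here is a minimal sketch. It assumes the
8-bit field counts 1KB units offset by one, which is consistent with the
1KB to 256KB range stated above; Section 5.2 of the draft remains the
authoritative definition, and the function names are hypothetical.

```python
KB = 1024


def decode_inline_threshold(field: int) -> int:
    """Map an 8-bit field value (0..255) to a buffer size in bytes."""
    if not 0 <= field <= 255:
        raise ValueError("field must fit in 8 bits")
    # Assumed encoding: 0 means 1KB, 255 means 256KB.
    return (field + 1) * KB


def encode_inline_threshold(size: int) -> int:
    """Map a buffer size in bytes to the 8-bit field value.

    Sizes that are not a multiple of 1KB are rounded down, so the
    advertised value never exceeds the real buffer size.
    """
    if not KB <= size <= 256 * KB:
        raise ValueError("size must be between 1KB and 256KB")
    return size // KB - 1
```

Under this assumption a 4096-byte inline threshold is carried as the field
value 3.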
>>> *From:* Minturn, Dave B <dave.b.minturn@intel.com>
>>> *Sent:* Monday, December 9, 2019 5:03 PM
>>> *To:* Hefty, Sean <sean.hefty@intel.com>
>>> *Subject:* Question regarding RDMA CM Private Data
>>> Hi Sean,
>>> David Black contacted me and asked if I could review an IETF RFC for NFS/RDMA’s use of RDMA CM’s private data.  The RFC reference is: https://tools.ietf.org/pdf/draft-ietf-nfsv4-rpcrdma-cm-pvt-data-05.pdf
>>> It would be great if you could take a look.  I called out some items I saw but you are the expert in this area.
>>> ..Dave
>>> Here are some items that I had questions/concerns with: (Section 4)
>>> /When an RPC-over-RDMA version 1 transport connection is/
>>> /established, the client (which actively establishes connections) and/
>>> /the server (which passively accepts connections) populate the CM/
>>> /Private Data field exchanged as part of CM connection establishment./
>>>
>>> >>  Is “CM Private Data” the proper name?
>>> /For RPC-over-RDMA version 1, the CM Private Data field is formatted/
>>> /as described in the following subsection. RPC clients and servers/
>>> /use the same format. If the capacity of the Private Data field is/
>>> /too small to contain this message format, the underlying RDMA/
>>> /transport is not managed by a Connection Manager, or the underlying/
>>> /RDMA transport uses Private Data for its own purposes, the CM Private/
>>> /Data field cannot be used on behalf of RPC-over-RDMA version 1./
>>>
>>> >> This statement seemed unnecessary because it’s transparent to the RFC ULP if the RDMA transport
>>> >> is using Private Data.  Seems more appropriate to say something about the <capacity> of the Private Data
>>> >> field length being RDMA transport dependent.
>>>  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
>>> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>>> |                       Format Identifier                       |
>>> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>>> |    Version    |     Flags     |   Send Size   | Receive Size  |
>>> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>>> /Format Identifier: This field contains a fixed 32-bit value that/
>>> /identifies the content of the Private Data field as an RPC-over-/
>>> /RDMA version 1 CM Private Data message. In RPC-over-RDMA version/
>>> /1 Private Data, the value of this field is always 0xf6ab0e18, in/
>>> /network byte order. The use of this field is further expanded/
>>> /upon in Section 4.1./
>>> /4.1.2. Amongst Implementations of Other Upper-Layer Protocols/
>>> /The Format Identifier field in the message format defined in this/
>>> /document is provided to enable implementations to distinguish RPC-/
>>> /over-RDMA version 1 Private Data from application-specific private/
>>> /data inserted by applications other than RPC-over-RDMA version 1./
>>> /Examples of other applications that make use of CM Private Data/
>>> /include iWARP, via the MPA enhancement described in [RFC6581], and/
>>> /iSCSI extensions for RDMA (iSER), as defined in [RFC7145]./
>>> /During connection establishment, an implementation of the extension/
>>> /described in this document checks the Format Identifier field before/
>>> /decoding subsequent fields. If the RPC-over-RDMA version 1 CM/
>>> /Private Data Format Identifier is not present as the first 4 octets,/
>>> /an RPC-over-RDMA version 1 receiver MUST ignore the CM Private Data,/
>>> /behaving as if no RPC-over-RDMA version 1 Private Data has been/
>>> /provided (see above)./
>>> >>  It seems wrong to assume that no other RDMA ULP will put that “magic #” in.   Maybe
>>> >> if combined with a well-known port, but even then it’s not for sure.  Now that I said that,
>>> >> I found the following text proposing using IANA to standardize the RDMA_CM private data
>>> >>
>>> . IANA Considerations
>>> In accordance with [RFC8126], the author requests that IANA create a
>>> new registry in the "Remote Direct Data Placement" Protocol Category
>>> Group. The new registry is to be called the "RDMA-CM Private Data
>>> Lever Expires May 11, 2020 [Page 8]
>>> Internet-Draft RPC-Over-RDMA CM Private Data November 2019
>>> Identifier Registry". This is a registry of 32-bit numbers that
>>> identify the upper-layer protocol associated with data that appears
>>> in the application-specific RDMA-CM Private Data area. The fields in
>>> this registry include: Format Identifier, Description, and Reference.
>>> The initial contents of this registry are a single entry:
>>> +------------------+------------------------------------+-----------+
>>> | Format           | Format Description                 | Reference |
>>> | Identifier       |                                    |           |
>>> +------------------+------------------------------------+-----------+
>>> | 0xf6ab0e18       | RPC-over-RDMA version 1 CM Private | [RFC-TBD] |
>>> |                  | Data                               |           |
>>> +------------------+------------------------------------+-----------+
>>> Table 1: RDMA-CM Private Data Identifier Registry
>>> IANA is to assign subsequent new entries in this registry using the
>>> Expert Review policy as defined in Section 4.5 of [RFC8126].
>>> _______________________________________________
>>> nfsv4 mailing list
>>> nfsv4@ietf.org
>>> https://www.ietf.org/mailman/listinfo/nfsv4
>> --
>> Chuck Lever
> 

--
Chuck Lever