Re: [nfsv4] Proposed updates to draft-ietf-nfsv4-rpcrdma-cm-pvt-data

Chuck Lever <chuck.lever@oracle.com> Fri, 20 December 2019 15:34 UTC

From: Chuck Lever <chuck.lever@oracle.com>
Date: Fri, 20 Dec 2019 10:34:31 -0500
References: <863D1C52-3FCB-47EF-9DEA-4BE8CEF51D6C@oracle.com> <14e44b2a-f6ed-9b7b-2e28-fb4016be173b@talpey.com> <DB9F91F9-455C-4710-949F-01A9BEEE16CA@oracle.com> <c10f5885-d2cf-ad31-89bc-9a2c32fe9248@talpey.com> <CADaq8je8jkUb-CNXnpzjWE6CvdjycEx5q6VOhjtYcKHWx1rbkA@mail.gmail.com> <f97f9d0e-6ae2-77c2-b6be-9c2671567e54@talpey.com> <F19B953A-3E60-4424-9C26-1159C4042209@oracle.com> <1A271061-97A3-4ED0-9F76-63E8D05FACC9@oracle.com> <CADaq8jdrNtekSrCB7V6V2MZCfCQS5nBUXTJAAqOGdw-Q_qbh1g@mail.gmail.com> <fe464eb4-5bcc-c168-8aeb-dd67e605b103@talpey.com>
To: NFSv4 <nfsv4@ietf.org>
In-Reply-To: <fe464eb4-5bcc-c168-8aeb-dd67e605b103@talpey.com>
Message-Id: <BCCEE370-E34F-413E-85BC-0E01717DDBF9@oracle.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/nXxNxRY5vH6M1ithxC6iPqUZ3Mc>
Subject: Re: [nfsv4] Proposed updates to draft-ietf-nfsv4-rpcrdma-cm-pvt-data


> On Dec 19, 2019, at 9:28 PM, Tom Talpey <tom@talpey.com> wrote:
> 
> This text looks good, and of course yields the same result regardless
> of how shared private data may be presented.
> 
> Is it worth mentioning that the identifier could be seen at any offset,
> regardless of alignment, and if offset+8 would exceed the buffer, it
> must similarly be ignored?

That's worth mentioning. I'll add it.
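
For illustration only, the receiver behavior discussed in this thread
could be sketched roughly as below. This is not text from the draft.
The identifier value, the version number, the flag mask, and the layout
of the four octets after the identifier are assumptions to be checked
against Section 4 of the document, and find_cm_private_data is a
hypothetical name. The sketch also assumes that only the first
occurrence of the identifier is examined and that a failed sanity check
is treated the same as an absent identifier.

```python
import struct

# Assumed values; verify each against Section 4 of the draft.
FORMAT_IDENTIFIER = 0xF6AB0E18  # assumed 32-bit Format Identifier word
SUPPORTED_VERSION = 1           # assumed format version number
DEFINED_FLAGS = 0x01            # assumed: only the low-order flag bit is defined
MESSAGE_LEN = 8                 # identifier word plus four one-octet fields


def find_cm_private_data(buf):
    """Return (version, flags, send_size, recv_size), or None.

    None means: behave as if no RPC-over-RDMA version 1 CM Private
    Data was provided.
    """
    magic = struct.pack("!I", FORMAT_IDENTIFIER)

    # The identifier may sit at any octet offset, aligned or not.
    offset = bytes(buf).find(magic)
    if offset < 0:
        return None

    # If offset+8 would exceed the buffer, ignore the identifier.
    if offset + MESSAGE_LEN > len(buf):
        return None

    version, flags, send_size, recv_size = struct.unpack_from(
        "!4B", buf, offset + 4)

    # Sanity checks: recognized version, all reserved flag bits zero.
    if version != SUPPORTED_VERSION:
        return None
    if flags & ~DEFINED_FLAGS:
        return None

    return version, flags, send_size, recv_size
```

Because the private data carries only hints, every failure path simply
returns None and the connection proceeds as if the peer had sent no
RPC-over-RDMA private data.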


> Tom.
> 
> On 12/18/2019 3:56 PM, David Noveck wrote:
>> I'm Ok with this
>> On Wed, Dec 18, 2019, 3:14 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>>     > On Dec 17, 2019, at 11:35 AM, Chuck Lever <chuck.lever@oracle.com> wrote:
>>     >
>>     >
>>     >
>>     >> On Dec 17, 2019, at 9:12 AM, Tom Talpey <tom@talpey.com> wrote:
>>     >>
>>     >> Dave, I don't agree that the document gets this right, but I defer
>>     >> to the WG if it wishes to proceed.
>>     >
>>     > I agree with Dave's desire to keep the text simple and take the
>>     > point of view of the RPC/RDMA consumer. However, the document
>>     > author would like Tom to be happy with the final text.
>>     >
>>     > Tom, you suggested that you might have some alternate text for
>>     > this section. I'd like to see that before the document proceeds.
>>    Based on feedback from Tom, here is take 2:
>>
>>    4.1.2.  Interoperability Amongst RDMA Transports
>>
>>        The Format Identifier field defined in Section 4 is provided to
>>        enable implementations to distinguish RPC-over-RDMA version 1 Private
>>        Data from private data inserted at other layers, such as the private
>>        data inserted by the iWARP MPAv2 enhancement described in [RFC6581].
>>
>>        As part of connection establishment, the received private data buffer
>>        is searched for the Format Identifier word.  If the RPC-over-RDMA
>>        version 1 CM Private Data Format Identifier is not present, an
>>        RPC-over-RDMA version 1 receiver MUST behave as if no RPC-over-RDMA
>>        version 1 CM Private Data has been provided.
>>
>>        Once the RPC-over-RDMA version 1 CM Private Data Format Identifier is
>>        found, the receiver parses the subsequent octets as RPC-over-RDMA
>>        version 1 CM Private Data.  As additional assurance that the private
>>        data content is valid RPC-over-RDMA version 1 CM Private Data, the
>>        receiver should check that the format version number field contains a
>>        valid and recognized version number and all reserved flag bits are
>>        zero.
>>     >> Tom.
>>     >>
>>     >> On 12/16/2019 11:36 PM, David Noveck wrote:
>>     >>> On Mon, Dec 16, 2019, 3:39 PM Tom Talpey <tom@talpey.com> wrote:
>>     >>>   On 12/16/2019 9:52 AM, Chuck Lever wrote:
>>     >>>>
>>     >>>>
>>     >>>>> On Dec 16, 2019, at 9:36 AM, Tom Talpey <tom@talpey.com> wrote:
>>     >>>>>
>>     >>>>> On 12/15/2019 3:48 PM, Chuck Lever wrote:
>>     >>>>>> As a result of expert review, changes are needed to Section 4.1.2
>>     >>>>>> to clarify the purpose and implementation guidance of the Format
>>     >>>>>> Identifier field.
>>     >>>>>> I propose that the new Section read (in its entirety):
>>     >>>>>> 4.1.2.  Amongst Implementations of Other Upper-Layer Protocols
>>     >>>>>
>>     >>>>> That first word is a very odd one in a section title. Would
>>     >>>>> "Interoperability With" be more meaningful?
>>     >>>>
>>     >>>> I will review the titles of the sub-sections here.
>>     >>>>
>>     >>>>
>>     >>>>>>    The Format Identifier field in the message format defined in this
>>     >>>>>>    document is provided to enable implementations to distinguish
>>     >>>>>>    RPC-over-RDMA version 1 Private Data from private data inserted
>>     >>>>>>    at layers below RPC-over-RDMA version 1.  An example of a layer
>>     >>>>>>    below RPC-over-
>>     >>>>>
>>     >>>>> "Below" is problematic here. The RFC6581 MPA enhanced connection
>>     >>>>> processing can insert private data at the start of the field, and
>>     >>>>> it is "below" RPC in the stack, but the peer's RFC6581 processing
>>     >>>>> strips it off. Therefore, if RPC is the "lowest upper layer" in
>>     >>>>> such a stack, there is no issue.
>>     >>>>>
>>     >>>>> However, there might well be other lower layers, with different
>>     >>>>> behaviors, injecting their own private data payloads. Or indeed,
>>     >>>>> upper layers. And these payloads may choose to append, or prepend
>>     >>>>> to the buffer.
>>     >>>>
>>     >>>> OK. Are you requesting a change to this paragraph?
>>     >>>   I think it would be better to avoid introducing a formal layering, so
>>     >>>   yes. Instead of "layers below", it may be best simply to say "other
>>     >>>   layers".
>>     >>>   The next issue makes this clearer:
>>     >>>>>>    RDMA version 1 that makes use of CM Private Data is iWARP, via the
>>     >>>>>>    MPA enhancement described in [RFC6581].
>>     >>>>>>    During connection establishment, an implementation of the extension
>>     >>>>>>    described in this document checks the Format Identifier field before
>>     >>>>>>    decoding subsequent fields.  If the RPC-over-RDMA version 1 CM
>>     >>>>>>    Private Data Format Identifier is not present as the first 4 octets,
>>     >>>>>
>>     >>>>> So, just to be clear - this introduces a new requirement on other
>>     >>>>> layers over the same connection. They MUST NOT inject private data
>>     >>>>> at the beginning of the buffer, or if they do, they MUST strip it
>>     >>>>> off.
>>     >>>>>
>>     >>>>>>    an RPC-over-RDMA version 1 receiver MUST ignore the CM Private Data,
>>     >>>>>>    behaving as if no RPC-over-RDMA version 1 Private Data has been
>>     >>>>>>    provided (see above).
>>     >>>>>
>>     >>>>> And, if the prior requirement is not made, then the RPC layer needs
>>     >>>>> to scan the private data rather carefully to see if the identifier,
>>     >>>>> and the payload associated with it, is present somewhere in the
>>     >>>>> private data.
>>     >>>>
>>     >>>> You might have misread this paragraph. I don't think there's a need
>>     >>>> for any new requirements: the paragraph states if the first word in
>>     >>>> the buffer is not the RPC-over-RDMA Format Identifier, then the
>>     >>>> receiver ignores the CM private data.
>>     >>>   Ok, I didn't misread the paragraph, but I drew a different conclusion.
>>     >>>   If the protocol requires the RPC sender to place its private data
>>     >>>   payload "first", then it is logical to require the receiver to look
>>     >>>   at it only there. Lower layers may do the same (e.g. the MPAv2 ird/ord
>>     >>>   exchange), but of course those layers must strip it off before passing
>>     >>>   the remainder. So at the RPC layer, it is no longer present.
>>     >>>   The draft currently doesn't discuss this type of sharing,
>>     >>> For it to do so is complicated and unnecessary.  The point is that it
>>     >>> is the job of the transport protocol to pass along the cm data without
>>     >>> modifying it.
>>     >>>   however, it
>>     >>>   merely states:
>>     >>>        The first 8 octets of the CM Private Data field is to be formatted as
>>     >>>        follows:
>>     >>>   In reality, the private data field is a property of the underlying
>>     >>>   transport,
>>     >>> The private data field on the wire is, but rpc-over-rdma never sees that.
>>     >>> It does see what it received from the transport implementation via, for
>>     >>> example, the ofed interface. For that data the transport is the
>>     >>> custodian rather than the owner.
>>     >>>   and in the case of iWARP/MPAv2 the first bytes are already
>>     >>>   "taken". So, my concern was that the text was somehow stating that
>>     >>>   the entire private data payload was to be inspected, when in reality
>>     >>>   it's just the part which was *not* otherwise taken by other layers.
>>     >>> If it states that, it should be changed, but I read it as describing what
>>     >>> the rpc-over-rdma protocol would see.
>>     >>>> No other scanning is necessary, it's a fail-safe design since the
>>     >>>> CM private data contains only hints.
>>     >>>   Yes, that's understood. Again, it's just a matter of describing how
>>     >>>   the field is to be found. The CM API may strip off other non-shared
>>     >>>   private data chunks,
>>     >>> If it doesn't, it is seriously broken.
>>     >>>   but what's on the wire will look quite different.
>>     >>> What's on the wire might be encrypted and look /really/ different.
>>     >>> However, we don't and shouldn't see any of that.
>>     >>>   So the word "first" is, I find, problematic.
>>     >>>   I'm short on time right now to craft suggested text for this, but I'll
>>     >>>   take a shot at it hopefully tomorrow.
>>     >>>>>>    Because the Format Identifier field is newer than some other
>>     >>>>>>    potential users of private data (such as iWARP), there is a risk that
>>     >>>>>>    a lower layer might inject its own private data with a payload
>>     >>>>>>    somehow containing the identifier of RPC-over-RDMA version 1.  It is
>>     >>>>>
>>     >>>>> Well, *and* not strip it off. This could happen if the peer implemented
>>     >>>>> a lower layer that wasn't recognized by the receiver, and the data
>>     >>>>> simply passed up. See previous paragraph.
>>     >>>>>
>>     >>>>>>    recommended that RPC-over-RDMA version 1 implementations perform
>>     >>>>>>    additional checks on the content of received CM private data before
>>     >>>>>>    making use of it.
>>     >>>>>
>>     >>>>> "Additional checks" is pretty vague. Are there any specific requirements?
>>     >>>>
>>     >>>> Would "perform sanity checks on the content" be preferable?
>>     >>>   I was asking what the sanity checks might actually be. Is there a
>>     >>>   specific requirement? If not, I don't think the sentence really says
>>     >>>   anything, and should be dropped.
>>     >>>   What I think is being implied is that after matching the identifier,
>>     >>>   somehow the range of values in the following structure needs to be
>>     >>>   checked, and if any of the values fall outside, that the entire payload
>>     >>>   be ignored. Leaving me to ask, what are these limits?
>>     >>>   Tom.
>>     >>>   _______________________________________________
>>     >>>   nfsv4 mailing list
>>     >>> nfsv4@ietf.org
>>     >>> https://www.ietf.org/mailman/listinfo/nfsv4
>>     >>
>>     >
>>     > --
>>     > Chuck Lever
>>     >
>>     >
>>     >
>>    --
>>    Chuck Lever
> 

--
Chuck Lever