Re: [storm] Send vs. Immediate

Tom Talpey <tom@talpey.com> Tue, 08 September 2015 14:08 UTC

On 9/8/2015 1:55 AM, Elena Gurevich wrote:
> Thank you again,
>
> If so, the statement "Not applicable" in the cell "Ordering at Remote Peer" for an RDMA Write followed by a Send operation in Appendix B of RFC5040 is inaccurate
> and should be changed to "Send is completed after RDMA Write is placed and delivered".

I disagree. It is accurate, because the table defines the behavior
as visible from the *local* peer, i.e. the initiator. You will note
that the only messages with anything but "not applicable" are the
RDMA Read responses.

Note the last words of the first sentence:

    Appendix B.  Ordering and Completion Table

    The following table summarizes the ordering relationships that are
    defined in Section 5.5, "Ordering and Completions", from the
    standpoint of the local peer issuing the two Operations.

The state of the *remote* peer depends upon it taking completions,
which is not something the local peer can determine without an
upper layer exchange.
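
As a rough illustration of the delivery rule quoted further down from
RFC 5041 Section 5.4, here is a minimal Python sketch. The Message and
DataSink names are invented for the example and are not part of any RFC
or iWARP implementation. It models only what happens at the remote peer
(the Data Sink), none of which the local peer can observe directly.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    """One DDP Message as tracked by a hypothetical Data Sink."""
    name: str
    total_segments: int
    placed: set = field(default_factory=set)
    delivered: bool = False

    def fully_placed(self) -> bool:
        # All segments placed, including the one carrying the Last flag.
        return len(self.placed) == self.total_segments


class DataSink:
    """Toy model of in-order Delivery; an assumption, not a real stack."""

    def __init__(self, messages):
        # Messages are listed in the order the Data Source issued them.
        self.messages = list(messages)
        self.delivery_log = []

    def place(self, msg_index: int, segment: int) -> None:
        # Placement may happen in any order, as segments arrive.
        self.messages[msg_index].placed.add(segment)
        self._deliver_ready()

    def _deliver_ready(self) -> None:
        # Deliver strictly in order: stop at the first message that is
        # not yet fully placed, so later messages wait behind it.
        for msg in self.messages:
            if msg.delivered:
                continue
            if not msg.fully_placed():
                break
            msg.delivered = True
            self.delivery_log.append(msg.name)


# An RDMA Write of three segments followed by a one-segment Send.
sink = DataSink([Message("rdma_write", 3), Message("send", 1)])
sink.place(1, 0)           # the Send arrives first (misordered)
sink.place(0, 0)
sink.place(0, 2)
print(sink.delivery_log)   # [] : the Send is placed but not delivered
sink.place(0, 1)           # last missing RDMA Write segment
print(sink.delivery_log)   # ['rdma_write', 'send']
```

Nothing in the sketch reports back to the sender. That is the point
above: learning that the remote peer has taken the completion needs an
upper layer exchange.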

Tom.

>
> Best regards,
> Elena
>
> -----Original Message-----
> From: Tom Talpey [mailto:tom@talpey.com]
> Sent: Monday, September 07, 2015 6:25 PM
> To: Elena Gurevich; storm@ietf.org
> Subject: Re: [storm] Send vs. Immediate
>
>   > So why do we need the Immediate Data operation, and why can't we just reuse the Send operation?
>
> Using either one is a choice by the upper layer (RDMA consumer), and it's absolutely possible to just reuse a Send. As mentioned, many upper layers simply use Send. Others, for various reasons, use Immediate data. DDP and the extended RDMAP (RFC7306) support either one.
>
>
> On 9/7/2015 5:48 AM, Elena Gurevich wrote:
>> Hello Tom, thanks for the prompt response. My question was a little bit different.
>>
>> I failed to find any difference in behavior between an RDMA Write followed by Immediate Data and an RDMA Write followed by a Send.
>> Both consume an untagged buffer from queue #0, so at the Data Sink this buffer should be preposted in both cases.
>>
>> According to the RFCs, apart from the DDP message format, both operations have to be processed in the same way at both the Data Source and the Data Sink.
>>
>> According to the ordering and completion rules specified in RFC 5040 and RFC 7306:
>>
>> First     | Second    | Placement       | Placement      | Ordering
>> Operation | Operation | Guarantee at    | Guarantee at   | Guarantee at
>>           |           | Remote Peer     | Local Peer     | Remote Peer
>> ----------+-----------+-----------------+----------------+----------------
>> RDMA      | Send      | No placement    | Not applicable | Not applicable
>> Write     |           | guarantee. If   |                |
>>           |           | guarantee is    |                |
>>           |           | necessary, see  |                |
>>           |           | footnote 1.     |                |
>> ----------+-----------+-----------------+----------------+----------------
>> RDMA      | Immediate | No Placement    | Not            | Immediate Data
>> Write     | Data      | Guarantee       | Applicable     | is Completed
>>           |           |                 |                | after RDMA
>>           |           |                 |                | Write is Placed
>>           |           |                 |                | and Delivered
>>
>> Even though the cell "Ordering Guarantee at Remote Peer" differs between an RDMA Write followed by Immediate Data and an RDMA Write followed by a Send,
>> in the end the behavior has to be the same, because of what RFC5041 Section 5.4
>> specifies (see the details in my original e-mail).
>>
>> So why do we need the Immediate Data operation, and why can't we just reuse the Send operation?
>>
>> Best regards,
>> Lena
>>
>> -----Original Message-----
>> From: storm [mailto:storm-bounces@ietf.org] On Behalf Of Tom Talpey
>> Sent: Sunday, September 06, 2015 4:34 PM
>> To: storm@ietf.org
>> Subject: Re: [storm] Send vs. Immediate
>>
>> On 9/6/2015 5:58 AM, Elena Gurevich wrote:
>>> Hello dear authors,
>>>
>>> I am new to iWARP and need your help to clear up my misunderstanding of the RFCs.
>>>
>>> RFC5041 Section 5.4 states that:
>>>
>>> "At the Data Sink, DDP MUST Deliver a DDP Message if and only if all
>>> of the following are true:
>>>
>>> * the last DDP Segment of the DDP Message had its Last flag set,
>>>
>>> * all of the DDP Segments of the DDP Message have been Placed,
>>>
>>> * all preceding DDP Messages have been Placed, and
>>>
>>> * each preceding DDP Message has been Delivered to the ULP."
>>>
>>> Let's assume that the data sink receives a sequence of "rdma_write" and "send"
>>> messages.
>>>
>>> According to the RFC, even if misordering happens, the "send" can be
>>> delivered to RDMAP only after the last segment of the rdma_write arrives and is placed.
>>>
>>> So the RDMAP layer completes the "send" only after the full rdma_write has been assembled.
>>
>> These statements are correct: the Send is delivered only after Placing all the previous RDMA Write segments. This is a key behavior upon which most upper layers depend.
>>
>>>
>>> If so, why do we need to create a new "Immediate request type", and why can't we
>>> reuse an 8-byte "send"?
>>
>> I don't understand the question. An Immediate is a different behavior,
>> and it is used by upper layers rather differently. Also, what
>> "8 bytes" are you referring to?
>>
>> Perhaps you mean, why doesn't the upper layer use an RDMA Write with Immediate? Some upper layers do (I believe Lustre does, for example), but the original iWARP RDMAP (RFC5040) protocol did not support that operation. The RFC7306 extension, published last year, adds it.
>>
>> Other upper layers, for example, iSER, NFS/RDMA, SMB Direct, etc, need a larger, separate message to complete an operation, and so use a full Send. These protocols use iWARP, without Immediates.
>>