Re: [nfsv4] Fwd: New Version Notification for draft-dnoveck-nfsv4-rpcrdma-rtissues-01.txt

David Noveck <davenoveck@gmail.com> Sat, 17 September 2016 10:15 UTC

From: David Noveck <davenoveck@gmail.com>
Date: Sat, 17 Sep 2016 06:15:00 -0400
Message-ID: <CADaq8jf7DHRptJKMVGacH03-uwGBuyg5pxaGs5V6kHe7oZyYGA@mail.gmail.com>
To: karen deitke <karen.deitke@oracle.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/JqG0I7hzfBJCFx5On___-lkQjjo>
Cc: "nfsv4@ietf.org" <nfsv4@ietf.org>
Subject: Re: [nfsv4] Fwd: New Version Notification for draft-dnoveck-nfsv4-rpcrdma-rtissues-01.txt

> where with the RDMA_WRITE we do not need to wait? (i.e.
> sections 2.2 and 2.3)

Actually, that is why the RDMA WRITE does not involve an internode round
trip which contributes to latency.

I do distinguish in this document between round trips which do and do not
contribute to latency.  For example, I mention certain acks which are part
of round trips that nobody has to wait for.
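
As a rough illustration of that distinction, the sketch below tallies the
one-way transmissions in two flows and counts only those that somebody has
to wait for.  The Leg class, the flow lists, and the function name are
hypothetical names chosen for this example; the flows follow the
description in the draft text quoted later in this thread, and the
treatment of the RDMA WRITE as not being waited on separately is an
assumption, not something measured on the wire.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Leg:
    """One one-way internode transmission (hypothetical model)."""
    name: str
    waited_on: bool  # True if some party must wait for it before proceeding


# Request whose bulk data the responder fetches with RDMA READ.
RDMA_READ_FLOW = [
    Leg("SEND of request (requester -> responder)", True),
    Leg("ack of request SEND", False),
    Leg("RDMA READ request (responder -> requester)", True),
    Leg("RDMA READ response data (requester -> responder)", True),
    Leg("SEND of response (responder -> requester)", True),
    Leg("ack of response SEND", False),
]

# Request whose bulk data is returned with RDMA WRITE, with the response
# SEND following immediately.  Assumption: the RDMA WRITE travels in the
# same direction as the SEND and just ahead of it, so it adds no one-way
# trip of its own that anyone waits for.
RDMA_WRITE_FLOW = [
    Leg("SEND of request (requester -> responder)", True),
    Leg("ack of request SEND", False),
    Leg("RDMA WRITE data (responder -> requester)", False),
    Leg("SEND of response (responder -> requester)", True),
    Leg("ack of response SEND", False),
]


def round_trips_on_critical_path(legs):
    """Count waited-on one-way legs and express them as round trips."""
    return sum(leg.waited_on for leg in legs) / 2
```

Under these assumptions the RDMA READ flow has three round trips on the
wire, of which two count toward latency, and the RDMA WRITE flow has one
that counts.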

I believe that there is no ack for an RDMA WRITE, but I've never actually
looked on the wire to confirm that.  I imagine that there might be certain
rare cases in which some sort of separate ack is sent.  For example, if you
did multiple RDMA WRITEs in sequence without a SEND, or there were a long
delay without the response being sent, a separate ACK of the RDMA WRITE
might be sent.

However, in the common case, in which a SEND immediately follows the RDMA
WRITE, there is no need for a separate ACK, and I believe the ACK of the
SEND suffices to assure the responder RNIC that both previous operations
have completed successfully.
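
A toy model of that behavior, assuming acknowledgments are cumulative (an
ack for a later operation also covers every earlier one still outstanding
on the same connection).  SendQueue, post, and on_ack are hypothetical
names for this sketch, not any verbs API, and sequence numbers are tracked
per operation here rather than per packet as a real RNIC would.

```python
class SendQueue:
    """Hypothetical sender-side bookkeeping for one reliable connection."""

    def __init__(self):
        self._next_seq = 0
        self._pending = {}  # sequence number -> operation name

    def post(self, op):
        """Record an outstanding operation and return its sequence number."""
        seq = self._next_seq
        self._pending[seq] = op
        self._next_seq += 1
        return seq

    def on_ack(self, acked_seq):
        """Complete every outstanding operation up to and including acked_seq.

        Assumption: acks are cumulative, so one ack for the SEND is enough
        to complete an RDMA WRITE posted before it.
        """
        return [
            self._pending.pop(seq)
            for seq in sorted(self._pending)
            if seq <= acked_seq
        ]
```

For example, posting "RDMA WRITE" and then "SEND" and calling on_ack with
the SEND's sequence number completes both operations in order; a second
call with the same number completes nothing.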

On Thu, Sep 15, 2016 at 2:58 PM, karen deitke <karen.deitke@oracle.com>
wrote:

> Thanks,
>
> That clears things up.  Next question, are you saying that an RDMA_WRITE
> is NOT an internode round trip and an RDMA_READ is an internode round trip,
> because we have to wait for the data from the RDMA_READ before proceeding,
> where with the RDMA_WRITE we do not need to wait? (i.e. sections 2.2 and
> 2.3)
>
> Karen
>
>
> On 9/12/2016 2:24 PM, David Noveck wrote:
>
> > I'm confused by this summarization.
>
> :-(.  Let's see what I can do to make this clearer.
>
> > In the text above you indicate 3 different places where "internode round
> trip" is involved, yet in the summary you only mention 2.
>
> The point I was trying to make was that, although there were three
> round trips, only two contribute to the request latency.  In some
> cases, there is a round trip because an ack is sent, but it does not add
> to latency because neither the client nor the server is waiting for it.
>
> > What is the definition of an "internode round trip?"
>
> Any situation in which a message is sent in one direction and, after that,
> another message is sent in the opposite direction.
>
> > Also it's unclear to me what you mean by "in the context of a connected
> operation".
>
> Maybe I should have said, "Because this is reliable connected operation in
> which messages are acked."
>
> > Also you mention that there are two-responder-side interrupt latencies,
> are you referring to the notification of the RDMA_READ
> > and the send completion queue for sending the response?
>
> I'm referring to the notification that the request has been received and
> the notification that the RDMA_READ has completed.
>
> > Does this interrupt latency come into play in the latency of the
> operation?
>
> I think the two I mentioned do.
>
> > Once the client side gets the response it can continue, even if the
> server thread is still waiting for notification of a successful send
> correct?
>
> Yes.
>
> > Also are you missing the interrupt latency of the send on the client? In
> addition to the interrupt latency of receiving the reply?
>
> I don't think that contributes to latency.  The request processing can
> continue once the request is received on the server, even if the client
> has not received notification of the completion of the send.
>
> On Mon, Sep 12, 2016 at 3:52 PM, karen deitke <karen.deitke@oracle.com>
> wrote:
>
>> Hi Dave,
>>
>> I'm struggling to follow the text below:
>>
>>    o  First, the memory to be accessed remotely is registered.  This is
>>       a local operation.
>>
>>    o  Once the registration has been done, the initial send of the
>>       request can proceed.  Since this is in the context of connected
>>       operation, there is an internode round trip involved.  However,
>>       the next step can proceed after the initial transmission is
>>       received by the responder.  As a result, only the responder-bound
>>       side of the transmission contributes to overall operation latency.
>>
>>    o  The responder, after being notified of the receipt of the request,
>>       uses RDMA READ to fetch the bulk data.  This involves an internode
>>       round-trip latency.  After the fetch of the data, the responder
>>       needs to be notified of the completion of the explicit RDMA
>>       operation
>>
>>    o  The responder (after performing the requested operation) sends the
>>       response.  Again, as this is in the context of connected
>>       operation, there is an internode round trip involved.  However,
>>       the next step can proceed after the initial transmission is
>>       received by the requester.
>>
>>    o  The memory registered before the request was issued needs to be
>>       deregistered, before the request is considered complete and the
>>       sending process restarted.  When remote invalidation is not
>>       available, the requester, after being notified of the receipt of
>>       the response, performs a local operation to deregister the memory
>>       in question.  Alternatively, the responder will use Send With
>>       Invalidate and the responder's RNIC will effect the deregistration
>>       before notifying the requester of the response which has been
>>       received.
>>
>>    To summarize, if we exclude the actual server execution of the
>>    request, the latency consists of two internode round-trip latencies
>>    plus two-responder-side interrupt latencies plus one requester-side
>>    interrupt latency plus any necessary registration/de-registration
>>    overhead.  This is in contrast to a request not using explicit RDMA
>>    operations in which there is a single inter-node round-trip latency
>>    and one interrupt latency on the requester and the responder.
>>
>> I'm confused by this summarization.  In the text above you indicate 3
>> different places where "internode round trip" is involved, yet in the
>> summary you only mention 2.  What is the definition of an "internode round
>> trip?"  Also it's unclear to me what you mean by "in the context of a
>> connected operation".
>>
>> Also you mention that there are two-responder-side interrupt latencies,
>> are you referring to the notification of the RDMA_READ and the send
>> completion queue for sending the response?  Does this interrupt latency
>> come into play in the latency of the operation? Once the client side gets
>> the response it can continue, even if the server thread is still waiting
>> for notification of a successful send correct?
>>
>> Also are you missing the interrupt latency of the send on the client? In
>> addition to the interrupt latency of receiving the reply?
>>
>> Karen