Re: [nfsv4] Fwd: New Version Notification for draft-dnoveck-nfsv4-rpcrdma-rtrext-02.txt

Christoph Hellwig <hch@lst.de> Tue, 06 June 2017 15:42 UTC

Date: Tue, 06 Jun 2017 17:42:21 +0200
From: Christoph Hellwig <hch@lst.de>
To: David Noveck <davenoveck@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>, "Black, David" <David.Black@dell.com>, "nfsv4@ietf.org" <nfsv4@ietf.org>
Message-ID: <20170606154221.GA13918@lst.de>
References: <149667468294.3266.7785272769313517872.idtracker@ietfa.amsl.com> <CADaq8jcf1zM5LGJngjT5q-FKxqVGJa-U_yyCdG78NDETp7-k3w@mail.gmail.com> <20170606064252.GA14844@lst.de> <CADaq8jckXDOXp9p3266OznMCSu9=VWV7FX7wJp=V54P8ONTOUQ@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <CADaq8jckXDOXp9p3266OznMCSu9=VWV7FX7wJp=V54P8ONTOUQ@mail.gmail.com>
User-Agent: Mutt/1.5.17 (2007-11-01)
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/giXv9Z_UiMKtG0Q9pc3VtF0RNfc>
Subject: Re: [nfsv4] Fwd: New Version Notification for draft-dnoveck-nfsv4-rpcrdma-rtrext-02.txt

On Tue, Jun 06, 2017 at 10:32:50AM -0400, David Noveck wrote:
> I'm clear that copies will kill performance, but I think you are wrong
> about this requiring copies.  The idea here is that the client will post
> as read buffers, aligned cache buffers not recently used.

What cache buffer?  The destination of a READ I/O might be any piece
of memory in the system, including user memory.  And even for the page
cache most modern OSes need to allocate the page first before doing
I/O as it's the synchronization object.

Your idea might work in the context of a system that only uses buffered
I/O, uses a fixed buffer cache like a Unix system designed in the 70s,
and lets you heavily hack^H^H^H^Hoptimize the core code for your
super specialized use case.  It ain't gonna happen in practice.

> assigned to receiving block-aligned data.   This is basically the same as
> the case in which the client picks less-recently used buffer to use when
> reading from disk.

It's nothing at all like a page cache in the sense of how most OSes use
it - we usually don't directly recycle 'buffers' but pages from the
general system allocator.  And they are page-sized (usually 4k or 64k)
and might not fit your I/O size at all.
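
To put rough numbers on the size mismatch, here is a minimal sketch in
Python.  It is not taken from any kernel; the function name
pages_spanned and the example offsets and lengths are made up for
illustration.  It only counts how many pages of a given size a byte
range touches, which shows that the same I/O maps onto a different
number of pages depending on the page size and the alignment.

```python
def pages_spanned(offset, length, page_size):
    """Count the page_size pages touched by the byte range
    [offset, offset + length).  Hypothetical helper, for illustration."""
    if length <= 0:
        return 0
    first_page = offset // page_size
    last_page = (offset + length - 1) // page_size
    return last_page - first_page + 1


if __name__ == "__main__":
    # The same unaligned 6000-byte read, on 4k and on 64k pages.
    for page_size in (4096, 65536):
        print(page_size, pages_spanned(1000, 6000, page_size))
```

A fixed-size buffer posted in advance cannot know which of these cases
the eventual I/O will be.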

If you want send-based data placement to work you need tagged read
buffers so that the HCA uses exactly the read buffer you want, something
that is not supported by the current Verbs API, and not by any HCA
I know of.
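
The problem can be sketched with a toy model.  This is plain Python, not
the Verbs API: ToyReceiveQueue, post_recv, deliver_send and needs_copy
are hypothetical names, and the model assumes a plain (non-shared)
receive queue where an incoming send consumes the oldest posted buffer.
Under that assumption the sender has no way to name the buffer it wants,
so data that arrives in a different order than the buffers were posted
lands in the wrong place and has to be copied.

```python
from collections import deque


class ToyReceiveQueue:
    """Toy model of an untagged receive queue (assumption: buffers are
    consumed strictly in the order they were posted)."""

    def __init__(self):
        self._posted = deque()

    def post_recv(self, buf_id):
        # The receiver posts a buffer; it cannot attach a tag that the
        # sender could later use to select it.
        self._posted.append(buf_id)

    def deliver_send(self, payload):
        # An incoming send lands in the oldest posted buffer.
        if not self._posted:
            raise RuntimeError("no receive buffer posted")
        return self._posted.popleft(), payload


def needs_copy(landed_in, wanted):
    """True if the data ended up somewhere other than its destination."""
    return landed_in != wanted


if __name__ == "__main__":
    rq = ToyReceiveQueue()
    rq.post_recv("A")  # intended for the first READ
    rq.post_recv("B")  # intended for the second READ

    # The reply to the second READ happens to arrive first.
    landed_in, _ = rq.deliver_send(b"data for B")
    print(landed_in, needs_copy(landed_in, "B"))
```

With tagged buffers, deliver_send would take the tag and pick the
matching buffer instead of the head of the queue; that lookup is the
part the toy model (like the untagged receive queue it imitates) has no
way to express.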