Re: [nfsv4] more on server cache algorithm

"Talpey, Thomas" <Thomas.Talpey@netapp.com> Tue, 18 November 2003 20:43 UTC

Subject: Re: [nfsv4] more on server cache algorithm
Message-ID: <5.2.1.1.2.20031118152753.02078ad8@silver.nane.netapp.com>
From: "Talpey, Thomas" <Thomas.Talpey@netapp.com>
To: nfsv4@ietf.org
Date: Tue, 18 Nov 2003 12:41:08 -0800

At 02:28 PM 11/18/2003, rick@snowhite.cis.uoguelph.ca wrote:
>It might be possible to create a scale that indicates
>"how dangerous" redoing the various non-idempotent Ops are and calculate
>a "total risk" (based on how many and which non-idempotent Ops are in the
>RPC). Then use that vs "size of reply" to decide whether or not to save it.
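
A rough sketch of the "total risk" versus "size of reply" test described above, in Python. The weights in RISK, the bytes_per_risk_unit knob, and the function names are all made up for illustration; nothing here comes from a spec or from any existing server.

```python
# Hypothetical risk weights for non-idempotent ops. The op names are NFSv4
# operations, but the numbers are invented purely for illustration.
RISK = {"REMOVE": 5, "RENAME": 5, "CREATE": 4, "LINK": 3}


def total_risk(ops):
    """Sum the risk of every non-idempotent op in one compound RPC."""
    return sum(RISK.get(op, 0) for op in ops)


def should_cache(ops, reply_bytes, bytes_per_risk_unit=512):
    """Save the reply only if its risk 'pays for' its size.

    bytes_per_risk_unit is an assumed tuning knob: how many bytes of
    cache space one unit of risk is considered to be worth.
    """
    risk = total_risk(ops)
    if risk == 0:
        return False  # purely idempotent compound, safe to redo
    return reply_bytes <= risk * bytes_per_risk_unit
```

With these assumed weights, a small REMOVE reply is saved, while a large reply carrying only a LINK is not.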

While this may be doable, I think it's important to ask if it's useful. The
global-LRU behavior of server caches really adds very little correctness
to the protocol (apologies to Chet Juszczak), because the "window of
correctness" is actually quite small on a busy server. It is even smaller when
you consider that a disconnected client probably takes a while to get a
chance to reconnect. And it doesn't cover the case of server failure, where
the cache is usually lost altogether.

All this is addressed as the core of the sessions proposal that Spencer and
I published in May (draft-talpey-nfsv4-rdma-sess-00.txt). The session id is
assigned to a client persistently, until the client revokes it. And the number
of outstanding requests is bounded, so the server can accurately cache the
replies for the session. I envision the server caching them "forever", at least
until some cron job comes and cleans them out every week or so.
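
To illustrate why a bound on outstanding requests makes an exact reply cache possible, here is a minimal Python sketch. The slot index, the per-slot sequence number, and the class and method names are assumptions chosen for the example, not details taken from the draft. Persistence across server restart is not shown.

```python
class SessionReplyCache:
    """Per-session reply cache whose size is fixed by the request bound."""

    def __init__(self, max_requests):
        # One entry per permitted outstanding request: (seq, reply) or None.
        self.slots = [None] * max_requests

    def handle(self, slot, seq, execute):
        """Run execute() once per (slot, seq); replay the reply on a retry."""
        if not 0 <= slot < len(self.slots):
            raise ValueError("slot exceeds the session's request bound")
        cached = self.slots[slot]
        if cached is not None and cached[0] == seq:
            return cached[1]  # retransmission: replay, do not re-execute
        reply = execute()
        # A new request on this slot retires the previous reply, so memory
        # use never grows beyond max_requests cached replies.
        self.slots[slot] = (seq, reply)
        return reply
```

Because the table never grows past max_requests entries, nothing is evicted under load, which is the property a global LRU cache cannot offer.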

This kind of guarantee is what's needed to nail the fundamental
requirement here: correctness in the face of client failure, server failure, and
network partitioning. But the framework you describe is a good one as an
enabler for the cache implementation.

Tom.


>(Trouble is, as soon as you decide to not save some non-idempotent replies,
> the cache almost doesn't seem worth the bother. In fact, the V2,3 BSD
> server that has been out there for several years doing NFS over TCP, never
> cached anything for TCP.)
>
>One thing that might get added to the algorithm when I code it is some sort of
>high water mark on "reply bytes saved" that would allow the amount of storage
>used to be tuned, at the expense of a less effective cache.
>
>In general, I see a lot of places where an NFSv4 server can get very memory
>hungry and I am anxious to see how mine works under heavy load. (This summer,
>I am hoping to rig up one of my teaching labs (40-60 machines) with a v4 client
>so I can load test my server. My challenge at the moment is finding a mature
>client I can use for this. If I succeed, others would be welcome to come
>and do testing on it, as well, if they are interested.)
>
>Could be fun, rick
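
The high water mark on "reply bytes saved" mentioned above could look something like the following Python sketch. It assumes oldest-first eviction and an opaque request key; the class and its names are hypothetical and not taken from any existing server.

```python
from collections import OrderedDict


class BoundedReplyCache:
    """Reply cache that never holds more than high_water_bytes of replies."""

    def __init__(self, high_water_bytes):
        self.high_water_bytes = high_water_bytes
        self.bytes_saved = 0
        self.replies = OrderedDict()  # key -> reply bytes, oldest first

    def save(self, key, reply):
        """Store a reply, evicting the oldest entries to stay under the mark."""
        if key in self.replies:
            self.bytes_saved -= len(self.replies.pop(key))
        if len(reply) > self.high_water_bytes:
            return False  # a single oversized reply is never cached
        self.replies[key] = reply
        self.bytes_saved += len(reply)
        while self.bytes_saved > self.high_water_bytes:
            _, old = self.replies.popitem(last=False)
            self.bytes_saved -= len(old)
        return True

    def lookup(self, key):
        """Return the saved reply, or None if it was never saved or evicted."""
        return self.replies.get(key)
```

Lowering high_water_bytes trades memory for a shorter window in which a retransmitted request can still be answered from the cache.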