Message-Id: <DD7C0279-50CF-443B-B61B-D3DD78EE22C5@oracle.com>
Date: Fri, 7 Dec 2007 12:59:48 -0500
From: Chuck Lever <chuck.lever@...cle.com>
To: David Howells <dhowells@...hat.com>
Cc: Peter Staubach <staubach@...hat.com>,
Trond Myklebust <trond.myklebust@....uio.no>,
nfsv4@...ux-nfs.org, linux-kernel@...r.kernel.org
Subject: Re: How to manage shared persistent local caching (FS-Cache) with NFS?
Hi David-
[ Some history snipped... ]
On Dec 6, 2007, at 3:00 PM, David Howells wrote:
> Chuck Lever <chuck.lever@...cle.com> wrote:
>> Is it a problem because, if there are multiple copies of the same remote
>> file in its cache, then FS-cache doesn't know, upon reconnection, which
>> item to match against a particular remote file?
>
> There are multiple copies of the same remote file that are described by
> the same remote parameters. Same IP address, same port, same NFS version,
> same FSID, same FH. The difference may be a local connection parameter.
Why not encode the local mounted-on directory in the key? A
cryptographic hash of the directory's absolute pathname would be
bounded in size. And the mounted-on directory is usually persistent
across client reboots.
That way you can use the directory name hash to distinguish the
different views of the same remote object.
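To make the suggestion above concrete, here is a rough sketch in Python rather than kernel C. The function name, the tuple layout, and the choice of SHA-256 are illustrative assumptions, not anything FS-Cache or the NFS client actually defines. It only shows that a fixed-size hash of the mounted-on directory is enough to separate two views whose remote parameters are identical.

```python
import hashlib
import os


def view_cache_key(server, port, nfs_version, fsid, filehandle, mounted_on):
    """Build a cache key that distinguishes local views of one remote file.

    Hypothetical helper: the field names and ordering are assumptions made
    for illustration only.
    """
    # Normalize so that "/mnt/a/" and "/mnt/a" produce the same hash.
    path = os.path.normpath(os.path.abspath(mounted_on))
    # SHA-256 gives a fixed-size digest no matter how long the path is,
    # so the key stays bounded in size.
    dir_hash = hashlib.sha256(path.encode("utf-8")).hexdigest()
    remote = (server, port, nfs_version, fsid, filehandle)
    return remote + (dir_hash,)


# Two mounts of the same export with identical remote parameters.
key_a = view_cache_key("192.0.2.10", 2049, 3, "0x1234", "fh-01", "/mnt/projects")
key_b = view_cache_key("192.0.2.10", 2049, 3, "0x1234", "fh-01", "/mnt/projects-ro")
```

The remote part of the two keys is equal; only the trailing directory hash differs, and it comes out the same after a reboot as long as the mount point does not move.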
>> An adequate first pass at FS-cache can be done without guaranteeing
>> persistence.
>
> True. But it's not particularly interesting to me in such a case.
>
>> There are a host of other issues that need exposure -- steady-state
>> performance;
>
> Meaning what?
Meaning your cache is at quota all the time, and to continue
operation it must eject items constantly.
This is a scenario where it pays to cache the read-mostly items on
disk, and leave the frequently changing items in memory.
The economics of disk caches are different from those of memory caches. Disk
caches are much larger and cheaper, but their performance tanks when
they have to track frequently changing files. Memory caches are
smaller, but tracking frequently changing data is only a little more
expensive than tracking data that doesn't change often.
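One way to picture that trade-off is a placement rule driven by how often an item gets invalidated relative to how often it is read. The sketch below is purely illustrative: `ItemStats`, `choose_tier`, and the 10% threshold are assumptions of mine, not an existing FS-Cache policy.

```python
from dataclasses import dataclass


@dataclass
class ItemStats:
    """Hypothetical per-item counters kept by the cache."""
    reads: int = 0
    invalidations: int = 0


def choose_tier(stats, max_change_ratio=0.1):
    """Return "disk" for read-mostly items and "memory" otherwise.

    The default threshold is an arbitrary illustrative value.
    """
    total = stats.reads + stats.invalidations
    if total == 0:
        # No history yet: start in memory, where a wrong guess is cheap.
        return "memory"
    change_ratio = stats.invalidations / total
    return "disk" if change_ratio <= max_change_ratio else "memory"


read_mostly = choose_tier(ItemStats(reads=100, invalidations=2))
churning = choose_tier(ItemStats(reads=10, invalidations=10))
```

With these counters the first item is placed on disk and the second stays in memory. A real policy would also have to weigh item size and how full the disk cache already is.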
> I have been measuring the performance improvement and degradation numbers,
> and I can say that if you've one client and one server, the server has all
> the files in memory, and there's gigabit ethernet between them, an on-disk
> cache really doesn't help.
>
> Basically, the consideration of whether to use a cache is a compromise
> between a host of factors.
>
>> cache garbage collection
>
> Done.
>
>> and reclamation;
>
> Done.
>
>> cache item aliasing;
>
> Partly done.
>
>> whether all files on a mount point should be cached on disk, or some in
>> memory and some on disk;
>
> I've thought about that, but no-one seems particularly interested in
> discussing it.
I think it's key to preventing FS-cache from making performance worse
in many common scenarios.
>> And what would it harm if FS-cache decides that certain items in its
>> cache have become ambiguous or otherwise unusable after a reconnection
>> event, thus it reclaims them instead of re-using them?
>
> It depends.
>
> At some point I'd like to make disconnected operation possible, and that
> means storing data to be written back in the cache. You can't necessarily
> just chuck that away.
Disconnected operation for NFS is fraught with challenges. Access to
data on servers is traditionally gated by the client's IP address,
for example. The client may disconnect from the network, then
reconnect using a different address where suddenly all of its
accesses are rebuffed.
NFS servers, not clients, traditionally determine the file's mtime
and ctime, and its file handle. So file updates and file creation
become problematic. The client has to reconcile the server's file
handle, for files created offline, with its own when reconnecting.
And, for disconnected operation, the cache is required to contain
every item from the remote. You can't just drop items from the cache
because they are inconvenient.
>>> I can't just say: "Well, it'll oops if you configure your NFS shares
>>> like that, so don't. It's not worth me implementing round it.".
>>
>> What causes that instability? Why can't you insulate against the
>> instability but allow cache incoherence and aliased cache items?
>
> Insulate how? The only way to do that is to add something to the cache
> key that says that these two otherwise identical items are actually
> different things.
That something might be the pathname of the mounted-on directory or
of the file itself.
>> I'm arguing that cache coherence isn't supported by the NFS protocol, so
>> how can FS-cache *require* a facility to support persistent local caching
>> that the protocol doesn't have in the first place?
>
> NFS has just enough to just about support a persistent local cache for
> unmodified files. It has unique file keys per server, and it has a
> (limited) amount of coherency data per file. That's not really the problem.
>
> The problem is that the client can create loads of different views of a
> remote export and the kernel treats them as if they're views of different
> remote exports. These views do not necessarily have *anything* to
> distinguish them at all (nosharecache option).
Yes, they do. The combination of mount options and mounted-on
directory (or local pathname to the file) gives you a unique identity
for that view.
> Now, for the case of cached clients, we can enforce a reduction of
> incoherency by requiring one remote inode maps to a single client inode if
> that inode is going to be placed in the persistent cache.
That seems reasonable. Just don't cache the second and greater
instances of the same remote file if FS-cache can't handle local
aliases.
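As a sketch of what "don't cache the second and greater instances" could look like, the toy class below admits only the first local view of a given remote file into the persistent cache and turns away later aliases. The class name, method names, and the use of a plain dictionary are assumptions for illustration; nothing here reflects the actual FS-Cache interfaces.

```python
class PersistentCacheAdmission:
    """Hypothetical gate: at most one local view per remote file is cached."""

    def __init__(self):
        # Maps a remote identity (e.g. server, FSID, FH) to the view that
        # first claimed it.
        self._owner = {}

    def admit(self, remote_id, view_id):
        """Return True if this view may use the persistent cache."""
        # setdefault records the first claimant and leaves later ones alone.
        owner = self._owner.setdefault(remote_id, view_id)
        return owner == view_id

    def release(self, remote_id, view_id):
        """Drop the claim, e.g. when the owning view is unmounted."""
        if self._owner.get(remote_id) == view_id:
            del self._owner[remote_id]


gate = PersistentCacheAdmission()
remote = ("192.0.2.10", "0x1234", "fh-01")
first = gate.admit(remote, "/mnt/a")   # first view claims the item
second = gate.admit(remote, "/mnt/b")  # alias of the same remote file
```

Here the second view is simply left uncached on disk; it can still use the ordinary in-memory page cache.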
>> Invalidating is cheap for in-memory caches. Frequent invalidation is
>> going to be expensive for FS-cache, since it requires some disk I/O (and
>> perhaps even file truncation).
>
> So what? That's one of the compromises you have to make if you want an
> on-disk cache. The invalidation is asynchronous anyway.
So an item is cached in memory until space becomes available in the
disk cache?
>
--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com