Message-ID: <5209A649.90406@redhat.com>
Date: Mon, 12 Aug 2013 22:21:45 -0500
From: Eric Sandeen <sandeen@...hat.com>
To: "Theodore Ts'o" <tytso@....edu>
CC: Dave Chinner <david@...morbit.com>,
Ext4 Developers List <linux-ext4@...r.kernel.org>
Subject: Re: [PATCH 0/5 v2] add extent status tree caching

On 8/12/13 10:10 PM, Dave Chinner wrote:
> On Sat, Aug 03, 2013 at 09:27:40PM -0400, Theodore Ts'o wrote:
>> On Tue, Jul 30, 2013 at 01:08:07PM +1000, Dave Chinner wrote:
>>> But Ted's case is not a "hint" - it's a direct command to fetch the
>>> extent map from disk. You can do that already with FIEMAP, so no new
>>> code or interfaces are needed. fadvise() is not the proper interface
>>> for manipulating filesystem metadata behaviour, and fiemap can
>>> already do what you need. There is no need for any new interfaces
>>> here.
>>
>> I've been looking at the definition of fiemap, and I'm not convinced.
>> To quote from the fiemap.txt:
>>
>> The fiemap ioctl is an efficient method for userspace to get file
>> extent mappings.

Ted -
Changing fiemap.txt is easy, if that's the only problem... :)

>> That's not what is going on here. We are pre-caching them into kernel
>> memory, not in user-space. In addition, we're also setting a flag to
>> keep these extents preferentially in memory compared to other entries
>> in the extent cache.

Reading extents via fiemap almost certainly moves that metadata into
kernel cache, simply by the act of reading the block device to get them.
It doesn't set any caching preference, but how necessary is that, really,
in practice?

>> I agree that posix_fadvise() isn't really a good match, either:
>>
>> "posix_fadvise - predeclare an access pattern for file data"
>>
>> How about this? FIEMAP is an ioctl, anyway. How about if we just
>> declare this as a new fs-independent ioctl, much like FS_IOC_FIEMAP?
>>
>> #define FS_IOC_PRECACHE_EXTENTS _IO('f', 18)
>>
>> This is, of course, assuming that other file systems are interested in
>> implementing this functionality. If not, we can just keep it as
>> EXT4_IOC_PRECACHE_EXTENTS, and just call it a day. (We can always add
>> a definition of FS_IOC_PRECACHE_EXTENTS set to ext4 ioctl's code
>> point, at some later point, if people change their minds.)
>
> We *don't need to add any code* to the kernel to read extents into
> the kernel cache. The FIEMAP interface as it exists today -without
> modification- fulfils your stated requirement.
>
> I do not see any reason for adding a new, duplicated interface that
> we have to maintain and hook up to all the relevant filesystems,
> write test code for and then support forever more. It just makes no
> sense at all.

I see Dave's point that we _do_ have an interface today to read
all file extents into cache. We don't mark them as particularly sticky,
however.
This seems pretty clearly driven by a Google workload need; something you
can probably test. Does FIEMAP do the job for you or not? If not, why not?
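
For what it's worth, a rough sketch of the kind of test I mean: walk every
extent of an open file with the existing FS_IOC_FIEMAP ioctl and throw the
results away, so the only effect is whatever the filesystem reads and caches
along the way. The function name walk_extents() and the BATCH size are made
up for illustration; this is not code from the patch series, and whether the
extents actually stay cached afterwards is exactly the thing to measure.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/fs.h>      /* FS_IOC_FIEMAP */
#include <linux/fiemap.h>  /* struct fiemap, struct fiemap_extent */

#define BATCH 128  /* extents requested per ioctl call; arbitrary choice */

/*
 * Hypothetical helper: walk all extents of fd via FIEMAP and discard them.
 * Returns the number of extents seen, or -1 on error with errno set.
 */
long walk_extents(int fd)
{
    size_t sz = sizeof(struct fiemap) + BATCH * sizeof(struct fiemap_extent);
    struct fiemap *fm = malloc(sz);
    uint64_t start = 0;
    long total = 0;
    int last = 0;

    if (!fm)
        return -1;

    while (!last) {
        memset(fm, 0, sz);
        fm->fm_start = start;                       /* resume after previous batch */
        fm->fm_length = FIEMAP_MAX_OFFSET - start;  /* map through end of file */
        fm->fm_extent_count = BATCH;

        if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0) {
            int saved = errno;
            free(fm);
            errno = saved;
            return -1;
        }
        if (fm->fm_mapped_extents == 0)
            break;  /* empty file, or nothing mapped past 'start' */
        assert(fm->fm_mapped_extents <= BATCH);

        for (uint32_t i = 0; i < fm->fm_mapped_extents; i++) {
            const struct fiemap_extent *e = &fm->fm_extents[i];

            start = e->fe_logical + e->fe_length;
            if (e->fe_flags & FIEMAP_EXTENT_LAST)
                last = 1;
        }
        total += fm->fm_mapped_extents;
    }

    free(fm);
    return total;
}
```

Call that on the file after dropping caches, then time the first reads with
and without it. Note it does nothing to make the cached extents sticky.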
-Eric

> Cheers,
>
> Dave.
>