Date: Wed, 25 Nov 2009 15:48:44 -0500
From: "J. Bruce Fields" <bfields@...ldses.org>
To: tytso@....edu
Cc: Trond Myklebust <trond.myklebust@....uio.no>,
linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: i_version, NFSv4 change attribute
On Mon, Nov 23, 2009 at 01:51:05PM -0500, tytso@....edu wrote:
> Now, all of this having been said, Fedora 11 and 12 have been using
> ext4 as the default filesystem, and for generic desktop usage, people
> haven't been screaming about the increased CPU overhead implied by
> engaging the jbd2 machinery on every sys_write().
>
> However, we have had a report that some enterprise database developers
> have noticed the increased overhead in ext4, and this is on our list
> of things that require some performance tuning. Hence my comments
> about a mount option to adjust s_time_gran for the benefit of database
> workloads, and once we have that mount option, since enabling i_version
> would mean once again needing to update the inode at every single
> write(2) call, we would be back with the same problem.
>
> Maybe we can find a way to be more clever about doing some (but not
> all) of the jbd2 work on each sys_write(), and deferring as much as
> possible to the commit handling. We need to do some investigating to
> see if that's possible. Even if it isn't, though, my gut tells me
> that we will probably be able to enable i_version by default for
> desktop workloads, and tell database server folks that they should
> mount with the mount options "noi_version,time_gran=1s", or some such.
>
> I'd like to do some testing to confirm my intuition first, of course,
> but that's how I'm currently leaning. Does that make sense?
I think so, thanks.
So do I have this todo list approximately right?
1. Use an atomic type instead of a spinlock for i_version, and
   do some before-and-after benchmarking of writes (following your
   suggestions in
   http://marc.info/?l=linux-ext4&m=125900130605891&w=2)
2. Turn on i_version by default. (At this point it shouldn't be
   making things any worse than the high-resolution timestamps
   are.)
3. Find someone to run database benchmarks, and work on
   noi_version,time_gran=1s (or whatever) options for their case.
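
For what item 1 might look like, here is a rough userspace sketch in C11,
not kernel code. The struct and function names (fake_inode,
inc_version_locked, inc_version_atomic) are made up for illustration, and
an atomic_flag stands in for the per-inode spinlock. It only contrasts the
two update styles; it says nothing about how either performs under jbd2.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the inode fields involved. */
struct fake_inode {
    atomic_flag lock;                 /* stand-in for a per-inode spinlock */
    uint64_t version_locked;          /* counter protected by the lock */
    _Atomic uint64_t version_atomic;  /* counter updated without a lock */
};

static void fake_inode_init(struct fake_inode *inode)
{
    atomic_flag_clear(&inode->lock);
    inode->version_locked = 0;
    atomic_init(&inode->version_atomic, 0);
}

/* Spinlock style: take the lock, bump the counter, release the lock. */
static uint64_t inc_version_locked(struct fake_inode *inode)
{
    while (atomic_flag_test_and_set_explicit(&inode->lock,
                                             memory_order_acquire))
        ; /* spin until the lock is free */
    uint64_t v = ++inode->version_locked;
    atomic_flag_clear_explicit(&inode->lock, memory_order_release);
    return v;
}

/* Atomic style: a single read-modify-write, no lock to contend on. */
static uint64_t inc_version_atomic(struct fake_inode *inode)
{
    /* fetch_add returns the old value, so add 1 to get the new one. */
    return atomic_fetch_add_explicit(&inode->version_atomic, 1,
                                     memory_order_relaxed) + 1;
}
```

Both functions return the new counter value. A before-and-after benchmark
along the lines of item 1 would call one or the other from several threads
doing writes to the same file and compare throughput.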
I wish I could volunteer at least for #1, but embarrassingly don't have
much more than dual-core machines lying around right now to test with.
--b.