[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <1290515274-sup-3895@think>
Date: Tue, 23 Nov 2010 07:34:07 -0500
From: Chris Mason <chris.mason@...cle.com>
To: Nick Piggin <npiggin@...nel.dk>
Cc: linux-fsdevel <linux-fsdevel@...r.kernel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Al Viro <viro@...iv.linux.org.uk>,
linux-ext4 <linux-ext4@...r.kernel.org>,
linux-btrfs <linux-btrfs@...r.kernel.org>,
Jan Kara <jack@...e.cz>, Eric Sandeen <sandeen@...hat.com>,
"Theodore Ts'o" <tytso@....edu>
Subject: Re: [patch] fs: fix deadlocks in writeback_if_idle
Excerpts from Nick Piggin's message of 2010-11-23 05:02:39 -0500:
[ ... ]
>
> Avoid both these issues by issuing completely asynchronous writeback request in
> writeback_inodes_sb_if_idle. Don't let that fool you into thinking these
> functions don't suck any more.
>
> ext4 now passes extensive stress testing with xfstests, fs_mark, dbench,
> with a writeback_inodes_if_idle call added directly into ext4_da_write_begin
> to trigger the path frequently. Previously it would spew lockdep stuff and
> hang in a number of ways very quickly.
>
> Signed-off-by: Nick Piggin <npiggin@...nel.dk>
>
> ---
> fs/fs-writeback.c | 32 ++++++++++++++++++++------------
> 1 file changed, 20 insertions(+), 12 deletions(-)
>
> Index: linux-2.6/fs/fs-writeback.c
> ===================================================================
> --- linux-2.6.orig/fs/fs-writeback.c 2010-11-23 20:57:23.000000000 +1100
> +++ linux-2.6/fs/fs-writeback.c 2010-11-23 20:59:10.000000000 +1100
> @@ -1152,16 +1152,17 @@ EXPORT_SYMBOL(writeback_inodes_sb);
> *
> * Invoke writeback_inodes_sb if no writeback is currently underway.
> * Returns 1 if writeback was started, 0 if not.
> + *
> + * Even if 1 is returned, writeback may not be started if memory allocation
> + * fails. This function makes no guarantees about anything.
> */
> int writeback_inodes_sb_if_idle(struct super_block *sb)
> {
> if (!writeback_in_progress(sb->s_bdi)) {
> - down_read(&sb->s_umount);
> - writeback_inodes_sb(sb);
> - up_read(&sb->s_umount);
> + bdi_start_writeback(sb->s_bdi, get_nr_dirty_pages());
I'll put on my skis and channel Christoph for a minute. This isn't
quite the same as the original. writeback_inodes_sb() writes inodes on a
specific super block, while bdi_start_writeback() writes them for any SB
on that bdi.
For btrfs there's only one bdi per SB, but for most everyone else a disk
with a bunch of partitions is going to have multiple filesystems on the
same bdi.
My original btrfs patch just exported the bdi_ funcs so that btrfs could
do the above internally. But Christoph objected, and I think he's
right. We should either give everyone a bdi or make sure the writeback
func kicks only one filesystem.
-chris
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists