Message-ID: <20090923012700.GA10464@localhost>
Date: Wed, 23 Sep 2009 09:27:00 +0800
From: Wu Fengguang <fengguang.wu@...el.com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Chris Mason <chris.mason@...cle.com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
"Li, Shaohua" <shaohua.li@...el.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"richard@....demon.co.uk" <richard@....demon.co.uk>,
"jens.axboe@...cle.com" <jens.axboe@...cle.com>
Subject: Re: regression in page writeback
On Wed, Sep 23, 2009 at 09:17:58AM +0800, Wu Fengguang wrote:
> On Wed, Sep 23, 2009 at 08:54:52AM +0800, Andrew Morton wrote:
> > On Wed, 23 Sep 2009 08:22:20 +0800 Wu Fengguang <fengguang.wu@...el.com> wrote:
> >
> > > Jens' per-bdi writeback has another improvement. In 2.6.31, when
> > > superblocks A and B both have 100000 dirty pages, it will first
> > > exhaust A's 100000 dirty pages before going on to sync B's.
> >
> > That would only be true if someone broke 2.6.31. Did they?
> >
> > SYSCALL_DEFINE0(sync)
> > {
> > 	wakeup_pdflush(0);
> > 	sync_filesystems(0);
> > 	sync_filesystems(1);
> > 	if (unlikely(laptop_mode))
> > 		laptop_sync_completion();
> > 	return 0;
> > }
> >
> > the sync_filesystems(0) is supposed to non-blockingly start IO against
> > all devices. It used to do that correctly. But people mucked with it
> > so perhaps it no longer does.
>
> I'm referring to writeback_inodes(). Each invocation of it (to sync
> 4MB) does the same iteration over superblocks A => B => C ..., so if
> A has dirty pages, it will always be served first.
>
> So if wbc->bdi == NULL (which is true for kupdate/background sync), it
> will have to first exhaust A before going on to B and C.
>
> There is no "cursor" in the superblock level iteration.

I even have an old patch for it, but Jens' patches are a more general solution.
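
To make the starvation concrete, here is a minimal userspace sketch, not
kernel code. The toy_* names, NR_SB and CHUNK are made up for illustration,
and the model only tracks a dirty page count per superblock; it ignores
inodes, the s_io queues and locking. It compares a walk that always restarts
at A with one that continues from a saved index.

```c
#include <assert.h>

#define NR_SB 2       /* hypothetical: two superblocks, A and B */
#define CHUNK 1024    /* pages per invocation: 4MB with 4KB pages */

/* Toy stand-in for the relevant writeback_control fields. */
struct toy_wbc {
	long nr_to_write;   /* page budget for one invocation */
	int sb_index;       /* superblock to continue from */
};

/*
 * One invocation: walk the superblocks in list order and write dirty
 * pages until the budget runs out.  With use_cursor == 0 the walk always
 * starts at A; otherwise it starts at wbc->sb_index and the cursor moves
 * past every superblock that was visited.
 */
static void toy_writeback_inodes(long dirty[NR_SB], struct toy_wbc *wbc,
				 int use_cursor)
{
	int i = use_cursor ? wbc->sb_index : 0;

	for (; i < NR_SB; i++) {
		long n = dirty[i] < wbc->nr_to_write ? dirty[i] : wbc->nr_to_write;

		dirty[i] -= n;
		wbc->nr_to_write -= n;
		assert(wbc->nr_to_write >= 0);
		if (use_cursor)
			wbc->sb_index = i + 1;
		if (wbc->nr_to_write <= 0)
			break;
	}
	if (use_cursor && wbc->sb_index >= NR_SB)
		wbc->sb_index = 0;   /* reached the end of the list: wrap */
}

/* Run `rounds` invocations; the remaining dirty counts are left in dirty[]. */
static void toy_run(long dirty[NR_SB], int rounds, int use_cursor)
{
	struct toy_wbc wbc = { 0, 0 };
	int r;

	for (r = 0; r < rounds; r++) {
		wbc.nr_to_write = CHUNK;
		toy_writeback_inodes(dirty, &wbc, use_cursor);
	}
}
```

Starting from 100000 dirty pages on each of A and B, ten invocations without
the cursor leave A at 89760 and B untouched at 100000. With the cursor, the
same ten invocations leave both at 94880. This is a simplification: the patch
itself only advances sb_index once the superblock's s_io list is empty.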
Thanks,
Fengguang
---
writeback: continue from the last super_block in syncing
Cc: David Chinner <dgc@....com>
Cc: Michael Rubin <mrubin@...gle.com>
Cc: Peter Zijlstra <peterz@...radead.org>
Signed-off-by: Fengguang Wu <wfg@...l.ustc.edu.cn>
---
fs/fs-writeback.c | 12 ++++++++++++
include/linux/writeback.h | 2 ++
2 files changed, 14 insertions(+)
--- linux-2.6.orig/fs/fs-writeback.c
+++ linux-2.6/fs/fs-writeback.c
@@ -494,11 +494,19 @@ void
 writeback_inodes(struct writeback_control *wbc)
 {
 	struct super_block *sb;
+	int i;
+
+	if (wbc->sb_index)
+		wbc->more_io = 1;

 	might_sleep();
 	spin_lock(&sb_lock);
 restart:
+	i = -1;
 	list_for_each_entry_reverse(sb, &super_blocks, s_list) {
+		i++;
+		if (i < wbc->sb_index)
+			continue;
 		if (sb_has_dirty_inodes(sb)) {
 			/* we're making our own get_super here */
 			sb->s_count++;
@@ -520,9 +528,13 @@ restart:
 			if (__put_super_and_need_restart(sb))
 				goto restart;
 		}
+		if (list_empty(&sb->s_io))
+			wbc->sb_index++;
 		if (wbc->nr_to_write <= 0)
 			break;
 	}
+	if (&sb->s_list == &super_blocks)
+		wbc->sb_index = 0;
 	spin_unlock(&sb_lock);
 }
--- linux-2.6.orig/include/linux/writeback.h
+++ linux-2.6/include/linux/writeback.h
@@ -48,6 +48,8 @@ struct writeback_control {
this for each page written */
long pages_skipped; /* Pages which were not written */
+ int sb_index; /* the superblock to continue from */
+
/*
* For a_ops->writepages(): is start or end are non-zero then this is
* a hint that the filesystem need only write out the pages inside that