[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20090730010630.GA7326@localhost>
Date: Thu, 30 Jul 2009 09:06:30 +0800
From: Wu Fengguang <fengguang.wu@...el.com>
To: Martin Bligh <mbligh@...gle.com>
Cc: Chad Talbott <ctalbott@...gle.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
Michael Rubin <mrubin@...gle.com>,
Andrew Morton <akpm@...gle.com>,
"sandeen@...hat.com" <sandeen@...hat.com>,
Michael Davidson <md@...gle.com>
Subject: Re: Bug in kernel 2.6.31, Slow wb_kupdate writeout
On Wed, Jul 29, 2009 at 10:11:10PM +0800, Martin Bligh wrote:
> > --- mm.orig/fs/fs-writeback.c
> > +++ mm/fs/fs-writeback.c
> > @@ -325,7 +325,8 @@ __sync_single_inode(struct inode *inode,
> > * soon as the queue becomes uncongested.
> > */
> > inode->i_state |= I_DIRTY_PAGES;
> > - if (wbc->nr_to_write <= 0) {
> > + if (wbc->nr_to_write <= 0 ||
> > + wbc->encountered_congestion) {
> > /*
> > * slice used up: queue for next turn
> > */
> >
>
> That's not sufficient - it only the problem in the wb_kupdate path. If you want
> to be more conservative, how about we do this?
I agree on the unification of kupdate and sync paths. In fact I had a
patch for doing this. And I'd recommend to do it in two patches:
one to fix the congestion case, another to do the code unification.
The sync path don't care whether requeue_io() or redirty_tail() is
used, because they disregard the time stamps totally - only order of
inodes matters (ie. starvation), which is same for requeue_io()/redirty_tail().
Thanks,
Fengguang
> --- linux-2.6.30/fs/fs-writeback.c.old 2009-07-29 00:08:29.000000000 -0700
> +++ linux-2.6.30/fs/fs-writeback.c 2009-07-29 07:08:48.000000000 -0700
> @@ -323,43 +323,14 @@ __sync_single_inode(struct inode *inode,
> * We didn't write back all the pages. nfs_writepages(
> )
> * sometimes bales out without doing anything. Redirty
> * the inode; Move it from s_io onto s_more_io/s_dirty.
> + * It may well have just encountered congestion
> */
> - /*
> - * akpm: if the caller was the kupdate function we put
> - * this inode at the head of s_dirty so it gets first
> - * consideration. Otherwise, move it to the tail, for
> - * the reasons described there. I'm not really sure
> - * how much sense this makes. Presumably I had a good
> - * reasons for doing it this way, and I'd rather not
> - * muck with it at present.
> - */
> - if (wbc->for_kupdate) {
> - /*
> - * For the kupdate function we move the inode
> - * to s_more_io so it will get more writeout as
> - * soon as the queue becomes uncongested.
> - */
> - inode->i_state |= I_DIRTY_PAGES;
> - if (wbc->nr_to_write <= 0) {
> - /*
> - * slice used up: queue for next turn
> - */
> - requeue_io(inode);
> - } else {
> - /*
> - * somehow blocked: retry later
> - */
> - redirty_tail(inode);
> - }
> - } else {
> - /*
> - * Otherwise fully redirty the inode so that
> - * other inodes on this superblock will get som
> e
> - * writeout. Otherwise heavy writing to one
> - * file would indefinitely suspend writeout of
> - * all the other files.
> - */
> - inode->i_state |= I_DIRTY_PAGES;
> + inode->i_state |= I_DIRTY_PAGES;
> + if (wbc->nr_to_write <= 0 || /* sliced used up */
> + wbc->encountered_congestion)
> + requeue_io(inode);
> + else {
> + /* somehow blocked: retry later */
> redirty_tail(inode);
> }
> } else if (inode->i_state & I_DIRTY) {
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists