[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20101214155307.GB8959@localhost>
Date: Tue, 14 Dec 2010 23:53:08 +0800
From: Wu Fengguang <fengguang.wu@...el.com>
To: Trond Myklebust <Trond.Myklebust@...app.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>, Jan Kara <jack@...e.cz>,
Christoph Hellwig <hch@....de>,
Dave Chinner <david@...morbit.com>,
Theodore Ts'o <tytso@....edu>,
Chris Mason <chris.mason@...cle.com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Mel Gorman <mel@....ul.ie>, Rik van Riel <riel@...hat.com>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
Greg Thelen <gthelen@...gle.com>,
Minchan Kim <minchan.kim@...il.com>,
linux-mm <linux-mm@...ck.org>,
"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 31/35] nfs: dont change wbc->nr_to_write in
write_inode()
On Tue, Dec 14, 2010 at 05:01:44AM +0800, Trond Myklebust wrote:
> On Mon, 2010-12-13 at 22:47 +0800, Wu Fengguang wrote:
> > plain text document attachment
> > (writeback-nfs-commit-remove-nr_to_write.patch)
> > It's introduced in commit 420e3646 ("NFS: Reduce the number of
> > unnecessary COMMIT calls") and seems not necessary.
> >
> > CC: Trond Myklebust <Trond.Myklebust@...app.com>
> > Signed-off-by: Wu Fengguang <fengguang.wu@...el.com>
> > ---
> > fs/nfs/write.c | 9 +--------
> > 1 file changed, 1 insertion(+), 8 deletions(-)
> >
> > --- linux-next.orig/fs/nfs/write.c 2010-12-13 21:46:21.000000000 +0800
> > +++ linux-next/fs/nfs/write.c 2010-12-13 21:46:22.000000000 +0800
> > @@ -1557,15 +1557,8 @@ static int nfs_commit_unstable_pages(str
> > }
> >
> > ret = nfs_commit_inode(inode, flags);
> > - if (ret >= 0) {
> > - if (wbc->sync_mode == WB_SYNC_NONE) {
> > - if (ret < wbc->nr_to_write)
> > - wbc->nr_to_write -= ret;
> > - else
> > - wbc->nr_to_write = 0;
> > - }
> > + if (ret >= 0)
> > return 0;
> > - }
> > out_mark_dirty:
> > __mark_inode_dirty(inode, I_DIRTY_DATASYNC);
> > return ret;
>
> It is there in order to tell the VM that it has succeeded in freeing up
> a certain number of pages. Otherwise, we end up cycling forever in
> writeback_sb_inodes() & friends with the latter not realising that they
> have made progress.
Yeah it seems reasonable, thanks for the explanation. I'll drop it.
The decrease of nr_to_write seems a partial solution. It will return
control to wb_writeback(), however the function may still busy loop
for long time without doing anything, when all the unstable pages are
in-commit pages.
Strictly speaking, over_bground_thresh() should only check the number
of to-commit pages, because the flusher can only commit the to-commit
pages, and can do nothing but wait for the server to response to
in-commit pages. A clean solution would involve breaking up the
current NR_UNSTABLE_NFS into two counters. But you may not like the
side effect that more dirty pages will then be cached in NFS client,
as the background flusher will quit more earlier :)
As a simple fix, I have a patch to avoid such possible busy loop.
Thanks,
Fengguang
---
Subject: writeback: sleep for 10ms when nothing is written
Date: Fri Dec 03 18:31:59 CST 2010
It seems more safe to take a sleep when nothing was done.
NFS background writeback could possibly busy loop in wb_writeback()
when the NFS client has sent and commit all data. It relies on the
NFS server and network condition to get the commit feedback to knock
down the NR_UNSTABLE_NFS number.
CC: Trond Myklebust <Trond.Myklebust@...app.com>
Signed-off-by: Wu Fengguang <fengguang.wu@...el.com>
---
fs/fs-writeback.c | 5 +++++
1 file changed, 5 insertions(+)
--- linux-next.orig/fs/fs-writeback.c 2010-12-03 18:29:14.000000000 +0800
+++ linux-next/fs/fs-writeback.c 2010-12-03 18:31:56.000000000 +0800
@@ -741,6 +741,11 @@ static long wb_writeback(struct bdi_writ
* become available for writeback. Otherwise
* we'll just busyloop.
*/
+ if (list_empty(&wb->b_more_io)) {
+ __set_current_state(TASK_UNINTERRUPTIBLE);
+ io_schedule_timeout(max(HZ/100, 1));
+ continue;
+ }
spin_lock(&inode_lock);
if (!list_empty(&wb->b_more_io)) {
inode = wb_inode(wb->b_more_io.prev);
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists