lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20101214155307.GB8959@localhost>
Date:	Tue, 14 Dec 2010 23:53:08 +0800
From:	Wu Fengguang <fengguang.wu@...el.com>
To:	Trond Myklebust <Trond.Myklebust@...app.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>, Jan Kara <jack@...e.cz>,
	Christoph Hellwig <hch@....de>,
	Dave Chinner <david@...morbit.com>,
	Theodore Ts'o <tytso@....edu>,
	Chris Mason <chris.mason@...cle.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Mel Gorman <mel@....ul.ie>, Rik van Riel <riel@...hat.com>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Greg Thelen <gthelen@...gle.com>,
	Minchan Kim <minchan.kim@...il.com>,
	linux-mm <linux-mm@...ck.org>,
	"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 31/35] nfs: dont change wbc->nr_to_write in
 write_inode()

On Tue, Dec 14, 2010 at 05:01:44AM +0800, Trond Myklebust wrote:
> On Mon, 2010-12-13 at 22:47 +0800, Wu Fengguang wrote:
> > plain text document attachment
> > (writeback-nfs-commit-remove-nr_to_write.patch)
> > It's introduced in commit 420e3646 ("NFS: Reduce the number of
> > unnecessary COMMIT calls") and seems not necessary.
> > 
> > CC: Trond Myklebust <Trond.Myklebust@...app.com>
> > Signed-off-by: Wu Fengguang <fengguang.wu@...el.com>
> > ---
> >  fs/nfs/write.c |    9 +--------
> >  1 file changed, 1 insertion(+), 8 deletions(-)
> > 
> > --- linux-next.orig/fs/nfs/write.c	2010-12-13 21:46:21.000000000 +0800
> > +++ linux-next/fs/nfs/write.c	2010-12-13 21:46:22.000000000 +0800
> > @@ -1557,15 +1557,8 @@ static int nfs_commit_unstable_pages(str
> >  	}
> >  
> >  	ret = nfs_commit_inode(inode, flags);
> > -	if (ret >= 0) {
> > -		if (wbc->sync_mode == WB_SYNC_NONE) {
> > -			if (ret < wbc->nr_to_write)
> > -				wbc->nr_to_write -= ret;
> > -			else
> > -				wbc->nr_to_write = 0;
> > -		}
> > +	if (ret >= 0)
> >  		return 0;
> > -	}
> >  out_mark_dirty:
> >  	__mark_inode_dirty(inode, I_DIRTY_DATASYNC);
> >  	return ret;
> 
> It is there in order to tell the VM that it has succeeded in freeing up
> a certain number of pages. Otherwise, we end up cycling forever in
> writeback_sb_inodes() & friends with the latter not realising that they
> have made progress.

Yeah it seems reasonable, thanks for the explanation.  I'll drop it.

The decrease of nr_to_write seems a partial solution. It will return
control to wb_writeback(), however the function may still busy loop
for long time without doing anything, when all the unstable pages are
in-commit pages.

Strictly speaking, over_bground_thresh() should only check the number
of to-commit pages, because the flusher can only commit the to-commit
pages, and can do nothing but wait for the server to response to
in-commit pages. A clean solution would involve breaking up the
current NR_UNSTABLE_NFS into two counters. But you may not like the
side effect that more dirty pages will then be cached in NFS client,
as the background flusher will quit more earlier :)

As a simple fix, I have a patch to avoid such possible busy loop.

Thanks,
Fengguang
---

Subject: writeback: sleep for 10ms when nothing is written
Date: Fri Dec 03 18:31:59 CST 2010

It seems more safe to take a sleep when nothing was done.

NFS background writeback could possibly busy loop in wb_writeback()
when the NFS client has sent and commit all data. It relies on the
NFS server and network condition to get the commit feedback to knock
down the NR_UNSTABLE_NFS number.

CC: Trond Myklebust <Trond.Myklebust@...app.com>
Signed-off-by: Wu Fengguang <fengguang.wu@...el.com>
---
 fs/fs-writeback.c |    5 +++++
 1 file changed, 5 insertions(+)

--- linux-next.orig/fs/fs-writeback.c	2010-12-03 18:29:14.000000000 +0800
+++ linux-next/fs/fs-writeback.c	2010-12-03 18:31:56.000000000 +0800
@@ -741,6 +741,11 @@ static long wb_writeback(struct bdi_writ
 		 * become available for writeback. Otherwise
 		 * we'll just busyloop.
 		 */
+		if (list_empty(&wb->b_more_io)) {
+			__set_current_state(TASK_UNINTERRUPTIBLE);
+			io_schedule_timeout(max(HZ/100, 1));
+			continue;
+		}
 		spin_lock(&inode_lock);
 		if (!list_empty(&wb->b_more_io))  {
 			inode = wb_inode(wb->b_more_io.prev);
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ