lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20090928133219.GA6405@think>
Date:	Mon, 28 Sep 2009 09:32:19 -0400
From:	Chris Mason <chris.mason@...cle.com>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	linux-kernel@...r.kernel.org, jens.axboe@...cle.com, jack@...e.cz
Subject: Re: [PATCH] bdi_sync_writeback should WB_SYNC_NONE first

On Sun, Sep 27, 2009 at 01:34:58AM -0700, Andrew Morton wrote:
> On Fri, 25 Sep 2009 10:10:14 -0400 Chris Mason <chris.mason@...cle.com> wrote:
> 
> > diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> > index 8e1e5e1..27f8e0e 100644
> > --- a/fs/fs-writeback.c
> > +++ b/fs/fs-writeback.c
> > @@ -225,7 +225,7 @@ static void bdi_sync_writeback(struct backing_dev_info *bdi,
> >  {
> >  	struct wb_writeback_args args = {
> >  		.sb		= sb,
> > -		.sync_mode	= WB_SYNC_ALL,
> > +		.sync_mode	= WB_SYNC_NONE,
> >  		.nr_pages	= LONG_MAX,
> >  		.range_cyclic	= 0,
> >  	};
> > @@ -236,6 +236,13 @@ static void bdi_sync_writeback(struct backing_dev_info *bdi,
> >  
> >  	bdi_queue_work(bdi, &work);
> >  	bdi_wait_on_work_clear(&work);
> > +
> > +	args.sync_mode = WB_SYNC_ALL;
> > +	args.nr_pages = LONG_MAX;
> > +
> > +	work.state = WS_USED | WS_ONSTACK;
> > +	bdi_queue_work(bdi, &work);
> > +	bdi_wait_on_work_clear(&work);
> >  }
> 
> Those LONG_MAX's are a worry.  What prevents a very long
> almost-livelock from occurring if userspace is concurrently dirtying
> pagecache at a high rate?
> 

In this case, we should be called from unmount.  But, Jens tells me my
patch isn't quite right because even without my patch, the WB_SYNC_ALL
run is queued onto the tail of the list after the WB_SYNC_NONE run.

So, I need to trace it a little better.   My initial theory was that the
nr_dirty number done by the first WB_SYNC_NONE run wasn't big enough.
Once btrfs writepage kicks in, it can make more dirty metadata pages to
close out the delalloc, and if those get written first things could exit
before all the data pages are on disk.

Its a theory anyway, I'll dig in more.

-chris
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ