linux-kernel - Re: [RFC][PATCH 7/7] writeback: balance_dirty_pages() shall write more than dirtied pages

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20090910014201.GB10957@localhost>
Date:	Thu, 10 Sep 2009 09:42:01 +0800
From:	Wu Fengguang <fengguang.wu@...el.com>
To:	Jan Kara <jack@...e.cz>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Jens Axboe <jens.axboe@...cle.com>,
	Dave Chinner <david@...morbit.com>,
	Chris Mason <chris.mason@...cle.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Christoph Hellwig <hch@...radead.org>,
	Artem Bityutskiy <dedekind1@...il.com>,
	LKML <linux-kernel@...r.kernel.org>,
	"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>
Subject: Re: [RFC][PATCH 7/7] writeback: balance_dirty_pages() shall write
	more than dirtied pages

On Wed, Sep 09, 2009 at 11:44:13PM +0800, Jan Kara wrote:
> On Wed 09-09-09 22:51:48, Wu Fengguang wrote:
> > Some filesystem may choose to write much more than ratelimit_pages
> > before calling balance_dirty_pages_ratelimited_nr(). So it is safer to
> > determine number to write based on real number of dirtied pages.
> > 
> > The increased write_chunk may make the dirtier more bumpy.  This is
> > filesystem writers' duty not to dirty too much at a time without
> > checking the ratelimit.
>   I don't get this. balance_dirty_pages_ratelimited_nr() is called when we
> dirty the page, not when we write it out. So a problem would only happen if
> filesystem dirties pages by set_page_dirty() and won't call
> balance_dirty_pages_ratelimited_nr(). But e.g. generic_perform_write()
> and do_wp_page() takes care of that. So where's the problem?

It seems that btrfs_file_write() is writing in chunks of up to 1024-pages
(1024 is the computed nrptrs value in a 32bit kernel). And it calls
balance_dirty_pages_ratelimited_nr() each time it dirtied such a chunk.

Thanks,
Fengguang

> > ---
> >  mm/page-writeback.c |   13 +++++++------
> >  1 file changed, 7 insertions(+), 6 deletions(-)
> > 
> > --- linux.orig/mm/page-writeback.c	2009-09-09 21:19:21.000000000 +0800
> > +++ linux/mm/page-writeback.c	2009-09-09 21:25:45.000000000 +0800
> > @@ -44,12 +44,12 @@ static long ratelimit_pages = 32;
> >  /*
> >   * When balance_dirty_pages decides that the caller needs to perform some
> >   * non-background writeback, this is how many pages it will attempt to write.
> > - * It should be somewhat larger than RATELIMIT_PAGES to ensure that reasonably
> > + * It should be somewhat larger than dirtied pages to ensure that reasonably
> >   * large amounts of I/O are submitted.
> >   */
> > -static inline long sync_writeback_pages(void)
> > +static inline long sync_writeback_pages(unsigned long dirtied)
> >  {
> > -	return ratelimit_pages + ratelimit_pages / 2;
> > +	return dirtied + dirtied / 2;
> >  }
> >  
> >  /* The following parameters are exported via /proc/sys/vm */
> > @@ -476,7 +476,8 @@ get_dirty_limits(unsigned long *pbackgro
> >   * If we're over `background_thresh' then pdflush is woken to perform some
> >   * writeout.
> >   */
> > -static void balance_dirty_pages(struct address_space *mapping)
> > +static void balance_dirty_pages(struct address_space *mapping,
> > +				unsigned long write_chunk)
> >  {
> >  	long nr_reclaimable, bdi_nr_reclaimable;
> >  	long nr_writeback, bdi_nr_writeback;
> > @@ -484,7 +485,6 @@ static void balance_dirty_pages(struct a
> >  	unsigned long dirty_thresh;
> >  	unsigned long bdi_thresh;
> >  	unsigned long pages_written = 0;
> > -	unsigned long write_chunk = sync_writeback_pages();
> >  
> >  	struct backing_dev_info *bdi = mapping->backing_dev_info;
> >  
> > @@ -638,9 +638,10 @@ void balance_dirty_pages_ratelimited_nr(
> >  	p =  &__get_cpu_var(bdp_ratelimits);
> >  	*p += nr_pages_dirtied;
> >  	if (unlikely(*p >= ratelimit)) {
> > +		ratelimit = sync_writeback_pages(*p);
> >  		*p = 0;
> >  		preempt_enable();
> > -		balance_dirty_pages(mapping);
> > +		balance_dirty_pages(mapping, ratelimit);
> >  		return;
> >  	}
> >  	preempt_enable();
> > 
> > -- 
> > 
> -- 
> Jan Kara <jack@...e.cz>
> SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/