lists.openwall.net - Open Source and information security mailing list archives
 
Date:	Fri, 14 Jan 2011 11:21:22 +0800
From:	Wu Fengguang <fengguang.wu@...el.com>
To:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc:	Jan Kara <jack@...e.cz>, Andrew Morton <akpm@...ux-foundation.org>,
	Rik van Riel <riel@...hat.com>,
	Christoph Hellwig <hch@....de>,
	Trond Myklebust <Trond.Myklebust@...app.com>,
	Dave Chinner <david@...morbit.com>,
	Theodore Ts'o <tytso@....edu>,
	Chris Mason <chris.mason@...cle.com>,
	Mel Gorman <mel@....ul.ie>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Greg Thelen <gthelen@...gle.com>,
	Minchan Kim <minchan.kim@...il.com>,
	linux-mm <linux-mm@...ck.org>,
	"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 01/35] writeback: enabling gate limit for light dirtied
 bdi

On Fri, Jan 14, 2011 at 03:26:10AM +0800, Peter Zijlstra wrote:
> On Thu, 2011-01-13 at 11:44 +0800, Wu Fengguang wrote:
> > When testing a 10-disk JBOD setup, I
> > find that bdi_dirty_limit fluctuates too much. So I'm considering
> > using global_dirty_limit as the control target. 
> 
> Is this because the bandwidth is equal or larger than the dirty period?

The patchset will call ->writepages(N) with
N=rounddown_pow_of_two(bdi->write_bandwidth). XFS will then typically
complete IO in batches of the same size. In practice I see XFS's
xfs_end_io() work getting queued and executed ~2 times per second,
each run normally clearing 32MB worth of PG_writeback. I guess this
is one major source of the fluctuation.

The attached XFS graphs confirm this: the "written" and "writeback"
curves step in 32MB increments.

As for the dirty period,

        calc_period_shift()
        = 2 + ilog2(dirty_total - 1)
        = 2 + ilog2(380000)             # an 8GB test box, 20% dirty_ratio
        = 2 + 18
        = 20

So period = (1 << 19) = 512k pages = 2GB. It's much larger than 32MB.
(Please correct me if wrong).

The problem is not limited to XFS. ext2/ext3/ext4 also fluctuate over
a range of up to bdi->write_bandwidth:

http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/tests/16G-10HDD-JBOD/ext2-fio-jbod-sync-128k-24p-15977M-2.6.37-rc8-dt5+-2010-12-31-19-36/balance_dirty_pages-pages.png
http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/tests/16G-10HDD-JBOD/ext4_wb-fio-jbod-sync-128k-24p-15977M-2.6.37-rc8-dt5+-2010-12-31-12-24/balance_dirty_pages-pages.png

I noticed (ext2/ext3 graphs attached) that they clear PG_writeback in
much smaller batches, at least in the 1-disk case. However, the number
of writeback pages still dips low 1-2 times every 10 seconds.

Thanks,
Fengguang

Attachment: "xfs-1dd-1M-1p-2970M-global_dirty_state-500.png" (image/png, 55470 bytes)

Attachment: "xfs-2dd-1M-1p-2970M-global_dirtied_written-500.png" (image/png, 49963 bytes)

Attachment: "ext3-1dd-1M-1p-2970M-global_dirty_state-500.png" (image/png, 114311 bytes)

Attachment: "ext2-1dd-1M-1p-2970M-global_dirtied_written-500.png" (image/png, 53505 bytes)
