lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20110307213441.GA22237@localhost>
Date:	Mon, 7 Mar 2011 13:34:41 -0800
From:	Wu Fengguang <fengguang.wu@...el.com>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	Jan Kara <jack@...e.cz>, Christoph Hellwig <hch@....de>,
	Trond Myklebust <Trond.Myklebust@...app.com>,
	Dave Chinner <david@...morbit.com>,
	Theodore Ts'o <tytso@....edu>,
	Chris Mason <chris.mason@...cle.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Mel Gorman <mel@....ul.ie>, Rik van Riel <riel@...hat.com>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Greg Thelen <gthelen@...gle.com>,
	Minchan Kim <minchan.kim@...il.com>,
	Vivek Goyal <vgoyal@...hat.com>,
	Andrea Righi <arighi@...eler.com>,
	Balbir Singh <balbir@...ux.vnet.ibm.com>,
	linux-mm <linux-mm@...ck.org>,
	"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 19/27] writeback: dirty throttle bandwidth control

On Thu, Mar 03, 2011 at 02:45:24PM +0800, Wu, Fengguang wrote:
> balance_dirty_pages() has been using a very simple and robust threshold
> based throttle scheme. It automatically limits the dirty rate down,
> however in a very bumpy way that constantly block the dirtier tasks for
> hundreds of milliseconds on a local ext4.

To get an idea of what exactly is going on in the current kernel, I
back ported the balance_dirty_pages and global_page_state trace events
to 2.6.38-rc7 and run the same test cases. The resulted graphs are
pretty striking.

In the worst NFS cases, the pause time frequently go up to 20-30 seconds,
and the dirty progress is rather bumpy.

1-dd case
http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/dirty-throttling-v6/vanilla/NFS/nfs-1dd-1M-8p-2945M-20%25-2.6.38-rc7+-2011-03-07-23-14/balance_dirty_pages-pause.png
http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/dirty-throttling-v6/vanilla/NFS/nfs-1dd-1M-8p-2945M-20%25-2.6.38-rc7+-2011-03-07-23-14/global_dirtied_written.png

8-dd case
http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/dirty-throttling-v6/vanilla/NFS/nfs-8dd-1M-8p-2945M-20%25-2.6.38-rc7+-2011-03-07-23-26/balance_dirty_pages-pause.png
http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/dirty-throttling-v6/vanilla/NFS/nfs-8dd-1M-8p-2945M-20%25-2.6.38-rc7+-2011-03-07-23-26/balance_dirty_pages-task-bw.png

The writes to USB key starts with a long 30 seconds pause, followed by
many ~2 seconds long pauses for ext4. XFS is better; btrfs performs the
best, however can still have 7s and 2s long delays.

http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/dirty-throttling-v6/vanilla/1UKEY+1HDD-3G/ext4-1dd-1M-8p-2945M-20%25-2.6.38-rc7+-2011-03-07-23-34/balance_dirty_pages-pause.png
http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/dirty-throttling-v6/vanilla/1UKEY+1HDD-3G/xfs-1dd-1M-8p-2945M-20%25-2.6.38-rc7+-2011-03-07-23-56/balance_dirty_pages-pause.png
http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/dirty-throttling-v6/vanilla/1UKEY+1HDD-3G/btrfs-1dd-1M-8p-2945M-20%25-2.6.38-rc7+-2011-03-08-00-14/balance_dirty_pages-pause.png

For the normal writes to HDD, ext4 has some >300ms pause times in 1-dd
case, >600ms for 2-dd case, and >2s for 8-dd case. The pause time
roughly deteriorates proportionally with the number of concurrent dd tasks.

http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/dirty-throttling-v6/vanilla/4G/ext4-1dd-1M-8p-3911M-20%25-2.6.38-rc7+-2011-03-07-22-15/balance_dirty_pages-pause.png
http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/dirty-throttling-v6/vanilla/4G/ext4-2dd-1M-8p-3911M-20%25-2.6.38-rc7+-2011-03-07-22-22/balance_dirty_pages-pause.png
http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/dirty-throttling-v6/vanilla/4G/ext4-8dd-1M-8p-3911M-20%25-2.6.38-rc7+-2011-03-07-22-30/balance_dirty_pages-pause.png

XFS performs similarly
http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/dirty-throttling-v6/vanilla/4G/xfs-8dd-1M-8p-3911M-20%25-2.6.38-rc7+-2011-03-07-22-08/balance_dirty_pages-pause.png
http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/dirty-throttling-v6/vanilla/4G/xfs-8dd-1M-8p-3911M-20%25-2.6.38-rc7+-2011-03-07-22-08/balance_dirty_pages-task-bw.png

btrfs is better, typically has 1-2s max pause time in 8-dd case
http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/dirty-throttling-v6/vanilla/4G/btrfs-8dd-1M-8p-3911M-20%25-2.6.38-rc7+-2011-03-07-21-48/balance_dirty_pages-pause.png

The long pause times will obviously ruin user experiences. It may also
hurt performance. For example, if the dirtier is a simple "cp" or
"scp", the long pause time will break the readahead pipeline or the
network pipeline, leading to moments of underutilized disk/network
bandwidth.

Comparing to the above graphs, this patchset is able to keep latency
under control (less than the configured 200ms max pause time) in all
known cases, whether it be 1-dd or 1000-dd, on local file systems,
over NFS or on USB key.

8-dd case
http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/dirty-throttling-v6/vanilla/4G/ext4-8dd-1M-8p-3911M-20%25-2.6.38-rc7+-2011-03-07-22-30/balance_dirty_pages-task-bw.png
http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/dirty-throttling-v6/4G/xfs-8dd-1M-8p-3927M-20%25-2.6.38-rc6-dt6+-2011-02-27-23-18/balance_dirty_pages-task-bw.png

128-dd case
http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/dirty-throttling-v6/4G/xfs-128dd-1M-8p-3927M-20%25-2.6.38-rc6-dt6+-2011-02-27-23-25/balance_dirty_pages-pause.png
http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/dirty-throttling-v6/4G/xfs-128dd-1M-8p-3927M-20%25-2.6.38-rc6-dt6+-2011-02-27-23-25/balance_dirty_pages-task-bw.png

1000-dd case
http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/dirty-throttling-v6/10SSD-RAID0-64G/xfs-1000dd-1M-64p-64288M-20%25-2.6.38-rc6-dt6+-2011-02-28-10-40/balance_dirty_pages-pause.png

UKEY
http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/dirty-throttling-v6/1UKEY+1HDD-3G/ext4-1dd-1M-8p-2975M-20%25-2.6.38-rc6-dt6+-2011-02-28-20-21/balance_dirty_pages-pause.png
http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/dirty-throttling-v6/1UKEY+1HDD-3G/xfs-4dd-1M-8p-2945M-20%25-2.6.38-rc5-dt6+-2011-02-22-09-27/balance_dirty_pages-pause.png

NFS
http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/dirty-throttling-v6/NFS/nfs-1dd-1M-8p-2945M-20%25-2.6.38-rc6-dt6+-2011-02-22-21-09/balance_dirty_pages-pause.png
http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/dirty-throttling-v6/NFS/nfs-8dd-1M-8p-2945M-20%25-2.6.38-rc6-dt6+-2011-02-22-21-22/balance_dirty_pages-pause.png

Thanks,
Fengguang
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ