lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20110506163312.GH18982@quack.suse.cz>
Date:	Fri, 6 May 2011 18:33:12 +0200
From:	Jan Kara <jack@...e.cz>
To:	Wu Fengguang <fengguang.wu@...el.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	LKML <linux-kernel@...r.kernel.org>, Jan Kara <jack@...e.cz>,
	Dave Chinner <david@...morbit.com>,
	Christoph Hellwig <hch@...radead.org>,
	linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH 02/17] writeback: update dirtied_when for synced inode
 to prevent livelock

On Fri 06-05-11 11:08:23, Wu Fengguang wrote:
> With the more aggressive "keep writeback as long as we wrote something"
> logic in wb_writeback(), the "use LONG_MAX .nr_to_write" trick in commit
> b9543dac5bbc ("writeback: avoid livelocking WB_SYNC_ALL writeback") is
> no longer enough to stop sync livelock.
> 
> The fix is to explicitly update .dirtied_when on synced inodes, so that
> they are no longer considered for writeback in the next round.
  The changelog doesn't make sense now when the patch is in the second
place in the series... Also as we discussed in the previous iteration of
the patches, I thought you'll move the condition before mapping_tagged()
test. I.e. the code would be (comment updated):
	if (!(inode->i_state & I_FREEING)) {
		/*
		 * Sync livelock prevention: Each inode is tagged and
		 * synced in one shot. So we can unconditionally update its
		 * dirty time to prevent syncing it again. Note that time
		 * ordering of b_dirty list will be kept because the
		 * following code either removes the inode from b_dirty
		 * or calls redirty_tail().
		 */
		if (wbc->sync_mode == WB_SYNC_ALL || wbc->tagged_sync)
			inode->dirtied_when = jiffies;
		if (mapping_tagged(mapping, PAGECACHE_TAG_DIRTY)) {
...

								Honza
> 
> CC: Jan Kara <jack@...e.cz>
> Signed-off-by: Wu Fengguang <fengguang.wu@...el.com>
> ---
>  fs/fs-writeback.c |    9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> ext3/ext4 are working fine now, however tests show that XFS may still
> livelock inside the XFS routines:
> 
> [ 3581.181253] sync            D ffff8800b6ca15d8  4560  4403   4392 0x00000000
> [ 3581.181734]  ffff88006f775bc8 0000000000000046 ffff8800b6ca12b8 00000001b6ca1938
> [ 3581.182411]  ffff88006f774000 00000000001d2e40 00000000001d2e40 ffff8800b6ca1280
> [ 3581.183088]  00000000001d2e40 ffff88006f775fd8 00000340af111ef2 00000000001d2e40
> [ 3581.183765] Call Trace:
> [ 3581.184008]  [<ffffffff8109be73>] ? lock_release_holdtime+0xa3/0xab
> [ 3581.184392]  [<ffffffff8108cc0d>] ? prepare_to_wait+0x6c/0x79
> [ 3581.184756]  [<ffffffff8108cc0d>] ? prepare_to_wait+0x6c/0x79
> [ 3581.185120]  [<ffffffff812ed520>] xfs_ioend_wait+0x87/0x9f
> [ 3581.185474]  [<ffffffff8108c97a>] ? wake_up_bit+0x2a/0x2a
> [ 3581.185827]  [<ffffffff812f742a>] xfs_sync_inode_data+0x92/0x9d
> [ 3581.186198]  [<ffffffff812f76e2>] xfs_inode_ag_walk+0x1a5/0x287
> [ 3581.186569]  [<ffffffff812f779b>] ? xfs_inode_ag_walk+0x25e/0x287
> [ 3581.186946]  [<ffffffff812f7398>] ? xfs_sync_worker+0x69/0x69
> [ 3581.187311]  [<ffffffff812e2354>] ? xfs_perag_get+0x68/0xd0
> [ 3581.187669]  [<ffffffff81092175>] ? local_clock+0x41/0x5a
> [ 3581.188020]  [<ffffffff8109be73>] ? lock_release_holdtime+0xa3/0xab
> [ 3581.188403]  [<ffffffff812e22ec>] ? xfs_check_sizes+0x160/0x160
> [ 3581.188773]  [<ffffffff812e2354>] ? xfs_perag_get+0x68/0xd0
> [ 3581.189130]  [<ffffffff812e236c>] ? xfs_perag_get+0x80/0xd0
> [ 3581.189488]  [<ffffffff812e22ec>] ? xfs_check_sizes+0x160/0x160
> [ 3581.189858]  [<ffffffff812f7831>] ? xfs_inode_ag_iterator+0x6d/0x8f
> [ 3581.190241]  [<ffffffff812f7398>] ? xfs_sync_worker+0x69/0x69
> [ 3581.190606]  [<ffffffff812f780b>] xfs_inode_ag_iterator+0x47/0x8f
> [ 3581.190982]  [<ffffffff811611f5>] ? __sync_filesystem+0x7a/0x7a
> [ 3581.191352]  [<ffffffff812f7877>] xfs_sync_data+0x24/0x43
> [ 3581.191703]  [<ffffffff812f7911>] xfs_quiesce_data+0x2c/0x88
> [ 3581.192065]  [<ffffffff812f5556>] xfs_fs_sync_fs+0x21/0x48
> [ 3581.192419]  [<ffffffff811611e1>] __sync_filesystem+0x66/0x7a
> [ 3581.192783]  [<ffffffff8116120b>] sync_one_sb+0x16/0x18
> [ 3581.193128]  [<ffffffff8113e3e3>] iterate_supers+0x72/0xce
> [ 3581.193482]  [<ffffffff81161140>] sync_filesystems+0x20/0x22
> [ 3581.193842]  [<ffffffff8116127e>] sys_sync+0x21/0x33
> [ 3581.194177]  [<ffffffff819016c2>] system_call_fastpath+0x16/0x1b
> 
> --- linux-next.orig/fs/fs-writeback.c	2011-05-05 23:30:22.000000000 +0800
> +++ linux-next/fs/fs-writeback.c	2011-05-05 23:30:23.000000000 +0800
> @@ -432,6 +432,15 @@ writeback_single_inode(struct inode *ino
>  				requeue_io(inode);
>  			} else {
>  				/*
> +				 * sync livelock prevention: each inode is
> +				 * tagged and synced in one shot. If still
> +				 * dirty, move it back to s_dirty with updated
> +				 * dirty time to prevent syncing it again.
> +				 */
> +				if (wbc->sync_mode == WB_SYNC_ALL ||
> +				    wbc->tagged_sync)
> +					inode->dirtied_when = jiffies;
> +				/*
>  				 * Writeback blocked by something other than
>  				 * congestion. Delay the inode for some time to
>  				 * avoid spinning on the CPU (100% iowait)
> 
> 
-- 
Jan Kara <jack@...e.cz>
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ