Message-ID: <20160615122640.GC1607@quack2.suse.cz>
Date:	Wed, 15 Jun 2016 14:26:40 +0200
From:	Jan Kara <jack@...e.cz>
To:	Tahsin Erdogan <tahsin@...gle.com>
Cc:	Jens Axboe <axboe@...nel.dk>, Tejun Heo <tj@...nel.org>,
	Alexander Viro <viro@...iv.linux.org.uk>,
	Jan Kara <jack@...e.com>, linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] writeback: inode cgroup wb switch should skip inode with
 zero i_count

On Mon 13-06-16 15:37:09, Tahsin Erdogan wrote:
> Asynchronous wb switching of inodes takes an additional ref count on an
> inode to make sure inode remains valid until switchover is completed.
> 
> However, it is possible that inode->i_count has already reached zero
> while inode is in writeback queue:
> 
> ------------[ cut here ]------------
> WARNING: CPU: 1 PID: 917 at fs/inode.c:397 ihold+0x2b/0x30
> CPU: 1 PID: 917 Comm: kworker/u4:5 Not tainted 4.7.0-rc2+ #49
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Workqueue: writeback wb_workfn (flush-8:16)
>  0000000000000000 ffff88007ca0fb58 ffffffff805990af 0000000000000000
>  0000000000000000 ffff88007ca0fb98 ffffffff80268702 0000018d000004e2
>  ffff88007cef40e8 ffff88007c9b89a8 ffff880079e3a740 0000000000000003
> Call Trace:
>  [<ffffffff805990af>] dump_stack+0x4d/0x6e
>  [<ffffffff80268702>] __warn+0xc2/0xe0
>  [<ffffffff802687d8>] warn_slowpath_null+0x18/0x20
>  [<ffffffff8035b4ab>] ihold+0x2b/0x30
>  [<ffffffff80367ecc>] inode_switch_wbs+0x11c/0x180
>  [<ffffffff80369110>] wbc_detach_inode+0x170/0x1a0
>  [<ffffffff80369abc>] writeback_sb_inodes+0x21c/0x530
>  [<ffffffff80369f7e>] wb_writeback+0xee/0x1e0
>  [<ffffffff8036a147>] wb_workfn+0xd7/0x280
>  [<ffffffff80287531>] ? try_to_wake_up+0x1b1/0x2b0
>  [<ffffffff8027bb09>] process_one_work+0x129/0x300
>  [<ffffffff8027be06>] worker_thread+0x126/0x480
>  [<ffffffff8098cde7>] ? __schedule+0x1c7/0x561
>  [<ffffffff8027bce0>] ? process_one_work+0x300/0x300
>  [<ffffffff80280ff4>] kthread+0xc4/0xe0
>  [<ffffffff80335578>] ? kfree+0xc8/0x100
>  [<ffffffff809903cf>] ret_from_fork+0x1f/0x40
>  [<ffffffff80280f30>] ? __kthread_parkme+0x70/0x70
> ---[ end trace aaefd2fd9f306bc4 ]---
> 
> Acked-by: Tejun Heo <tj@...nel.org>
> Signed-off-by: Tahsin Erdogan <tahsin@...gle.com>

Ugh, this looks ugly. An inode with i_count == 0 and without I_FREEING set
is sitting in the inode LRU list. It may get reused, at which point it would
actually be good if it had already switched to the right wb, no?

Since we actually hold i_lock and have checked the inode is not being
freed, we can just use __iget() to grab the inode reference. That avoids
the warning and fixes the race as well. Something like:

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 989a2ce..b44ede0 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -478,14 +478,15 @@ static void inode_switch_wbs(struct inode *inode, int new_wb_id)
 		goto out_free;
 	}
 	inode->i_state |= I_WB_SWITCH;
+	__iget(inode);
 	spin_unlock(&inode->i_lock);
 
-	ihold(inode);
 	isw->inode = inode;
 
 	atomic_inc(&isw_nr_in_flight);
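
To illustrate why swapping the helper silences the warning, here is a rough
user-space sketch. It assumes ihold() in this kernel is an increment that
warns when the resulting count is below 2 (which matches the ihold frame in
the trace above), and that __iget() is a bare increment relying on the caller
holding i_lock. The model_* names and struct model_inode are made up for
illustration; this is not kernel code.

```c
#include <stdatomic.h>
#include <stdio.h>

/* Hypothetical stand-in for struct inode; only the reference count. */
struct model_inode {
	atomic_int i_count;
};

/* Number of times the modeled warning has fired. */
static int model_warnings;

/*
 * Models ihold(): intended for callers that already hold a reference,
 * so it warns if the count was zero before the increment.
 */
static void model_ihold(struct model_inode *inode)
{
	/* atomic_fetch_add() returns the old value; old + 1 is the new count. */
	int new_count = atomic_fetch_add(&inode->i_count, 1) + 1;

	if (new_count < 2) {
		model_warnings++;
		fprintf(stderr, "WARNING: ihold on inode with zero i_count\n");
	}
}

/*
 * Models __iget(): a plain increment with no sanity check. In the kernel
 * the caller must hold inode->i_lock and must have checked that the inode
 * is not being freed; this model does not enforce either condition.
 */
static void model_iget(struct model_inode *inode)
{
	atomic_fetch_add(&inode->i_count, 1);
}
```

Calling model_ihold() on an inode whose count is 0 bumps the count to 1 and
fires the modeled warning; model_iget() on the same starting state bumps the
count to 1 silently. What makes the latter safe in the real code is that the
increment happens before i_lock is dropped, which the sketch does not model.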

Thoughts?

								Honza
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR
