linux-ext4 - Re: [Bug 42763] directory access hangs without error

Message-ID: <20120217053324.GJ14132@dastard>
Date:	Fri, 17 Feb 2012 16:33:24 +1100
From:	Dave Chinner <david@...morbit.com>
To:	Jan Kara <jack@...e.cz>
Cc:	bugzilla-daemon@...zilla.kernel.org, linux-ext4@...r.kernel.org,
	Al Viro <viro@...IV.linux.org.uk>,
	Dave Chinner <dchinner@...hat.com>
Subject: Re: [Bug 42763] directory access hangs without error

On Tue, Feb 14, 2012 at 03:22:31PM +0100, Jan Kara wrote:
> On Mon 13-02-12 18:30:28, bugzilla-daemon@...zilla.kernel.org wrote:
> > https://bugzilla.kernel.org/show_bug.cgi?id=42763
> > --- Comment #6 from Eric Buddington <ebuddington@...leyan.edu>  2012-02-13 18:30:27 ---
> > The stuck threads look like this:
> > 
> > edu             D c023a2f4     0  9912      1 0x00000004
> > f50b2b80 00000086 00000000 c023a2f4 f7b2b400 d5350000 c09f6d80 00000000
> > c09f6d80 c1c5f500 0000000a c33dbee0 c023f172 00000000 d53515cc c33dbee0
> > 000015cc d5352000 c8c4b4a4 c33dbee0 c1c5f500 f0e05dac c01558a1 00000246
> > Call Trace:
> > [<c023a2f4>] ? ext4_getblk+0x8b/0x13d
> > [<c023f172>] ? search_dirblock+0x76/0xaf
> > [<c01558a1>] ? arch_local_irq_save+0xf/0x14
> > [<c0651740>] ? _raw_spin_lock_irqsave+0x8/0x2c
> > [<c01c2cc3>] ? inode_wait+0x5/0x8
> > [<c0650c36>] ? __wait_on_bit+0x2f/0x54
> > [<c01c2cbe>] ? inode_owner_or_capable+0x30/0x30
> > [<c0650cba>] ? out_of_line_wait_on_bit+0x5f/0x67
> > [<c01c2cbe>] ? inode_owner_or_capable+0x30/0x30
> > [<c014532b>] ? autoremove_wake_function+0x2f/0x2f
> > [<c01c3610>] ? wait_on_bit.constprop.13+0x22/0x25
> > [<c01c3c8b>] ? iget_locked+0x42/0xc5
> > [<c023aad8>] ? ext4_iget+0x24/0x5be
>   ...
>   Interesting. So this isn't ext4 related at all. Rather it's a generic bug
> in VFS's I_NEW handling introduced by 250df6ed (adding Dave and Al to CC).
> That commit removed wake_up_inode() (in particular a memory barrier before
> wake_up_bit()) on the basis that i_state transitions are protected by
> i_lock. That would be fine if all the readers of i_state were using i_lock
> as well.

Hmmmm. I guess I missed that one.

> But they don't - in particular wait_on_inode() from
> include/linux/writeback.h does not. So that commit opened a reordering
> possibility where __I_NEW can be cleared *after* wake_up_bit() in
> unlock_new_inode() happens and so wait_on_bit() in wait_on_inode() goes
> to sleep indefinitely.
> 
> It seems to me the intent was that wait_on_inode() should use i_lock as
> well so it would opencode bit waiting similarly to
> __wait_on_freeing_inode().

Yeah, more like inode_wait_for_writeback() rather than
__wait_on_freeing_inode(), though, as we should loop until the bit
is cleared.
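
A rough userspace sketch of that "loop until the bit is cleared, checking
under the lock" pattern, using pthreads as a stand-in. This is an analogy
only, not kernel code and not the eventual fix: struct demo_inode,
DEMO_I_NEW and every demo_* function are hypothetical names, with the
mutex standing in for i_lock and the condition variable for the bit
waitqueue.

```c
#include <pthread.h>

#define DEMO_I_NEW 0x1u /* hypothetical stand-in for the I_NEW state bit */

struct demo_inode {
	pthread_mutex_t lock;  /* plays the role of i_lock */
	pthread_cond_t wq;     /* plays the role of the bit waitqueue */
	unsigned int state;    /* plays the role of i_state */
};

/* Start with the "new" bit set, as for a freshly allocated inode. */
void demo_inode_init(struct demo_inode *inode)
{
	pthread_mutex_init(&inode->lock, NULL);
	pthread_cond_init(&inode->wq, NULL);
	inode->state = DEMO_I_NEW;
}

/*
 * Writer side: clear the bit and wake waiters while holding the lock,
 * so a waiter can never observe the wakeup before the cleared bit.
 */
void demo_unlock_new_inode(struct demo_inode *inode)
{
	pthread_mutex_lock(&inode->lock);
	inode->state &= ~DEMO_I_NEW;
	pthread_cond_broadcast(&inode->wq);
	pthread_mutex_unlock(&inode->lock);
}

/*
 * Reader side: test the bit under the same lock and loop until it is
 * really clear, rather than trusting a single wakeup.
 */
void demo_wait_on_inode(struct demo_inode *inode)
{
	pthread_mutex_lock(&inode->lock);
	while (inode->state & DEMO_I_NEW)
		pthread_cond_wait(&inode->wq, &inode->lock);
	pthread_mutex_unlock(&inode->lock);
}

/* Thread entry point that simply waits for the bit to clear. */
void *demo_waiter(void *arg)
{
	demo_wait_on_inode(arg);
	return NULL;
}
```

Because both sides take the same lock, the waiter either sees the bit
already clear or is asleep on the queue before the wakeup is issued, so
the lost-wakeup window described above does not exist in this sketch.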

Cheers,

Dave.
-- 
Dave Chinner
david@...morbit.com