Message-ID: <52D719EC.5060604@suse.com>
Date:	Wed, 15 Jan 2014 18:29:48 -0500
From:	Jeff Mahoney <jeffm@...e.com>
To:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Knut Petersen <Knut_Petersen@...nline.de>
CC:	linux-kernel <linux-kernel@...r.kernel.org>,
	Al Viro <viro@...iv.linux.org.uk>,
	reiserfs-devel@...r.kernel.org
Subject: Re: [BUG 3.13.0-rc6] reiserfs possible circular locking dependency

On 1/3/14, 5:04 PM, Jeff Mahoney wrote:
> On 1/3/14, 2:46 PM, Linus Torvalds wrote:
>> On Fri, Jan 3, 2014 at 11:16 AM, Knut Petersen
>> <Knut_Petersen@...nline.de> wrote:
>>> Rebooting after a power failure on an openSuSE 13.1 system
>>> with kernel 3.13.0-rc6 triggered the attached lockdep warning.
>>
>> Hmm. It seems to me that the *normal* sequence should be:
>>
>>  - get i_mutex, call lookup, which gets sbi->lock (reiserfs_write_lock)
>>
>> but in the mounting path, we have special circumstances.
>>
>> That finish_unfinished() function does
>>
>>  - reiserfs_write_lock_nested() .
>>  - remove_save_link
>>  - iput(inode) with the write lock held
>>
>> and that can apparently end up taking i_mutex in open_xa_dir (and then
>> recursively the write lock, but that's an explicitly recursive lock,
>> so that part should be ok).
>>
>> Now, I don't think this can *really* deadlock with the normal order of
>> operations, because during mounting there is no other process that can
>> take those in the reverse order (since the filesystem isn't live), but
>> I do wonder if we should just release the reiserfs write lock over the
>> iputs. We release it in other parts anyway (like for the quota off)
>>
>> Jeff, you already touched this exact case in commit d2d0395fd177
>> ("reiserfs: locking, release lock around quota operations") except
>> that was for those quota operation cases.
>>
>> Even if it's not a real problem, making lockdep happy sounds like a
>> good idea. Of course, the trouble is that this code path almost never
>> gets exercised (which is why this hasn't been noticed earlier), so
>> testing...
>>
>> Jeff? Comments?
> 
> If someone ever invents a time machine, I'd go back to 2004 and tell
> myself to fight harder to make a reiserfs v3.7 with real extended
> attribute items. This code will haunt me to my death.
> 
> Anyway, yeah. The right thing here is to drop the lock for the iput.
> More than that would be ok too. finish_unfinished happens when the file
> system goes read-write and that includes the remount path. There can be
> other users of the file system but it would be a recursive acquire so we
> wouldn't actually deadlock there.
> 
> I'll work something up over the weekend or on Monday.

As a quick update here, I do have patches to fix this particular issue
but it's tough to depend on xfstests to detect regressions when xfstests
causes other lockdep issues. I'm taking this as an opportunity to clean
up the locking enough to pass xfstests.
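
For anyone following along, the ordering problem and the "drop the lock
for the iput" idea can be sketched in user space. This is only a rough
model with pthreads, not the actual patch: every model_* name below is
hypothetical, the recursive mutex stands in for the reiserfs write lock,
and model_iput() only assumes that the final iput ends up taking i_mutex
and then the write lock, as described above.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700	/* for PTHREAD_MUTEX_RECURSIVE */
#endif
#include <assert.h>
#include <pthread.h>

/* User-space model only: every model_* name is hypothetical, not a kernel API. */
struct model_sb {
	pthread_mutex_t write_lock;	/* stands in for sbi->lock (recursive) */
	int depth;			/* recursion depth; only touched by the holder */
};

struct model_inode {
	pthread_mutex_t i_mutex;
};

int model_init(struct model_sb *sb, struct model_inode *inode)
{
	pthread_mutexattr_t attr;
	int err = pthread_mutexattr_init(&attr);

	if (err)
		return err;
	pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
	err = pthread_mutex_init(&sb->write_lock, &attr);
	pthread_mutexattr_destroy(&attr);
	if (err)
		return err;
	sb->depth = 0;
	return pthread_mutex_init(&inode->i_mutex, NULL);
}

void model_write_lock(struct model_sb *sb)
{
	pthread_mutex_lock(&sb->write_lock);
	sb->depth++;
}

void model_write_unlock(struct model_sb *sb)
{
	assert(sb->depth > 0);
	sb->depth--;
	pthread_mutex_unlock(&sb->write_lock);
}

/* Fully release the lock, returning the depth so it can be restored later. */
int model_write_unlock_nested(struct model_sb *sb)
{
	int i, saved = sb->depth;

	sb->depth = 0;
	for (i = 0; i < saved; i++)
		pthread_mutex_unlock(&sb->write_lock);
	return saved;
}

/* Reacquire the lock to a previously saved depth. */
void model_write_lock_nested(struct model_sb *sb, int depth)
{
	int i;

	for (i = 0; i < depth; i++)
		pthread_mutex_lock(&sb->write_lock);
	sb->depth = depth;
}

/*
 * Assumed shape of the iput path: i_mutex first, then the write lock.
 * Returns the write lock depth seen under i_mutex: 1 means the normal
 * order was respected, 2 means the caller already held the write lock.
 */
int model_iput(struct model_sb *sb, struct model_inode *inode)
{
	int seen;

	pthread_mutex_lock(&inode->i_mutex);
	model_write_lock(sb);
	seen = sb->depth;
	model_write_unlock(sb);
	pthread_mutex_unlock(&inode->i_mutex);
	return seen;
}

/* Mount-time path: with drop_lock set, the write lock is released over iput. */
int model_finish_unfinished(struct model_sb *sb, struct model_inode *inode,
			    int drop_lock)
{
	int depth = 0, seen;

	model_write_lock(sb);
	/* ... save link removal would happen here, under the write lock ... */
	if (drop_lock)
		depth = model_write_unlock_nested(sb);
	seen = model_iput(sb, inode);
	if (drop_lock)
		model_write_lock_nested(sb, depth);
	model_write_unlock(sb);
	return seen;
}
```

With drop_lock set, model_iput() sees a depth of 1, i.e. i_mutex is taken
before the write lock as in the normal lookup path. Without it, the depth
is 2, which is the write lock -> i_mutex ordering lockdep complains about.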

-Jeff

-- 
Jeff Mahoney
SUSE Labs

