Message-ID: <20241112171428.UqPpObPV@linutronix.de>
Date: Tue, 12 Nov 2024 18:14:28 +0100
From: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To: Dave Chinner <david@...morbit.com>
Cc: Alex Shi <seakeel@...il.com>, linux-xfs@...r.kernel.org,
	Linux-MM <linux-mm@...ck.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: xfs deadlock on mm-unstable kernel?

On 2024-07-08 20:14:44 [+1000], Dave Chinner wrote:
> On Mon, Jul 08, 2024 at 04:36:08PM +0800, Alex Shi wrote:
> > [  372.297234][ T3001] ============================================
> > [  372.297530][ T3001] WARNING: possible recursive locking detected
> > [  372.297827][ T3001] 6.10.0-rc6-00453-g2be3de2b70e6 #64 Not tainted
> > [  372.298137][ T3001] --------------------------------------------
> > [  372.298436][ T3001] cc1/3001 is trying to acquire lock:
> > [  372.298701][ T3001] ffff88802cb910d8 (&xfs_dir_ilock_class){++++}-{3:3}, at: xfs_reclaim_inode+0x59e/0x710
> > [  372.299242][ T3001] 
> > [  372.299242][ T3001] but task is already holding lock:
> > [  372.299679][ T3001] ffff88800e145e58 (&xfs_dir_ilock_class){++++}-{3:3}, at: xfs_ilock_data_map_shared+0x4d/0x60
> > [  372.300258][ T3001] 
> > [  372.300258][ T3001] other info that might help us debug this:
> > [  372.300650][ T3001]  Possible unsafe locking scenario:
> > [  372.300650][ T3001] 
> > [  372.301031][ T3001]        CPU0
> > [  372.301231][ T3001]        ----
> > [  372.301386][ T3001]   lock(&xfs_dir_ilock_class);
> > [  372.301623][ T3001]   lock(&xfs_dir_ilock_class);
> > [  372.301860][ T3001] 
> > [  372.301860][ T3001]  *** DEADLOCK ***
> > [  372.301860][ T3001] 
> > [  372.302325][ T3001]  May be due to missing lock nesting notation
> > [  372.302325][ T3001] 
> > [  372.302723][ T3001] 3 locks held by cc1/3001:
> > [  372.302944][ T3001]  #0: ffff88800e146078 (&inode->i_sb->s_type->i_mutex_dir_key){++++}-{3:3}, at: walk_component+0x2a5/0x500
> > [  372.303554][ T3001]  #1: ffff88800e145e58 (&xfs_dir_ilock_class){++++}-{3:3}, at: xfs_ilock_data_map_shared+0x4d/0x60
> > [  372.304183][ T3001]  #2: ffff8880040190e0 (&type->s_umount_key#48){++++}-{3:3}, at: super_cache_scan+0x82/0x4e0
> 
> False positive. Inodes above allocation must be actively referenced,
> and inodes accessed by xfs_reclaim_inode() must have no references and
> been evicted and destroyed by the VFS. So there is no way that an
> unreferenced inode being locked for reclaim in xfs_reclaim_inode()
> can deadlock against the referenced inode locked by the inode lookup
> code.
> 
> Unfortunately, we don't have enough lockdep subclasses available to
> annotate this correctly - we're already using all
> MAX_LOCKDEP_SUBCLASSES to tell lockdep about all the ways we can
> nest inode locks. That leaves us no space to add a "reclaim"
> annotation for locking done from super_cache_scan() paths that would
> avoid these false positives....

So the former inode (the one triggering the reclaim) is created and
cannot be the same as one on the reclaim list. Couldn't we assign it a
different lock class?
My guess would be that you drop the lockdep_set_class() in
xfs_setup_inode() and do it in xfs_iget_cache_miss() instead, before
adding the inode to the tree. So you would have one class initially and
then change it once the inode enters the tree. I guess once the inode is
removed from the tree, it goes to kfree().
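
To make the idea concrete, here is a small userspace toy model, not kernel
code and not a patch. It only mimics the class-based check that produced
the report above: taking a lock whose class is already held gets flagged.
All toy_* names are made up for illustration; toy_set_class() merely stands
in for lockdep_set_class(). Whether splitting the classes this way is
actually valid for XFS is the open question above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names exist in the kernel. */
enum toy_class { TOY_CLASS_NEW, TOY_CLASS_IN_TREE };

struct toy_lock {
	enum toy_class cls;
};

#define TOY_MAX_HELD 8

struct toy_task {
	const struct toy_lock *held[TOY_MAX_HELD];
	size_t nheld;
};

/* Stand-in for lockdep_set_class(): re-key a lock that is not held. */
static void toy_set_class(struct toy_lock *l, enum toy_class cls)
{
	l->cls = cls;
}

/*
 * Record the acquisition. Returns false if a lock of the same class is
 * already held, i.e. the "possible recursive locking" case.
 */
static bool toy_acquire(struct toy_task *t, const struct toy_lock *l)
{
	bool clean = true;

	assert(t->nheld < TOY_MAX_HELD);
	for (size_t i = 0; i < t->nheld; i++)
		if (t->held[i]->cls == l->cls)
			clean = false;
	t->held[t->nheld++] = l;
	return clean;
}

/*
 * One task locks the inode that triggers reclaim, then an inode on the
 * reclaim list. Returns true if the toy checker would complain.
 */
static bool toy_scenario_reports(bool split_classes)
{
	struct toy_task task = { .nheld = 0 };
	struct toy_lock triggering = { TOY_CLASS_NEW };
	struct toy_lock reclaimed = { TOY_CLASS_IN_TREE };

	if (!split_classes)
		toy_set_class(&triggering, TOY_CLASS_IN_TREE); /* one shared class */

	bool first = toy_acquire(&task, &triggering);
	bool second = toy_acquire(&task, &reclaimed);

	return !(first && second);
}
```

With a single shared class, toy_scenario_reports(false) flags the second
acquisition; with the classes split, toy_scenario_reports(true) does not.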

> -Dave.

Sebastian
