Message-ID: <20100115133322.GA4172@discord.disaster>
Date:	Sat, 16 Jan 2010 00:33:22 +1100
From:	Dave Chinner <david@...morbit.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	linux-kernel@...r.kernel.org, mingo@...hat.com,
	Christoph Hellwig <hch@...radead.org>,
	Nick Piggin <nickpiggin@...oo.com.au>
Subject: Re: lockdep: inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-R}
	usage.

On Fri, Jan 15, 2010 at 01:53:15PM +0100, Peter Zijlstra wrote:
> On Fri, 2010-01-15 at 23:44 +1100, Dave Chinner wrote:
> > 
> > > > I can't work out what the <mumble>RECLAIM_FS<mumble> notations are
> > > > supposed to mean from the code and they are not documented at
> > > > all, so I need someone to explain what this means before I can
> > > > determine if it is a valid warning or not....
> > > 
> > > The <mumble>RECLAIM_FS<mumble> bit means that lock (iprune_sem) was
> > > taken from reclaim and is also taken over an allocation.
> > 
> > So there's an implicit, undocumented requirement that inode reclaim
> > during unmount requires a filesystem to do GFP_NOFS allocation? 
> 
> Well, I don't know enough about xfs (or filesystems in general) to say
> that with any certainty, but I can imagine the inode writeback from the
> sync that goes with umount causing issues.
> 
> If this inode reclaim is past all that and the filesystem is basically
> RO, then I don't think so and this could be considered a false positive,
> in which case we need an annotation for this.

The issue is that iprune_sem is held write locked over dispose_list()
even though the inodes have already been removed from the unused
list. While iprune_sem is held write locked, we can't enter
shrink_icache_memory(), because that path takes iprune_sem in read
mode, so an allocation in dispose_list() that recurses into direct
reclaim with __GFP_FS set will deadlock on it. Hence any allocation
in the dispose_list() path has to be GFP_NOFS to avoid this.
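
To spell the inversion out, the pattern is roughly this (hand-waved
sketch of the current fs/inode.c paths, not a verbatim quote):

    /* unmount path, e.g. invalidate_inodes(): */
    down_write(&iprune_sem);
    dispose_list(&throw_away);        /* inodes are already off the unused
                                       * list, but any allocation in here
                                       * with __GFP_FS set can enter direct
                                       * reclaim...                        */
    up_write(&iprune_sem);

    /* ...and direct reclaim comes straight back in via the shrinker: */
    shrink_icache_memory(nr, gfp_mask)
      -> prune_icache(nr)
        -> down_read(&iprune_sem);    /* never returns - we already hold
                                       * iprune_sem for write above        */

GFP_NOFS allocations never get that far because shrink_icache_memory()
bails out when __GFP_FS is clear in the mask.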

XFS relies on the PF_MEMALLOC flag to clear the __GFP_FS flag in its
allocations so that the same code paths work in both normal and
reclaim contexts (e.g. _xfs_trans_alloc), but the unmount path sets
no such flag. Setting the flag there would avoid the problem, but
it's messy.
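
i.e. something like this around the unmount-time disposal (hypothetical
sketch only - part of what makes it messy is that PF_MEMALLOC also lets
the task dip into the memory reserves):

    unsigned long pflags = current->flags;

    current->flags |= PF_MEMALLOC;    /* make allocations in here look like
                                       * reclaim-context allocations so we
                                       * don't recurse back into the fs    */
    dispose_list(&throw_away);
    if (!(pflags & PF_MEMALLOC))
            current->flags &= ~PF_MEMALLOC;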

FWIW, I'm not sure why we need to keep holding iprune_sem after the
inodes have been detached from the unused list in the unmount path.
The iprune_sem is there to protect against concurrent access from
the shrink_icache_memory() path, so once all the inodes are isolated
it doesn't seem to be needed anymore. Of course, this code is a maze
of twisty passages, so there's probably something I've missed that
means this is the only way it can work....
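
For reference, from memory invalidate_inodes() is shaped something like
this (notification teardown and other details elided, so treat it as a
sketch), and the question is whether the up_write() could simply move
up before the dispose_list():

    int invalidate_inodes(struct super_block *sb)
    {
            int busy;
            LIST_HEAD(throw_away);

            down_write(&iprune_sem);
            spin_lock(&inode_lock);
            busy = invalidate_list(&sb->s_inodes, &throw_away);
            spin_unlock(&inode_lock);

            dispose_list(&throw_away);    /* still under iprune_sem... */
            up_write(&iprune_sem);        /* ...rather than dropping it
                                           * before the dispose_list()?    */
            return busy;
    }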

Cheers,

Dave.
-- 
Dave Chinner
david@...morbit.com
