lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160427225514.GJ26977@dastard>
Date:	Thu, 28 Apr 2016 08:55:14 +1000
From:	Dave Chinner <david@...morbit.com>
To:	Michal Hocko <mhocko@...nel.org>
Cc:	linux-mm@...ck.org, linux-fsdevel@...r.kernel.org,
	Andrew Morton <akpm@...ux-foundation.org>,
	Theodore Ts'o <tytso@....edu>, Chris Mason <clm@...com>,
	Jan Kara <jack@...e.cz>, ceph-devel@...r.kernel.org,
	cluster-devel@...hat.com, linux-nfs@...r.kernel.org,
	logfs@...fs.org, xfs@....sgi.com, linux-ext4@...r.kernel.org,
	linux-btrfs@...r.kernel.org, linux-mtd@...ts.infradead.org,
	reiserfs-devel@...r.kernel.org,
	linux-ntfs-dev@...ts.sourceforge.net,
	linux-f2fs-devel@...ts.sourceforge.net,
	linux-afs@...ts.infradead.org, LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 2/2] mm, debug: report when GFP_NO{FS,IO} is used
 explicitly from memalloc_no{fs,io}_{save,restore} context

On Wed, Apr 27, 2016 at 10:03:11AM +0200, Michal Hocko wrote:
> On Wed 27-04-16 08:58:45, Dave Chinner wrote:
> > On Tue, Apr 26, 2016 at 01:56:12PM +0200, Michal Hocko wrote:
> > > From: Michal Hocko <mhocko@...e.com>
> > > 
> > > THIS PATCH IS FOR TESTING ONLY AND NOT MEANT TO HIT LINUS TREE
> > > 
> > > It is desirable to reduce the direct GFP_NO{FS,IO} usage at minimum and
> > > prefer scope usage defined by memalloc_no{fs,io}_{save,restore} API.
> > > 
> > > Let's help this process and add a debugging tool to catch when an
> > > explicit allocation request for GFP_NO{FS,IO} is done from the scope
> > > context. The printed stacktrace should help to identify the caller
> > > and evaluate whether it can be changed to use a wider context or whether
> > > it is called from another potentially dangerous context which needs
> > > a scope protection as well.
> > 
> > You're going to get a large number of these from XFS. There are call
> > paths in XFs that get called both inside and outside transaction
> > context, and many of them are marked with GFP_NOFS to prevent issues
> > that have cropped up in the past.
> > 
> > Often these are to silence lockdep warnings (e.g. commit b17cb36
> > ("xfs: fix missing KM_NOFS tags to keep lockdep happy")) because
> > lockdep gets very unhappy about the same functions being called with
> > different reclaim contexts. e.g.  directory block mapping might
> > occur from readdir (no transaction context) or within transactions
> > (create/unlink). hence paths like this are tagged with GFP_NOFS to
> > stop lockdep emitting false positive warnings....
> 
> I would much rather see lockdep being fixed than abusing GFP_NOFS to
> workaround its limitations. GFP_NOFS has a real consequences to the
> memory reclaim. I will go and check the commit you mentioned and try
> to understand why that is a problem. From what you described above
> I would like to get rid of exactly this kind of usage which is not
> really needed for the recursion protection.

The problem is that every time we come across this, the answer is
"use lockdep annotations". Our lockdep annotations are freakin'
complex because of this, and more often than not lockdep false
positives occur due to bugs in the annotations. e.g. see
fs/xfs/xfs_inode.h for all the inode locking annotations we have to
use and the hoops we have to jump through because we are limited to
8 subclasses and we have to be able to annotate nested inode locks
5 deep in places (RENAME_WHITEOUT, thanks).

At one point, we had to reset lockdep classes for inodes in reclaim
so that they didn't throw lockdep false positives the moment an
inode was locked in a memory reclaim context. We had to change
locking to remove that problem (commit 4f59af7 ("xfs: remove iolock
lock classes"). Then there were all the problems with reclaim
triggering lockdep warnings on directory inodes - we had to add a
separate directory inode class for them, and even then we still need
GFP_NOFS in places to minimise reclaim noise (as per the above
commit).

Put simply: we've had to resort to designing locking and allocation
strategies around the limitations of lockdep annotations, as opposed
to what is actually possible or even optimal. i.e. when the choice
is a 2 minute fix to add GFP_NOFS in cases like this, versus another
week long effort to rewrite the inode annotations (again) like this
one last year:

commit 0952c8183c1575a78dc416b5e168987ff98728bb
Author: Dave Chinner <dchinner@...hat.com>
Date:   Wed Aug 19 10:32:49 2015 +1000

    xfs: clean up inode lockdep annotations
    
    Lockdep annotations are a maintenance nightmare. Locking has to be
    modified to suit the limitations of the annotations, and we're
    always having to fix the annotations because they are unable to
    express the complexity of locking heirarchies correctly.
.....

It's a no-brainer to see why GFP_NOFS will be added to the
allocation in question. I've been saying for years that I consider
lockdep harmful - if you want to get rid of GFP_NOFS, then you're
going to need to sort out the lockdep reclaim annotation mess at the
same time...

Cheers,

Dave.
-- 
Dave Chinner
david@...morbit.com

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ