linux-kernel - Re: [PATCH 2/2] mm, debug: report when GFP_NO{FS,IO} is used explicitly from memalloc_no{fs,io}


Message-ID: <20160428215145.GM26977@dastard>
Date:	Fri, 29 Apr 2016 07:51:45 +1000
From:	Dave Chinner <david@...morbit.com>
To:	Michal Hocko <mhocko@...nel.org>
Cc:	linux-mm@...ck.org, linux-fsdevel@...r.kernel.org,
	Andrew Morton <akpm@...ux-foundation.org>,
	Jan Kara <jack@...e.cz>, xfs@....sgi.com,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 2/2] mm, debug: report when GFP_NO{FS,IO} is used
 explicitly from memalloc_no{fs,io}_{save,restore} context

On Thu, Apr 28, 2016 at 10:17:59AM +0200, Michal Hocko wrote:
> [Trim the CC list]
> On Wed 27-04-16 08:58:45, Dave Chinner wrote:
> [...]
> > Often these are to silence lockdep warnings (e.g. commit b17cb36
> > ("xfs: fix missing KM_NOFS tags to keep lockdep happy")) because
> > lockdep gets very unhappy about the same functions being called with
> > different reclaim contexts. e.g.  directory block mapping might
> > occur from readdir (no transaction context) or within transactions
> > (create/unlink). hence paths like this are tagged with GFP_NOFS to
> > stop lockdep emitting false positive warnings....
> 
> As already said in other email, I have tried to revert the above
> commit and tried to run it with some fs workloads but didn't manage
> to hit any lockdep splats (after I fixed my bug in the patch 1.2). I
> have tried to find reports which led to this commit but didn't succeed
> much. Everything is from much earlier or later. Do you happen to
> remember which loads triggered them, what they looked like or have an
> idea what to try to reproduce them? So far I was trying heavy parallel
> fs_mark, kernbench inside a tiny virtual machine so any of those have
> triggered direct reclaim all the time.

Most of those issues were reported by users and were not reproducible
by any obvious means. They may have been fixed since, but I'm sceptical
of that because, generally speaking, developer testing only catches
the obvious lockdep issues, i.e. it's users that report all the
really twisty issues, and those are generally not reproducible except
under their production workloads...

IOWs, the absence of reports in your testing does not mean there
isn't a problem, and that is one of the biggest problems with
lockdep annotations - we have no way of ever knowing if they are
still necessary or not without exposing users to regressions and
potential deadlocks.....

Cheers,

Dave.
-- 
Dave Chinner
david@...morbit.com