linux-kernel - Re: [PATCH] mm: clear __GFP_FS when PF_MEMALLOC

Message-Id: <540555BE-7985-4468-BC03-45CDA7E2EB83@cam.ac.uk>
Date:	Thu, 4 Sep 2014 09:05:23 +0100
From:	Anton Altaparmakov <aia21@....ac.uk>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	Junxiao Bi <junxiao.bi@...cle.com>, david@...morbit.com,
	xuejiufei@...wei.com, ming.lei@...onical.com,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH] mm: clear __GFP_FS when PF_MEMALLOC_NOIO is set

On 4 Sep 2014, at 03:30, Andrew Morton <akpm@...ux-foundation.org> wrote:
> __GFP_FS and __GFP_IO are (or were) for communicating to vmscan: don't
> enter the fs for writepage, don't write back swapcache.
> 
> I guess those concepts have grown over time without a ton of thought
> going into it.  Yes, I suppose that if a filesystem's writepage is
> called (for example) it expects that it will be able to perform
> writeback and it won't check (or even be passed) the __GFP_IO setting.
> 
> So I guess we could say that !__GFP_FS && __GFP_IO is not implemented and
> shouldn't occur.
> 
> That being said, it still seems quite bad to disable VFS cache
> shrinking for PF_MEMALLOC_NOIO allocation attempts.
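
For readers following the thread, here is a minimal userspace sketch of the masking that the patch subject describes: when the task has marked itself as no-I/O, strip the FS permission along with the I/O permission before reclaim sees the mask. The `SK_*` names and bit values are hypothetical stand-ins chosen for illustration, not the kernel's definitions, and this is not the patch itself.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel's gfp and task flag bits. */
typedef unsigned int gfp_t;
#define SK_GFP_IO            0x1u  /* reclaim may start physical I/O */
#define SK_GFP_FS            0x2u  /* reclaim may call into the file system */
#define SK_GFP_WAIT          0x4u  /* unrelated bit, must survive masking */
#define SK_PF_MEMALLOC_NOIO  0x1u  /* task asked for no-I/O allocations */

/*
 * Return the allocation mask that reclaim should honour.
 * If the task is in no-I/O mode, clear both the IO and FS bits so that
 * reclaim never enters a file system that would then be expected to do I/O.
 */
static gfp_t sketch_noio_flags(unsigned int task_flags, gfp_t gfp)
{
    if (task_flags & SK_PF_MEMALLOC_NOIO) {
        gfp &= ~(SK_GFP_IO | SK_GFP_FS);
        /* After masking, neither permission may remain set. */
        assert((gfp & (SK_GFP_IO | SK_GFP_FS)) == 0);
    }
    return gfp;
}
```

The cost Andrew points out is visible here: with the FS bit cleared, anything gated on it (such as VFS cache shrinking) is skipped for the whole allocation, not only for the file system that is at risk.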

I think what it really boils down to is that a file system cannot allow recursion into _that_ file system.  So if VFS/VM shrinking could skip over all inodes/dentries/pages that are associated with the superblock of the volume for which the allocation is being done, then that would be just fine.
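
To make the idea concrete, a minimal userspace sketch, assuming the shrinker can be told which superblock the allocation is being done for.  All of the `sk_*` names are hypothetical; none of this is an existing kernel interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a mounted volume. */
struct sk_super_block {
    int id;
};

/* Hypothetical reclaimable object (inode, dentry or page) and its owner. */
struct sk_object {
    const struct sk_super_block *sb;
    int reclaimed;
};

/*
 * Reclaim every object except those owned by alloc_sb, the superblock the
 * allocation is being done for.  A NULL alloc_sb means nothing is skipped.
 * Returns the number of objects reclaimed by this call.
 */
static size_t sk_shrink(struct sk_object *objs, size_t n,
                        const struct sk_super_block *alloc_sb)
{
    size_t freed = 0;

    assert(objs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (objs[i].reclaimed)
            continue;               /* already gone */
        if (alloc_sb != NULL && objs[i].sb == alloc_sb)
            continue;               /* would recurse into the allocating fs */
        objs[i].reclaimed = 1;
        freed++;
    }
    return freed;
}
```

The point of the sketch is that objects belonging to other volumes stay reclaimable, so cache shrinking is not disabled wholesale.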

An alternative would be to pass the file systems a flag telling them that it is not safe to take locks.  File systems that need to take a lock could then return -EDEADLOCK, and the VM could skip over those entries and reclaim others.  Though I think it would be more efficient for the VFS/VM simply not to call into the file system that is doing the allocation, as above...
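
A rough userspace sketch of that alternative, again with hypothetical names throughout: `SK_RECLAIM_NOLOCK` stands in for the flag, `sk_fs_reclaim()` for the file system's callback, and `sk_vm_reclaim()` for the VM side.  Where the C library does not provide `EDEADLOCK`, it is assumed here to alias `EDEADLK`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#ifndef EDEADLOCK
#define EDEADLOCK EDEADLK   /* assumption: treat the two as the same error */
#endif

#define SK_RECLAIM_NOLOCK 0x1u  /* hypothetical: "not safe to take locks" */

/* Hypothetical reclaimable entry; needs_lock marks entries whose reclaim
 * requires a file system lock. */
struct sk_entry {
    int needs_lock;
    int reclaimed;
};

/* File system side: refuse instead of blocking when locking is unsafe. */
static int sk_fs_reclaim(struct sk_entry *e, unsigned int flags)
{
    if (e->needs_lock && (flags & SK_RECLAIM_NOLOCK))
        return -EDEADLOCK;
    e->reclaimed = 1;
    return 0;
}

/* VM side: skip entries the file system refused and reclaim the others.
 * Returns the number of entries reclaimed by this call. */
static size_t sk_vm_reclaim(struct sk_entry *es, size_t n, unsigned int flags)
{
    size_t freed = 0;

    assert(es != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        int ret = sk_fs_reclaim(&es[i], flags);

        if (ret == -EDEADLOCK)
            continue;           /* not safe now, move on to the next entry */
        if (ret == 0)
            freed++;
    }
    return freed;
}
```

Note that this version still pays for a call into the file system for every entry, including the ones that get refused, which is the inefficiency compared with not calling in at all.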

Best regards,

	Anton
-- 
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
University of Cambridge Information Services, Roger Needham Building
7 JJ Thomson Avenue, Cambridge, CB3 0RB, UK
