linux-kernel - Re: [PATCH] mm/zswap: fix potential deadlock in zswap_frontswap


Message-ID: <eea593fd-c59d-cad0-936b-c012df1abadd@virtuozzo.com>
Date:   Mon, 3 Apr 2017 15:38:08 +0300
From:   Andrey Ryabinin <aryabinin@...tuozzo.com>
To:     Michal Hocko <mhocko@...nel.org>,
        Shakeel Butt <shakeelb@...gle.com>
CC:     Seth Jennings <sjenning@...hat.com>,
        Dan Streetman <ddstreet@...e.org>,
        Linux MM <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Vlastimil Babka <vbabka@...e.cz>
Subject: Re: [PATCH] mm/zswap: fix potential deadlock in
 zswap_frontswap_store()



On 04/03/2017 03:37 PM, Andrey Ryabinin wrote:
> 
> 
> On 04/03/2017 11:47 AM, Michal Hocko wrote:
>> On Fri 31-03-17 10:00:30, Shakeel Butt wrote:
>>> On Fri, Mar 31, 2017 at 8:30 AM, Andrey Ryabinin
>>> <aryabinin@...tuozzo.com> wrote:
>>>> zswap_frontswap_store() is called during memory reclaim from
>>>> __frontswap_store() from swap_writepage() from shrink_page_list().
>>>> This may happen in NOFS context, thus zswap shouldn't use __GFP_FS,
>>>> otherwise we may re-enter fs code and deadlock.
>>>> zswap_frontswap_store() also shouldn't use __GFP_IO to avoid recursion
>>>> into itself.
>>>>
>>>
>>> Is it possible to enter fs code (or IO) from zswap_frontswap_store()
>>> other than through recursive memory reclaim? Recursive memory reclaim,
>>> however, is prevented by the PF_MEMALLOC task flag. The change seems fine,
>>> but IMHO the reasoning needs an update. Adding Michal for expert opinion.
>>
>> Yes this is true. 
> 
> Actually, no. I think we have a bug in the allocator which may lead to recursive direct reclaim.
> 
> E.g. for costly-order allocations (or order > 0 && ac->migratetype != MIGRATE_MOVABLE)
> with __GFP_NOMEMALLOC (gfp_pfmemalloc_allowed() returns false),
> __alloc_pages_slowpath() may call __alloc_pages_direct_compact(), which unconditionally clears PF_MEMALLOC:
> 
> __alloc_pages_direct_compact():
> ...
> 	current->flags |= PF_MEMALLOC;
> 	*compact_result = try_to_compact_pages(gfp_mask, order, alloc_flags, ac,
> 									prio);
> 	current->flags &= ~PF_MEMALLOC;
> 
> 
> 
> And later in __alloc_pages_slowpath():
> 
> 	/* Avoid recursion of direct reclaim */
> 	if (current->flags & PF_MEMALLOC)        <=== false
> 		goto nopage;
> 
> 	/* Try direct reclaim and then allocating */
> 	page = __alloc_pages_direct_reclaim(gfp_mask, order, alloc_flags, ac,
> 							&did_some_progress);
> 


It seems this was broken by:

a8161d1ed6098506303c65b3701dedba876df42a
Author: Vlastimil Babka <vbabka@...e.cz>
Date:   Thu Jul 28 15:49:19 2016 -0700

    mm, page_alloc: restructure direct compaction handling in slowpath
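
To make the flag handling described above concrete, here is a minimal
user-space sketch. It is not kernel code: task_flags stands in for
current->flags, the PF_MEMALLOC value is made up, and the function names
(compact_clear_unconditionally, compact_save_restore, recursion_possible)
are hypothetical. It only models a task that is already in reclaim, runs the
compaction step, and then hits the "avoid recursion" check. The save/restore
variant is one possible way to avoid the problem, shown for comparison only.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-in for the kernel's PF_MEMALLOC bit (value is made up). */
#define PF_MEMALLOC 0x1u

/* Stand-in for current->flags in this user-space model. */
static unsigned int task_flags;

/* Pattern quoted above: set the flag, do the work, clear it unconditionally. */
static void compact_clear_unconditionally(void)
{
	task_flags |= PF_MEMALLOC;
	/* ... compaction work would run here ... */
	task_flags &= ~PF_MEMALLOC;
}

/* Alternative: remember the caller's PF_MEMALLOC bit and put it back. */
static void compact_save_restore(void)
{
	unsigned int saved = task_flags & PF_MEMALLOC;

	task_flags |= PF_MEMALLOC;
	/* ... compaction work would run here ... */
	task_flags = (task_flags & ~PF_MEMALLOC) | saved;
}

/* Models the "if (current->flags & PF_MEMALLOC) goto nopage;" check. */
static bool direct_reclaim_allowed(void)
{
	return !(task_flags & PF_MEMALLOC);
}

/*
 * Start as a task that is already inside reclaim (PF_MEMALLOC set),
 * run the given compaction step, and report whether the recursion
 * check would then let direct reclaim proceed.
 */
static bool recursion_possible(void (*compact)(void))
{
	assert(compact);
	task_flags = PF_MEMALLOC;
	compact();
	return direct_reclaim_allowed();
}
```

In this model, recursion_possible(compact_clear_unconditionally) returns
true because the caller's PF_MEMALLOC has been lost, while
recursion_possible(compact_save_restore) returns false.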