linux-kernel - Re: [PATCH] [PATCH] mm: disable preemption before swapcache

Message-Id: <20180725141643.6d9ba86a9698bc2580836618@linux-foundation.org>
Date:   Wed, 25 Jul 2018 14:16:43 -0700
From:   Andrew Morton <akpm@...ux-foundation.org>
To:     "zhaowuyun@...gtech.com" <zhaowuyun@...gtech.com>
Cc:     mgorman <mgorman@...hsingularity.net>,
        minchan <minchan@...nel.org>, vinmenon <vinmenon@...eaurora.org>,
        mhocko <mhocko@...e.com>, hannes <hannes@...xchg.org>,
        "hillf.zj" <hillf.zj@...baba-inc.com>,
        linux-mm <linux-mm@...ck.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Hugh Dickins <hughd@...gle.com>
Subject: Re: [PATCH] [PATCH] mm: disable preemption before swapcache_free

On Wed, 25 Jul 2018 14:37:58 +0800 "zhaowuyun@...gtech.com" <zhaowuyun@...gtech.com> wrote:

> From: zhaowuyun <zhaowuyun@...gtech.com>
>  
> The issue involves two processes, A and B. A is kworker/u16:8 at
> normal priority, B is AudioTrack at RT priority, and both are on
> the same CPU 3.
>
> Task A is preempted by task B in the window after
> __delete_from_swap_cache(page) and before swapcache_free(swap).
>
> Task B runs the do {} while loop in __read_swap_cache_async. It
> will never find the page in swapper_space because task A has already
> removed it, and it will never succeed in swapcache_prepare because
> the entry returns EEXIST.
>
> Task B is then stuck in the loop forever because it is an RT task
> and nothing can preempt it.
>
> So preemption needs to be disabled until swapcache_free has executed.
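
For illustration only, the livelock described above can be modelled in
userspace.  This is a rough sketch, not kernel code: every model_* name
and the struct are hypothetical stand-ins for the functions named in the
report, and the retry loop is bounded so the example terminates.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct swap_slot_model {
	bool page_in_swap_cache;	/* what a swap cache lookup would find */
	bool has_cache_flag;		/* "entry already has a cache page" marker */
};

/* Task A, step 1: stands in for __delete_from_swap_cache(page). */
void model_delete_from_swap_cache(struct swap_slot_model *s)
{
	s->page_in_swap_cache = false;
}

/* Task A, step 2: stands in for swapcache_free(swap). */
void model_swapcache_free(struct swap_slot_model *s)
{
	s->has_cache_flag = false;
}

/* Stands in for swapcache_prepare(): -EEXIST while the marker is set. */
int model_swapcache_prepare(struct swap_slot_model *s)
{
	if (s->has_cache_flag)
		return -EEXIST;
	s->has_cache_flag = true;
	return 0;
}

/*
 * Task B: stands in for the do {} while loop.  Returns the number of
 * iterations needed, or -1 if max_spins iterations made no progress.
 */
int model_read_swap_cache_async(struct swap_slot_model *s, int max_spins)
{
	for (int spins = 1; spins <= max_spins; spins++) {
		if (s->page_in_swap_cache)
			return spins;		/* lookup succeeded */
		if (model_swapcache_prepare(s) == 0) {
			s->page_in_swap_cache = true;
			return spins;		/* we own the entry now */
		}
		/* -EEXIST: the real loop calls cond_resched() and retries. */
	}
	return -1;
}
```

If task A has run step 1 but not step 2, task B makes no progress no
matter how many times it retries; as soon as step 2 runs, the first
iteration succeeds.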

Yes, right, sorry, I must have merged cbab0e4eec299 in my sleep. 
cond_resched() is a no-op in the presence of realtime policy threads,
and using it in this fashion to attempt to yield to a different thread
is broken.

Disabling preemption on the other side of the race should fix things,
but it's using a bandaid to plug the leakage from the earlier bandaid. 
The proper way to coordinate threads is to use a sleeping lock, such
as a mutex, or some other wait/wakeup mechanism.
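
As a rough userspace analogy of such a wait/wakeup scheme (pthreads, not
kernel code; struct slot and the slot_*/demo_* names are made up for
this sketch), the waiter sleeps until the other side signals that it is
done, instead of spinning and hoping to be descheduled:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for "the swap entry is still being torn down". */
struct slot {
	pthread_mutex_t lock;
	pthread_cond_t  released;
	bool busy;
};

void slot_init(struct slot *s, bool busy)
{
	pthread_mutex_init(&s->lock, NULL);
	pthread_cond_init(&s->released, NULL);
	s->busy = busy;
}

/* Waiter: sleeps (gives up the CPU) until the slot is free, then takes it. */
void slot_acquire(struct slot *s)
{
	pthread_mutex_lock(&s->lock);
	while (s->busy)
		pthread_cond_wait(&s->released, &s->lock);
	s->busy = true;
	pthread_mutex_unlock(&s->lock);
}

/* Owner: marks the slot free and wakes every sleeping waiter. */
void slot_release(struct slot *s)
{
	pthread_mutex_lock(&s->lock);
	assert(s->busy);	/* releasing an idle slot is a bug in this model */
	s->busy = false;
	pthread_cond_broadcast(&s->released);
	pthread_mutex_unlock(&s->lock);
}

static void *releaser(void *arg)
{
	slot_release(arg);
	return NULL;
}

/* Returns 0 if the waiter got the slot after the other thread released it. */
int demo_wait_wakeup(void)
{
	struct slot s;
	pthread_t t;

	slot_init(&s, true);
	if (pthread_create(&t, NULL, releaser, &s) != 0)
		return -1;
	slot_acquire(&s);	/* blocks until releaser() has run */
	pthread_join(t, NULL);
	return s.busy ? 0 : -1;
}
```

Because the waiter blocks rather than spins, its priority no longer
matters: a sleeping realtime thread is off the CPU, so the
lower-priority thread gets to run and finish its side.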

And once that's done, we can hopefully eliminate the do loop from
__read_swap_cache_async().  That loop also services ENOMEM from
radix_tree_insert(), but __add_to_swap_cache() appears to handle that
OK, so we shouldn't just loop around retrying the insert.  And the
radix_tree_preload() should ensure that radix_tree_insert() never fails
anyway, unless we're calling __read_swap_cache_async() with screwy
gfp_flags from somewhere.