[<prev] [next>] [day] [month] [year] [list]
Message-Id: <20180725141643.6d9ba86a9698bc2580836618@linux-foundation.org>
Date: Wed, 25 Jul 2018 14:16:43 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: "zhaowuyun@...gtech.com" <zhaowuyun@...gtech.com>
Cc: mgorman <mgorman@...hsingularity.net>,
minchan <minchan@...nel.org>, vinmenon <vinmenon@...eaurora.org>,
mhocko <mhocko@...e.com>, hannes <hannes@...xchg.org>,
"hillf.zj" <hillf.zj@...baba-inc.com>,
linux-mm <linux-mm@...ck.org>,
linux-kernel <linux-kernel@...r.kernel.org>,
Hugh Dickins <hughd@...gle.com>
Subject: Re: [PATCH] [PATCH] mm: disable preemption before swapcache_free
On Wed, 25 Jul 2018 14:37:58 +0800 "zhaowuyun@...gtech.com" <zhaowuyun@...gtech.com> wrote:
> From: zhaowuyun <zhaowuyun@...gtech.com>
>
> issue is that there are two processes A and B, A is kworker/u16:8
> normal priority, B is AudioTrack, RT priority, they are on the
> same CPU 3.
>
> The task A preempted by task B in the moment
> after __delete_from_swap_cache(page) and before swapcache_free(swap).
>
> The task B does __read_swap_cache_async in the do {} while loop, it
> will never find the page from swapper_space because the page is removed
> by the task A, and it will never sucessfully in swapcache_prepare because
> the entry is EEXIST.
>
> The task B then stuck in the loop infinitely because it is a RT task,
> no one can preempt it.
>
> so need to disable preemption until the swapcache_free executed.
Yes, right, sorry, I must have merged cbab0e4eec299 in my sleep.
cond_resched() is a no-op in the presence of realtime policy threads
and using to attempt to yield to a different thread it in this fashion
is broken.
Disabling preemption on the other side of the race should fix things,
but it's using a bandaid to plug the leakage from the earlier bandaid.
The proper way to coordinate threads is to use a sleeping lock, such
as a mutex, or some other wait/wakeup mechanism.
And once that's done, we can hopefully eliminate the do loop from
__read_swap_cache_async(). That also services ENOMEM from
radix_tree_insert(), but __add_to_swap_cache() appears to handle that
OK and we shouldn't just loop around retrying the insert and the
radix_tree_preload() should ensure that radix_tree_insert() never fails
anyway. Unless we're calling __read_swap_cache_async() with screwy
gfp_flags from somewhere.
Powered by blists - more mailing lists