Message-Id: <20210402174447.2abccc77cdca5cad67756d55@linux-foundation.org>
Date: Fri, 2 Apr 2021 17:44:47 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Stillinux <stillinux@...il.com>
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org,
liuzhengyuan@...inos.cn, liuyun01@...inos.cn,
Johannes Weiner <hannes@...xchg.org>,
Hugh Dickins <hughd@...gle.com>
Subject: Re: [RFC PATCH] mm/swap: fix system stuck due to infinite loop
On Fri, 2 Apr 2021 15:03:37 +0800 Stillinux <stillinux@...il.com> wrote:
> Under heavy memory and load pressure, we ran the LTP tests and found
> that the system was stuck. All direct memory reclaim was stuck in
> io_schedule; the requests being waited on were held in the blk_plug
> list of one process, and that process had fallen into an infinite loop
> and never flushed the requests out.
>
> The call flow of this process is swap_cluster_readahead, which uses
> blk_start_plug/blk_finish_plug for plugging, followed by
> swap_cluster_readahead->__read_swap_cache_async->swapcache_prepare.
> When swapcache_prepare returns -EEXIST, it falls into an infinite loop.
> Even though cond_resched is called, sched_submit_work decides based on
> tsk->state and does not flush out the blk_plug requests, so the I/O
> hangs, causing the whole system to hang.
>
> This is our first time working on the swap code, and we have no good
> way to fix the root cause. As a practical workaround, we chose to make
> swap_cluster_readahead aware of the memory pressure as early as
> possible and do io_schedule to flush out the blk_plug requests. To that
> end, we change the allocation flag in swap_readpage to GFP_NOIO, so
> that it no longer performs memory reclaim that flushes I/O. The system
> then operates normally, but this is not the most fundamental fix.
>
Thanks.

I'm not understanding why swapcache_prepare() repeatedly returns
-EEXIST in this situation?

And how does the switch to GFP_NOIO fix this? Simply by avoiding
direct reclaim altogether?
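
For reference, the hang as the quoted report describes it can be
modelled in userspace roughly as below. This is only a sketch under
stated assumptions, not kernel code: every demo_* name is hypothetical,
and the model simply takes the report's claims as given (the retry loop
can only end once the plugged request is issued, and the plug is only
flushed on a schedule when the task is not in the running state).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum demo_task_state { DEMO_RUNNING, DEMO_UNINTERRUPTIBLE };

struct demo_plug {
	int pending;	/* requests queued on the plug, not yet issued */
	int submitted;	/* requests handed to the pretend device */
};

/* Models issuing everything that is queued on the plug. */
static void demo_flush_plug(struct demo_plug *plug)
{
	assert(plug != NULL);
	plug->submitted += plug->pending;
	plug->pending = 0;
}

/*
 * Models the behaviour described in the report: the plug is flushed
 * on the way into the scheduler only if the task is about to block,
 * i.e. its state is not "running".
 */
static void demo_schedule(struct demo_plug *plug, enum demo_task_state state)
{
	if (state != DEMO_RUNNING)
		demo_flush_plug(plug);
}

/*
 * Bounded stand-in for the reported retry loop. The condition being
 * waited for can only become true once the plugged request has been
 * issued. Returns the number of retries needed, or -1 if the loop
 * gave up with the request still sitting on the plug.
 */
static int demo_retry_loop(struct demo_plug *plug, int max_retries,
			   enum demo_task_state yield_state)
{
	int tries;

	assert(plug != NULL);
	for (tries = 0; tries < max_retries; tries++) {
		if (plug->pending == 0)
			return tries;
		/* Stand-in for cond_resched() (running) or a real sleep. */
		demo_schedule(plug, yield_state);
	}
	return -1;
}
```

With a plug holding one pending request, demo_retry_loop(&plug, 1000,
DEMO_RUNNING) gives up with the request still plugged, whereas passing
DEMO_UNINTERRUPTIBLE flushes it on the first pass. Whether the real
loop depends on the plugged I/O in this way is exactly what the
questions above are asking.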
> ---
> mm/page_io.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/page_io.c b/mm/page_io.c
> index c493ce9ebcf5..87392ffabb12 100644
> --- a/mm/page_io.c
> +++ b/mm/page_io.c
> @@ -403,7 +403,7 @@ int swap_readpage(struct page *page, bool synchronous)
> }
>
> ret = 0;
> - bio = bio_alloc(GFP_KERNEL, 1);
> + bio = bio_alloc(GFP_NOIO, 1);
> bio_set_dev(bio, sis->bdev);
> bio->bi_opf = REQ_OP_READ;
> bio->bi_iter.bi_sector = swap_page_sector(page);
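
As a side note on what the one-line change does to the allocation
flags: in include/linux/gfp.h, GFP_KERNEL is composed of
__GFP_RECLAIM | __GFP_IO | __GFP_FS, while GFP_NOIO is __GFP_RECLAIM
alone. The sketch below mirrors that composition in userspace. The
DEMO_* names and helper functions are hypothetical, and the bit values
are illustrative only; the real ones may differ between kernel
versions.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; see include/linux/gfp.h for the real ones. */
#define DEMO_GFP_IO		0x40u
#define DEMO_GFP_FS		0x80u
#define DEMO_GFP_DIRECT_RECLAIM	0x400u
#define DEMO_GFP_KSWAPD_RECLAIM	0x800u

/* Same composition as __GFP_RECLAIM, GFP_KERNEL and GFP_NOIO. */
#define DEMO_GFP_RECLAIM (DEMO_GFP_DIRECT_RECLAIM | DEMO_GFP_KSWAPD_RECLAIM)
#define DEMO_GFP_KERNEL	 (DEMO_GFP_RECLAIM | DEMO_GFP_IO | DEMO_GFP_FS)
#define DEMO_GFP_NOIO	 (DEMO_GFP_RECLAIM)

/* GFP_NOIO differs from GFP_KERNEL only in the IO and FS bits. */
static_assert((DEMO_GFP_KERNEL & ~(DEMO_GFP_IO | DEMO_GFP_FS)) == DEMO_GFP_NOIO,
	      "NOIO is KERNEL minus IO and FS");

/* True if an allocation with these flags may enter direct reclaim. */
static bool demo_may_direct_reclaim(unsigned int gfp)
{
	return (gfp & DEMO_GFP_DIRECT_RECLAIM) != 0;
}

/* True if reclaim on behalf of this allocation may start I/O. */
static bool demo_may_start_io(unsigned int gfp)
{
	return (gfp & DEMO_GFP_IO) != 0;
}
```

Under these definitions a GFP_NOIO allocation may still enter direct
reclaim; what it loses is permission for that reclaim to start I/O or
call into the filesystem.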