Message-ID: <20190703083803.GA2737@techsingularity.net>
Date: Wed, 3 Jul 2019 09:38:03 +0100
From: Mel Gorman <mgorman@...hsingularity.net>
To: Shakeel Butt <shakeelb@...gle.com>
Cc: Johannes Weiner <hannes@...xchg.org>,
Michal Hocko <mhocko@...e.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Yang Shi <yang.shi@...ux.alibaba.com>,
Vlastimil Babka <vbabka@...e.cz>,
Hillf Danton <hdanton@...a.com>, Roman Gushchin <guro@...com>,
linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] mm, vmscan: prevent useless kswapd loops
On Mon, Jul 01, 2019 at 01:18:47PM -0700, Shakeel Butt wrote:
> In production we have noticed hard lockups on large machines running
> large jobs due to kswapd hoarding the LRU lock within isolate_lru_pages()
> when sc->reclaim_idx is 0, which is a small zone. The LRU was a couple
> hundred GiBs and the condition (page_zonenum(page) > sc->reclaim_idx) in
> isolate_lru_pages() was basically skipping GiBs of pages while holding the
> LRU spinlock with interrupts disabled.
>
> On further inspection, it seems like there are two issues:
>
> 1) If kswapd, on return from balance_pgdat(), could not sleep
> (i.e. the node is still unbalanced), the classzone_idx is unintentionally
> set to 0 and the whole reclaim cycle of kswapd will try to reclaim
> only the lowest and smallest zone while traversing the whole memory.
>
> 2) Fundamentally isolate_lru_pages() is really bad when the allocation
> has woken kswapd for a smaller zone on a very large machine running very
> large jobs. It can hoard the LRU spinlock while skipping over 100s of
> GiBs of pages.
>
> This patch only fixes (1); (2) needs a more fundamental solution.
> To fix (1), in the kswapd context, if pgdat->kswapd_classzone_idx is
> invalid, use the classzone_idx of the previous kswapd loop; otherwise use
> the one the waker has requested.
>
> Fixes: e716f2eb24de ("mm, vmscan: prevent kswapd sleeping prematurely
> due to mismatched classzone_idx")
>
> Signed-off-by: Shakeel Butt <shakeelb@...gle.com>
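
For readers following along, the selection rule described for (1) can be
sketched in userspace C as below. This is an illustration only, not the
patch itself: struct pgdat_sketch, pick_classzone_idx() and the use of
MAX_NR_ZONES as the "invalid" marker are assumptions made for the example,
and the enum is a simplified stand-in for the kernel's zone types.

```c
#include <assert.h>

/* Simplified stand-in for the kernel's zone ordering (assumption). */
enum zone_type { ZONE_DMA, ZONE_DMA32, ZONE_NORMAL, MAX_NR_ZONES };

/* Hypothetical, reduced view of the per-node data kswapd consults. */
struct pgdat_sketch {
	/* MAX_NR_ZONES is assumed here to mean "no waker request pending". */
	enum zone_type kswapd_classzone_idx;
};

/*
 * Pick the classzone index for the next kswapd loop: fall back to the
 * previous loop's index when no valid request is recorded, otherwise
 * honour what the waker asked for.
 */
static enum zone_type pick_classzone_idx(const struct pgdat_sketch *pgdat,
					 enum zone_type prev_classzone_idx)
{
	if (pgdat->kswapd_classzone_idx == MAX_NR_ZONES)
		return prev_classzone_idx;
	return pgdat->kswapd_classzone_idx;
}

/* Small usage example exercising both branches. */
static void pick_classzone_idx_example(void)
{
	struct pgdat_sketch pgdat = { .kswapd_classzone_idx = MAX_NR_ZONES };

	/* No request pending: keep reclaiming for the previous index. */
	assert(pick_classzone_idx(&pgdat, ZONE_NORMAL) == ZONE_NORMAL);

	/* A waker asked for ZONE_DMA32: that request wins. */
	pgdat.kswapd_classzone_idx = ZONE_DMA32;
	assert(pick_classzone_idx(&pgdat, ZONE_NORMAL) == ZONE_DMA32);
}
```

The point of the fallback is that an unbalanced node no longer silently
degrades to index 0, so kswapd does not end up walking the whole LRU on
behalf of the smallest zone.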
Acked-by: Mel Gorman <mgorman@...hsingularity.net>
--
Mel Gorman
SUSE Labs