Message-ID: <20230525082928.ovuz77znv763jx3e@techsingularity.net>
Date: Thu, 25 May 2023 09:29:28 +0100
From: Mel Gorman <mgorman@...hsingularity.net>
To: Johannes Weiner <hannes@...xchg.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Vlastimil Babka <vbabka@...e.cz>,
Michal Hocko <mhocko@...e.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, kernel-team@...com
Subject: Re: [PATCH] mm: compaction: avoid GFP_NOFS ABBA deadlock

On Fri, May 19, 2023 at 01:13:59PM +0200, Johannes Weiner wrote:
> During stress testing with higher-order allocations, a deadlock
> scenario was observed in compaction: One GFP_NOFS allocation was
> sleeping on mm/compaction.c::too_many_isolated(), while all CPUs in
> the system were busy with compactors spinning on buffer locks held by
> the sleeping GFP_NOFS allocation.
>
> Reclaim is susceptible to this same deadlock; we fixed it by granting
> GFP_NOFS allocations additional LRU isolation headroom, to ensure they
> make forward progress while holding fs locks that other reclaimers
> might acquire. Do the same here.
>
> This code has been like this since compaction was initially merged,
> and I only managed to trigger this with out-of-tree patches that
> dramatically increase the contexts that do GFP_NOFS compaction. While
> the issue is real, it seems theoretical in nature given existing
> allocation sites. Worth fixing now, but no Fixes tag or stable CC.
>
> Signed-off-by: Johannes Weiner <hannes@...xchg.org>

Acked-by: Mel Gorman <mgorman@...hsingularity.net>

> ---
> mm/compaction.c | 16 ++++++++++++++--
> 1 file changed, 14 insertions(+), 2 deletions(-)
>
> v2:
> - clarify too_many_isolated() comment (Mel)
> - split isolation deadlock from no-contiguous-anon lockups as that's
>   a different scenario and deserves its own patch
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index c8bcdea15f5f..c9a4b6dffcf2 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -745,8 +745,9 @@ isolate_freepages_range(struct compact_control *cc,
>  }
>
>  /* Similar to reclaim, but different enough that they don't share logic */
> -static bool too_many_isolated(pg_data_t *pgdat)
> +static bool too_many_isolated(struct compact_control *cc)
>  {
> +	pg_data_t *pgdat = cc->zone->zone_pgdat;
>  	bool too_many;
>
>  	unsigned long active, inactive, isolated;
> @@ -758,6 +759,17 @@ static bool too_many_isolated(pg_data_t *pgdat)
>  	isolated = node_page_state(pgdat, NR_ISOLATED_FILE) +
>  			node_page_state(pgdat, NR_ISOLATED_ANON);
>
> +	/*
> +	 * Allow GFP_NOFS to isolate past the limit set for regular
> +	 * compaction runs. This prevents an ABBA deadlock when other
> +	 * compactors have already isolated to the limit, but are
> +	 * blocked on filesystem locks held by the GFP_NOFS thread.
> +	 */
> +	if (cc->gfp_mask & __GFP_FS) {
> +		inactive >>= 3;
> +		active >>= 3;
> +	}
> +
>  	too_many = isolated > (inactive + active) / 2;
>  	if (!too_many)
>  		wake_throttle_isolated(pgdat);
> @@ -806,7 +818,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>  	 * list by either parallel reclaimers or compaction. If there are,
>  	 * delay for some time until fewer pages are isolated
>  	 */
> -	while (unlikely(too_many_isolated(pgdat))) {
> +	while (unlikely(too_many_isolated(cc))) {
>  		/* stop isolation if there are still pages not migrated */
>  		if (cc->nr_migratepages)
>  			return -EAGAIN;
> --
> 2.40.0
>
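
For illustration, the effect of the new shift on the throttling threshold
can be modelled in user space. This is only a rough sketch, not the kernel
code: MODEL_GFP_FS and model_too_many_isolated() are hypothetical names, the
flag value is arbitrary, and the LRU counters are passed in as plain
arguments instead of being read with node_page_state().

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's __GFP_FS bit; the value is arbitrary. */
#define MODEL_GFP_FS 0x1u

/*
 * User-space model of the check in the patch above (an assumption-laden
 * sketch, not the kernel function). Returns true when the caller would
 * be throttled because too many pages are already isolated.
 */
static bool model_too_many_isolated(unsigned int gfp_mask,
				    unsigned long active,
				    unsigned long inactive,
				    unsigned long isolated)
{
	/*
	 * Callers that may enter the filesystem are measured against 1/8
	 * of the LRU counts. Callers without the FS bit (GFP_NOFS-style)
	 * keep the full counts and therefore get more isolation headroom.
	 */
	if (gfp_mask & MODEL_GFP_FS) {
		inactive >>= 3;
		active >>= 3;
	}

	return isolated > (inactive + active) / 2;
}
```

With 800 active, 800 inactive and 150 isolated pages, a caller with the FS
bit set is throttled (its limit is 100), while a caller without it is not
(its limit is 800), so the thread holding the fs locks can keep isolating.
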
--
Mel Gorman
SUSE Labs