[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250323034657.GD1894930@cmpxchg.org>
Date: Sat, 22 Mar 2025 23:46:57 -0400
From: Johannes Weiner <hannes@...xchg.org>
To: Brendan Jackman <jackmanb@...gle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Vlastimil Babka <vbabka@...e.cz>,
Mel Gorman <mgorman@...hsingularity.net>, Zi Yan <ziy@...dia.com>,
linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/5] mm: page_alloc: defrag_mode
On Sat, Mar 22, 2025 at 09:34:09PM -0400, Johannes Weiner wrote:
> On Sat, Mar 22, 2025 at 08:58:27PM -0400, Johannes Weiner wrote:
> > On Sat, Mar 22, 2025 at 04:05:52PM +0100, Brendan Jackman wrote:
> > > On Thu Mar 13, 2025 at 10:05 PM CET, Johannes Weiner wrote:
> > > > + /* Reclaim/compaction failed to prevent the fallback */
> > > > + if (defrag_mode) {
> > > > + alloc_flags &= ALLOC_NOFRAGMENT;
> > > > + goto retry;
> > > > + }
> > >
> > > I can't see where ALLOC_NOFRAGMENT gets cleared, is it supposed to be
> > > here (i.e. should this be ~ALLOC_NOFRAGMENT)?
>
> Please ignore my previous email, this is actually a much more severe
> issue than I thought at first. The screwed up clearing is bad, but
> this will also not check the flag before retrying, which means the
> thread will retry reclaim/compaction and never reach OOM.
>
> This code has weeks of load testing, with workloads fine-tuned to
> *avoid* OOM. A blatant OOM test shows this problem immediately.
>
> A simple fix, but I'll put it through the wringer before sending it.
Ok, here is the patch. I verified this with intentional OOMing 100
times in a loop; this would previously lock up on first try in
defrag_mode, but kills and recovers reliably with this applied.
I also re-ran the full THP benchmarks, to verify that erroneous
looping here did not accidentally contribute to fragmentation
avoidance and thus THP success & latency rates. They were in fact not;
the improvements claimed for defrag_mode are unchanged with this fix:
VANILLA defrag_mode=1-OOMFIX
Hugealloc Time mean 52739.45 ( +0.00%) 27342.44 ( -48.15%)
Hugealloc Time stddev 56541.26 ( +0.00%) 33227.16 ( -41.23%)
Kbuild Real time 197.47 ( +0.00%) 196.32 ( -0.58%)
Kbuild User time 1240.49 ( +0.00%) 1231.89 ( -0.69%)
Kbuild System time 70.08 ( +0.00%) 58.75 ( -15.95%)
THP fault alloc 46727.07 ( +0.00%) 62669.93 ( +34.12%)
THP fault fallback 21910.60 ( +0.00%) 5966.40 ( -72.77%)
Direct compact fail 195.80 ( +0.00%) 50.53 ( -73.81%)
Direct compact success 7.93 ( +0.00%) 4.07 ( -43.28%)
Compact daemon scanned migrate 3369601.27 ( +0.00%) 1588238.93 ( -52.87%)
Compact daemon scanned free 5075474.47 ( +0.00%) 1441944.27 ( -71.59%)
Compact direct scanned migrate 161787.27 ( +0.00%) 64838.53 ( -59.92%)
Compact direct scanned free 163467.53 ( +0.00%) 37243.00 ( -77.22%)
Compact total migrate scanned 3531388.53 ( +0.00%) 1653077.47 ( -53.19%)
Compact total free scanned 5238942.00 ( +0.00%) 1479187.27 ( -71.77%)
Alloc stall 2371.07 ( +0.00%) 553.00 ( -76.64%)
Pages kswapd scanned 2160926.73 ( +0.00%) 4052539.93 ( +87.54%)
Pages kswapd reclaimed 533191.07 ( +0.00%) 765447.47 ( +43.56%)
Pages direct scanned 400450.33 ( +0.00%) 358933.93 ( -10.37%)
Pages direct reclaimed 94441.73 ( +0.00%) 26991.60 ( -71.42%)
Pages total scanned 2561377.07 ( +0.00%) 4411473.87 ( +72.23%)
Pages total reclaimed 627632.80 ( +0.00%) 792439.07 ( +26.26%)
Swap out 47959.53 ( +0.00%) 128511.80 ( +167.96%)
Swap in 7276.00 ( +0.00%) 27736.20 ( +281.16%)
File refaults 138043.00 ( +0.00%) 206198.40 ( +49.37%)
Many thanks for your careful review, Brendan.
---
>From c84651a46910448c6cfaf44885644fdb215f7f6a Mon Sep 17 00:00:00 2001
From: Johannes Weiner <hannes@...xchg.org>
Date: Sat, 22 Mar 2025 19:21:45 -0400
Subject: [PATCH] mm: page_alloc: fix defrag_mode's retry & OOM path
Brendan points out that defrag_mode doesn't properly clear
ALLOC_NOFRAGMENT on its last-ditch attempt to allocate. But looking
closer, the problem is actually more severe: it doesn't actually
*check* whether it's already retried, and keeps looping. This means
the OOM path is never taken, and the thread can loop indefinitely.
This is verified with an intentional OOM test on defrag_mode=1, which
results in the machine hanging. After this patch, it triggers the OOM
kill reliably and recovers.
Clear ALLOC_NOFRAGMENT properly, and only retry once.
Fixes: e3aa7df331bc ("mm: page_alloc: defrag_mode")
Reported-by: Brendan Jackman <jackmanb@...gle.com>
Signed-off-by: Johannes Weiner <hannes@...xchg.org>
---
mm/page_alloc.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0c01998cb3a0..582364d42906 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4543,8 +4543,8 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
goto retry;
/* Reclaim/compaction failed to prevent the fallback */
- if (defrag_mode) {
- alloc_flags &= ALLOC_NOFRAGMENT;
+ if (defrag_mode && (alloc_flags & ALLOC_NOFRAGMENT)) {
+ alloc_flags &= ~ALLOC_NOFRAGMENT;
goto retry;
}
--
2.49.0
Powered by blists - more mailing lists