[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250919195223.1560636-3-joshua.hahnjy@gmail.com>
Date: Fri, 19 Sep 2025 12:52:20 -0700
From: Joshua Hahn <joshua.hahnjy@...il.com>
To: Andrew Morton <akpm@...ux-foundation.org>,
Johannes Weiner <hannes@...xchg.org>
Cc: Chris Mason <clm@...com>,
Kiryl Shutsemau <kirill@...temov.name>,
Brendan Jackman <jackmanb@...gle.com>,
Michal Hocko <mhocko@...e.com>,
Suren Baghdasaryan <surenb@...gle.com>,
Vlastimil Babka <vbabka@...e.cz>,
Zi Yan <ziy@...dia.com>,
linux-kernel@...r.kernel.org,
linux-mm@...ck.org,
kernel-team@...a.com
Subject: [PATCH 2/4] mm/page_alloc: Perform appropriate batching in drain_pages_zone
drain_pages_zone completely drains a zone of its pcp free pages by
repeatedly calling free_pcppages_bulk until pcp->count reaches 0.
In this loop, it already performs batched calls to ensure that
free_pcppages_bulk isn't called to free too many pages at once, and
relinquishes & reacquires the lock between each call to prevent
lock starvation from other processes.
However, the current batching does not prevent lock starvation. The
current implementation creates batches of
pcp->batch << CONFIG_PCP_BATCH_SCALE_MAX, which has been seen in
Meta workloads to be up to 64 << 5 == 2048 pages.
While it is true that CONFIG_PCP_BATCH_SCALE_MAX is a config and
indeed can be adjusted by the system admin to be any number from
0 to 6, it's default value of 5 is still too high to be reasonable for
any system.
Instead, let's create batches of pcp->batch pages, which gives a more
reasonable 64 pages per call to free_pcppages_bulk. This gives other
processes a chance to grab the lock and prevents starvation. Each
individual call to drain_pages_zone may take longer, but we avoid the
worst case scenario of completely starving out other system-critical
threads from acquiring the pcp lock while 2048 pages are freed
one-by-one.
Signed-off-by: Joshua Hahn <joshua.hahnjy@...il.com>
---
mm/page_alloc.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 77e7d9a5f149..b861b647f184 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2623,8 +2623,7 @@ static void drain_pages_zone(unsigned int cpu, struct zone *zone)
spin_lock(&pcp->lock);
count = pcp->count;
if (count) {
- int to_drain = min(count,
- pcp->batch << CONFIG_PCP_BATCH_SCALE_MAX);
+ int to_drain = min(count, pcp->batch);
free_pcppages_bulk(zone, to_drain, pcp, 0);
count -= to_drain;
--
2.47.3
Powered by blists - more mailing lists