Message-ID: <20251201023647.2538502-1-zhanghongru@xiaomi.com>
Date: Mon, 1 Dec 2025 10:36:47 +0800
From: Hongru Zhang <zhanghongru06@...il.com>
To: vbabka@...e.cz
Cc: Liam.Howlett@...cle.com,
akpm@...ux-foundation.org,
axelrasmussen@...gle.com,
david@...nel.org,
hannes@...xchg.org,
jackmanb@...gle.com,
linux-kernel@...r.kernel.org,
linux-mm@...ck.org,
lorenzo.stoakes@...cle.com,
mhocko@...e.com,
rppt@...nel.org,
surenb@...gle.com,
weixugc@...gle.com,
yuanchu@...gle.com,
zhanghongru06@...il.com,
zhanghongru@...omi.com,
ziy@...dia.com
Subject: Re: [PATCH 0/3] mm: add per-migratetype counts to buddy allocator and optimize pagetypeinfo access
> > On mobile devices, some user-space memory management components check
> > memory pressure and fragmentation status periodically or via PSI, and
> > take actions such as killing processes or performing memory compaction
> > based on this information.
>
> Hm /proc/buddyinfo could be enough to determine fragmentation? Also we have
> in-kernel proactive compaction these days.
In fact, /proc/pagetypeinfo is not the only input. Other system resource
information is also collected at appropriate times, and resource usage is
tracked throughout the process lifecycle. User-space management components
combine all of this information to make decisions and take the
appropriate actions.
> > Under high load scenarios, reading /proc/pagetypeinfo causes memory
> > management components or memory allocation/free paths to be blocked
> > for extended periods waiting for the zone lock, leading to the following
> > issues:
> > 1. Long interrupt-disabled spinlocks - occasionally exceeding 10ms on Qcom
> > 8750 platforms, reducing system real-time performance
> > 2. Memory management components being blocked for extended periods,
> > preventing rapid acquisition of memory fragmentation information for
> > critical memory management decisions and actions
> > 3. Increased latency in memory allocation and free paths due to prolonged
> > zone lock contention
>
> It could be argued that not capturing /proc/pagetypeinfo (often) would help.
> I wonder if we can find also other benefits from the counters in the kernel
> itself.
Collecting system and app resource statistics and making decisions based
on this information is common practice among Android device manufacturers.
More than a billion Android phones are probably in daily use worldwide.
The diversity of hardware configurations across Android devices makes it
difficult for kernel mechanisms alone to maintain good performance across
all usage scenarios.

First, hardware capabilities vary greatly: flagship phones may have up to
24GB of memory, while low-end devices may have as little as 4GB. CPU,
storage, battery, and passive cooling capabilities vary significantly due
to market positioning and cost factors. Hardware resources always seem
inadequate.

Second, usage scenarios differ: some people use their devices in hot
environments and others in cold ones; some play high-definition games
while others simply browse the web.

Third, user habits vary as well. Some people rarely restart their phones
except when the battery dies or the system crashes; others restart daily,
like me. Some users never actively close apps and only switch them to
the background, which leaves dozens of apps running in the background and
consuming system resources (especially memory). Others use only a few
apps and close unused ones rather than leaving them in the background.

Despite these challenges, Android device manufacturers want to ensure
a good user experience (no UI jank) in all situations.
Even at a 60 Hz refresh rate (90 Hz and 120 Hz are also supported now), all
work from user input through rendering to display must finish within 16.7 ms.
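For reference, the per-frame budget is simply the reciprocal of the
refresh rate, so higher refresh rates leave less time per frame. A small
sketch of that arithmetic (frame_budget_us is a hypothetical helper name
used only for illustration):

```c
#include <assert.h>

/*
 * Hypothetical helper, for illustration only: time available per frame,
 * in whole microseconds, at a given display refresh rate in Hz.
 */
static unsigned int frame_budget_us(unsigned int refresh_hz)
{
	assert(refresh_hz > 0);
	return 1000000u / refresh_hz;	/* integer division, rounds down */
}
```

At 60 Hz this gives 16666 us (about 16.7 ms); at 90 Hz and 120 Hz the
budget shrinks to roughly 11.1 ms and 8.3 ms.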
To achieve this goal, the management components perform tasks such as:
- Track system resource status: what the system has
  (system resource awareness)
- Learn and predict app resource demands: what each app needs
  (resource demand awareness)
- Monitor app launches, exits, and foreground/background switches: the
  least important app gives resources back to the system to serve the
  most important one, usually the foreground app
  (user intent awareness)
Tracking system resources seems necessary on Android devices, not
optional, so the related paths are not that cold there.

All of the above is from the workload perspective. From the kernel
perspective, regardless of when or how often user-space tools read
statistics, those reads should not significantly affect the kernel's own
efficiency. That's why I submitted this patch series to make the read side
of /proc/pagetypeinfo lock-free. It does introduce overhead in the hot
path, though, and I would greatly appreciate discussing how to reduce it
here.
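To make the trade-off concrete, here is a minimal user-space sketch of the
general idea. It is not the code from the patch series. It assumes that
writers are already serialized by a lock (the zone lock in the kernel),
that each free-list add or remove also adjusts a per-order,
per-migratetype counter, and that readers load the counters without
taking the lock. All names (struct free_counts, count_add, count_read,
NR_ORDERS, NR_MIGRATETYPES) are hypothetical, and C11 relaxed atomics
stand in for the kernel's READ_ONCE()/WRITE_ONCE():

```c
#include <assert.h>
#include <stdatomic.h>

#define NR_ORDERS	11	/* hypothetical value, for illustration */
#define NR_MIGRATETYPES	6	/* hypothetical value, for illustration */

/* One counter per (order, migratetype) free list. */
struct free_counts {
	atomic_ulong nr[NR_ORDERS][NR_MIGRATETYPES];
};

/*
 * Writer side: the extra work on the hot path. Assumes the caller
 * already holds the lock that protects the free lists, so a plain
 * load followed by a store is enough; the relaxed atomics only
 * prevent torn accesses for concurrent lock-free readers.
 */
static void count_add(struct free_counts *fc, unsigned int order,
		      unsigned int mt, long delta)
{
	unsigned long old;

	assert(order < NR_ORDERS && mt < NR_MIGRATETYPES);
	old = atomic_load_explicit(&fc->nr[order][mt], memory_order_relaxed);
	atomic_store_explicit(&fc->nr[order][mt], old + (unsigned long)delta,
			      memory_order_relaxed);
}

/*
 * Reader side: no lock taken. Each value is read atomically, but a
 * walk over all counters is not a consistent snapshot of the zone.
 */
static unsigned long count_read(struct free_counts *fc, unsigned int order,
				unsigned int mt)
{
	assert(order < NR_ORDERS && mt < NR_MIGRATETYPES);
	return atomic_load_explicit(&fc->nr[order][mt], memory_order_relaxed);
}
```

In this sketch the reader never waits for writers, at the price of one
extra load and store per free-list update and of statistics that may be
slightly stale or mutually inconsistent across orders and migratetypes.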
> Adding these migratetype counters is something that wouldn't be even
> possible in the past, until the freelist migratetype hygiene was merged.
> So now it should be AFAIK possible, but it's still some overhead in
> relatively hot paths. I wonder if we even considered this before in the
> context of migratetype hygiene? Couldn't find anything quickly.
Yes. I initially wrote the code on an older kernel. At that time, I reused
set_pcppage_migratetype (and renamed it) to cache the exact migratetype
list that the page block was on. After the freelist migratetype hygiene
patches were merged, I removed that logic.