Message-ID: <2fb78e41-5e56-425f-925f-a29524355d2c@efficios.com>
Date: Fri, 19 Dec 2025 11:01:11 -0500
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Mark Brown <broonie@...nel.org>
Cc: Thomas Gleixner <tglx@...utronix.de>,
Andrew Morton <akpm@...ux-foundation.org>, linux-kernel@...r.kernel.org,
"Paul E. McKenney" <paulmck@...nel.org>, Steven Rostedt
<rostedt@...dmis.org>, Masami Hiramatsu <mhiramat@...nel.org>,
Dennis Zhou <dennis@...nel.org>, Tejun Heo <tj@...nel.org>,
Christoph Lameter <cl@...ux.com>, Martin Liu <liumartin@...gle.com>,
David Rientjes <rientjes@...gle.com>, christian.koenig@....com,
Shakeel Butt <shakeel.butt@...ux.dev>, SeongJae Park <sj@...nel.org>,
Michal Hocko <mhocko@...e.com>, Johannes Weiner <hannes@...xchg.org>,
Sweet Tea Dorminy <sweettea-kernel@...miny.me>,
Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
"Liam R . Howlett" <liam.howlett@...cle.com>, Mike Rapoport
<rppt@...nel.org>, Suren Baghdasaryan <surenb@...gle.com>,
Vlastimil Babka <vbabka@...e.cz>, Christian Brauner <brauner@...nel.org>,
Wei Yang <richard.weiyang@...il.com>, David Hildenbrand <david@...hat.com>,
Miaohe Lin <linmiaohe@...wei.com>, Al Viro <viro@...iv.linux.org.uk>,
linux-mm@...ck.org, linux-trace-kernel@...r.kernel.org,
Yu Zhao <yuzhao@...gle.com>, Roman Gushchin <roman.gushchin@...ux.dev>,
Mateusz Guzik <mjguzik@...il.com>, Matthew Wilcox <willy@...radead.org>,
Baolin Wang <baolin.wang@...ux.alibaba.com>,
Aboorva Devarajan <aboorvad@...ux.ibm.com>,
Aishwarya TCV <Aishwarya.TCV@....com>
Subject: Re: [PATCH v10 2/3] mm: Fix OOM killer inaccuracy on large many-core
systems
On 2025-12-19 04:31, Mark Brown wrote:
> On Thu, Dec 18, 2025 at 05:18:04PM -0500, Mathieu Desnoyers wrote:
>> On 2025-12-18 13:00, Mark Brown wrote:
>
>> An ugly work-around that may work (and then we can improve on this),
>> at the end of mm/init-mm.c:init_mm (completely untested):
>
>> .cpu_bitmap = { [0 ... ((3*BITS_TO_LONGS(NR_CPUS))-1 + ((69905 * NR_MM_COUNTERS * 64) / BYTES_PER_LONG))] = 0UL },
>
> That doesn't seem to fix the FVP unfortunately (BYTES_PER_LONG doesn't
> exist, but even just deleting the division entirely fails in the same
> way).
I just noticed that there is another static instance of mm_struct:
  drivers/firmware/efi/efi.c: struct mm_struct efi_mm
We need to apply the same fix to it as well. This is consistent
with the task that was running when the oops happened:
[ 2.482454] CPU: 2 UID: 0 PID: 12 Comm: kworker/u32:0 Not tainted 6.19.0-rc1-next-20251218 #1 PREEMPT
[ 2.482609] Workqueue: efi_rts_wq efi_call_rts
[...]
[ 2.485094] acct_account_cputime+0x40/0xa4 (P)
[...]
[ 2.487172] el1h_64_irq+0x6c/0x70
[...]
[ 2.487853] __efi_rt_asm_wrapper+0x50/0x74
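To illustrate why every static instance needs the same treatment, here is a
minimal userspace sketch (not kernel code, and not the actual patch). It
assumes GNU C, which allows static initialization of a flexible array member
and the `[a ... b]` range designator. All `demo_*`/`DEMO_*` names and the
sizes are hypothetical stand-ins; the real sizing expression for mm_struct is
the one discussed above.

```c
#include <stddef.h>

/* Hypothetical stand-ins: not the kernel's NR_CPUS or counter sizing. */
#define DEMO_NR_CPUS        64
#define DEMO_EXTRA_LONGS    8
#define DEMO_BITS_PER_LONG  (8 * sizeof(unsigned long))
#define DEMO_BITS_TO_LONGS(n) \
	(((n) + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)
#define DEMO_TRAILING_LONGS \
	(DEMO_BITS_TO_LONGS(DEMO_NR_CPUS) + DEMO_EXTRA_LONGS)

/* Simplified model of a struct that ends in a flexible array member. */
struct demo_mm {
	int users;
	unsigned long trailing[];	/* storage exists only if reserved */
};

/*
 * Each static instance must reserve the trailing storage itself.
 * Without the range initializer, trailing[] has zero elements and any
 * access past the fixed part of the struct is out of bounds.
 */
struct demo_mm demo_init_mm = {
	.users = 1,
	.trailing = { [0 ... DEMO_TRAILING_LONGS - 1] = 0UL },
};

/* A second static instance needs the very same initializer. */
struct demo_mm demo_efi_mm = {
	.users = 2,
	.trailing = { [0 ... DEMO_TRAILING_LONGS - 1] = 0UL },
};

/* Number of bytes reserved after the fixed part of each instance. */
size_t demo_trailing_bytes(void)
{
	return DEMO_TRAILING_LONGS * sizeof(unsigned long);
}

/* Write and read back the last reserved element of an instance. */
unsigned long demo_touch_last(struct demo_mm *mm)
{
	mm->trailing[DEMO_TRAILING_LONGS - 1] = 0xa5UL;
	return mm->trailing[DEMO_TRAILING_LONGS - 1];
}
```

Calling demo_touch_last() on either instance is only valid because both
initializers reserve the same number of trailing longs. Forgetting one of
them is the kind of mismatch suspected for efi_mm here.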
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com