[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <a89cb4d9-088e-4ed6-afde-f1b097de8db9@efficios.com>
Date: Thu, 27 Mar 2025 16:35:35 -0400
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Mateusz Guzik <mjguzik@...il.com>,
Sweet Tea Dorminy <sweettea-kernel@...miny.me>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Steven Rostedt <rostedt@...dmis.org>, Masami Hiramatsu
<mhiramat@...nel.org>, Dennis Zhou <dennis@...nel.org>,
Tejun Heo <tj@...nel.org>, Christoph Lameter <cl@...ux.com>,
Martin Liu <liumartin@...gle.com>, David Rientjes <rientjes@...gle.com>,
Jani Nikula <jani.nikula@...el.com>, Sweet Tea Dorminy
<sweettea@...gle.com>, Johannes Weiner <hannes@...xchg.org>,
Christian Brauner <brauner@...nel.org>,
Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
Suren Baghdasaryan <surenb@...gle.com>,
"Liam R . Howlett" <Liam.Howlett@...cle.com>,
Wei Yang <richard.weiyang@...il.com>, David Hildenbrand <david@...hat.com>,
Miaohe Lin <linmiaohe@...wei.com>, Al Viro <viro@...iv.linux.org.uk>,
linux-mm@...ck.org, linux-kernel@...r.kernel.org,
linux-trace-kernel@...r.kernel.org, Yu Zhao <yuzhao@...gle.com>,
Roman Gushchin <roman.gushchin@...ux.dev>, Greg Thelen <gthelen@...gle.com>
Subject: Re: [PATCH] mm: use per-numa-node atomics instead of percpu_counters
On 2025-03-26 19:36, Mateusz Guzik wrote:
[...]
>
> Hell, it may be your patch as is can be easily repurposed to
> decentralize the main percpu counter? I mean perhaps there is no need
> for any fancy hierarchical structure.
Here is an initial attempt at a design document for the hierarchical
percpu counters. Feedback welcome!
Split Counters With Binary Tree Approximation Propagation
=========================================================
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
March 27, 2025
* Propagation diagram when reaching batch size thresholds (± batch size):
Example diagram for 8 CPUs:
log2(8) = 3 levels
At each level, each pair propagates its values to the next level when
reaching the batch size thresholds.
Counters at levels 0, 1, 2 can be kept on a single byte (±128 range).
Counter at level 3 can be kept on a 32/64-bit counter.
Level 0: 0 1 2 3 4 5 6 7
| / | / | / | /
| / | / | / | /
| / | / | / | /
Level 1: 0 1 2 3
| / | /
| / | /
| / | /
Level 2: 0 1
| /
| /
| /
Level 3: 0
* Inaccuracy:
BATCH(level N): Level N batch size.
Example for BATCH(level 0) = 4
BATCH(level 0) = 4
BATCH(level 1) = 8
BATCH(level 2) = 16
BATCH(level N) = BATCH(level 0) * 2^N
per-counter global
inaccuracy inaccuracy
Level 0: ± 3 ± 24 (8 * 3)
Level 1: ± 7 ± 28 (4 * 7)
Level 2: ± 15 ± 30 (2 * 15)
Total: ------ ± 82 (log2(nr_cpus) * BATCH(level 0) * nr_cpus - Sum[0 .. log2(nr_cpus) - 1](nr_cpus / 2^n)
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
Powered by blists - more mailing lists