[<prev] [next>] [day] [month] [year] [list]
Message-ID: <20250906032108.30539-1-ajgja@amazon.com>
Date: Sat, 6 Sep 2025 03:21:08 +0000
From: Andrew Guerrero <ajgja@...zon.com>
To: <cgroups@...r.kernel.org>, <linux-mm@...ck.org>,
<linux-kernel@...r.kernel.org>
CC: <hannes@...xchg.org>, <mhocko@...nel.org>, <vdavydov.dev@...il.com>,
<akpm@...ux-foundation.org>, <shakeelb@...gle.com>, <guro@...com>,
<gunnarku@...zon.com>, <stable@...r.kernel.org>
Subject: [PATCH] mm: memcontrol: fix memcg accounting during cpu hotplug
A filesystem writeback performance issue was discovered by repeatedly
running CPU hotplug operations while a process in a cgroup with memory
and io controllers enabled wrote to an ext4 file in a loop.
When a CPU is offlined, the memcg_hotplug_cpu_dead() callback function
flushes per-cpu vmstats counters. However, instead of applying a per-cpu
counter once to each cgroup in the heirarchy, the per-cpu counter is
applied repeatedly just to the nested cgroup. Under certain conditions,
the per-cpu NR_FILE_DIRTY counter is routinely positive during hotplug
events and the dirty file count artifically inflates. Once the dirty
file count grows past the dirty_freerun_ceiling(), balance_dirty_pages()
starts a backgroup writeback each time a file page is marked dirty
within the nested cgroup.
This change fixes memcg_hotplug_cpu_dead() so that the per-cpu vmstats
and vmevents counters are applied once to each cgroup in the heirarchy,
similar to __mod_memcg_state() and __count_memcg_events().
Fixes: 42a300353577 ("mm: memcontrol: fix recursive statistics correctness & scalabilty")
Signed-off-by: Andrew Guerrero <ajgja@...zon.com>
Reviewed-by: Gunnar Kudrjavets <gunnarku@...zon.com>
---
Hey all,
This patch is intended for the 5.10 longterm release branch. It will not apply
cleanly to mainline and is inadvertantly fixed by a larger series of changes in
later release branches:
a3d4c05a4474 ("mm: memcontrol: fix cpuhotplug statistics flushing").
In 5.15, the counter flushing code is completely removed. This may be another
viable option here too, though it's a larger change.
Thanks!
---
mm/memcontrol.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 142b4d5e08fe..8e085a4f45b7 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2394,7 +2394,7 @@ static int memcg_hotplug_cpu_dead(unsigned int cpu)
x = this_cpu_xchg(memcg->vmstats_percpu->stat[i], 0);
if (x)
for (mi = memcg; mi; mi = parent_mem_cgroup(mi))
- atomic_long_add(x, &memcg->vmstats[i]);
+ atomic_long_add(x, &mi->vmstats[i]);
if (i >= NR_VM_NODE_STAT_ITEMS)
continue;
@@ -2417,7 +2417,7 @@ static int memcg_hotplug_cpu_dead(unsigned int cpu)
x = this_cpu_xchg(memcg->vmstats_percpu->events[i], 0);
if (x)
for (mi = memcg; mi; mi = parent_mem_cgroup(mi))
- atomic_long_add(x, &memcg->vmevents[i]);
+ atomic_long_add(x, &mi->vmevents[i]);
}
}
base-commit: c30b4019ea89633d790f0bfcbb03234f0d006f87
--
2.47.3
Powered by blists - more mailing lists