Message-ID: <ykbwsq7xckhjaeoe6ba7tqm55vxrth74tmep4ey7feui3lblcf@vt43elwkqqf7>
Date: Thu, 26 Jun 2025 12:15:53 -0700
From: Shakeel Butt <shakeel.butt@...ux.dev>
To: Bertrand Wlodarczyk <bertrand.wlodarczyk@...el.com>
Cc: tj@...nel.org, hannes@...xchg.org, mkoutny@...e.com,
cgroups@...r.kernel.org, linux-kernel@...r.kernel.org, inwardvessel@...il.com
Subject: Re: [PATCH v2] cgroup/rstat: change cgroup_base_stat to atomic
On Tue, Jun 24, 2025 at 04:45:58PM +0200, Bertrand Wlodarczyk wrote:
> The kernel faces scalability issues when multiple userspace
> programs attempt to read cgroup statistics concurrently.
>
> The primary bottleneck is the css_cgroup_lock in cgroup_rstat_flush,
> which prevents access and updates to the statistics
> of the css from multiple CPUs in parallel.
>
> Given that rstat operates on a per-CPU basis and only aggregates
> statistics in the parent cgroup, there is no compelling reason
> why these statistics cannot be atomic.
> By eliminating the lock during CPU statistics access,
> each CPU can traverse its rstat hierarchy independently, without blocking.
> Synchronization is achieved during parent propagation through
> atomic operations.
>
> This change significantly enhances performance on commit
> 8dcb0ed834a3ec03 ("memcg: cgroup: call css_rstat_updated irrespective of in_nmi()")
> in scenarios where multiple CPUs access CPU rstat within a
> single cgroup hierarchy, yielding a performance improvement of around 40 times.
> Notably, performance for memory and I/O rstats remains unchanged,
> as the lock remains in place for these usages.
>
> Additionally, this patch addresses a race condition detectable
> in the current mainline by KCSAN in __cgroup_account_cputime,
> which occurs when attempting to read a single hierarchy
> from multiple CPUs.
>
> Signed-off-by: Bertrand Wlodarczyk <bertrand.wlodarczyk@...el.com>
This patch breaks the memory controller, as explained in the comments on
the previous version. Also, the response to the tearing issue explained
by JP is not satisfactory.
Please run scripts/faddr2line on css_rstat_flush+0x1b0/0xed0 and
css_rstat_updated+0x8f/0x1a0 to see which field is causing the race.
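For reference, the invocation could look roughly like the sketch below. It
assumes the command is run from the top of the kernel tree that produced the
KCSAN report, and that vmlinux is the unstripped image from that same build
with debug info enabled; the VMLINUX variable and the fallback message are
only illustrative. The offsets are only meaningful for that exact build.

```shell
#!/bin/sh
# Assumption: vmlinux is the unstripped image (built with debug info)
# from the same build that produced the KCSAN report.
VMLINUX="${VMLINUX:-vmlinux}"

if [ -x scripts/faddr2line ] && [ -f "$VMLINUX" ]; then
    # Each func+offset/size argument is resolved to a source file and line.
    ./scripts/faddr2line "$VMLINUX" \
        'css_rstat_flush+0x1b0/0xed0' \
        'css_rstat_updated+0x8f/0x1a0'
else
    # Illustrative fallback so the sketch is harmless outside a kernel tree.
    echo "run from a kernel tree containing scripts/faddr2line and $VMLINUX" >&2
fi
```

The reported source lines should show which field of the stat structure is
accessed on each side of the race.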