lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <a7e6aeb8-9fb1-4570-845c-3ce9946190a2@gmail.com>
Date: Fri, 27 Jun 2025 09:55:54 -0700
From: JP Kobryn <inwardvessel@...il.com>
To: "Wlodarczyk, Bertrand" <bertrand.wlodarczyk@...el.com>,
 Shakeel Butt <shakeel.butt@...ux.dev>
Cc: "tj@...nel.org" <tj@...nel.org>, "hannes@...xchg.org"
 <hannes@...xchg.org>, "mkoutny@...e.com" <mkoutny@...e.com>,
 "cgroups@...r.kernel.org" <cgroups@...r.kernel.org>,
 "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2] cgroup/rstat: change cgroup_base_stat to atomic

On 6/27/25 6:15 AM, Wlodarczyk, Bertrand wrote:
>> The kernel faces scalability issues when multiple userspace programs
>> attempt to read cgroup statistics concurrently.
>>
>> The primary bottleneck is the css_cgroup_lock in cgroup_rstat_flush,
>> which prevents access and updates to the statistics of the css from
>> multiple CPUs in parallel.
>>
>> Given that rstat operates on a per-CPU basis and only aggregates
>> statistics in the parent cgroup, there is no compelling reason why
>> these statistics cannot be atomic.
>> By eliminating the lock during CPU statistics access, each CPU can
>> traverse its rstat hierarchy independently, without blocking.
>> Synchronization is achieved during parent propagation through atomic
>> operations.
>>
>> This change significantly enhances performance on commit
>> 8dcb0ed834a3ec03 ("memcg: cgroup: call css_rstat_updated irrespective
>> of in_nmi()") in scenarios where multiple CPUs accessCPU rstat within
>> a single cgroup hierarchy, yielding a performance improvement of around 40 times.
>> Notably, performance for memory and I/O rstats remains unchanged, as
>> the lock remains in place for these usages.
>>
>> Additionally, this patch addresses a race condition detectable in the
>> current mainline by KCSAN in __cgroup_account_cputime, which occurs
>> when attempting to read a single hierarchy from multiple CPUs.
>>
>> Signed-off-by: Bertrand Wlodarczyk <bertrand.wlodarczyk@...el.com>
> 
>> This patch breaks memory controller as explained in the comments on the previous version.
> 
> Ekhm... no? I addressed the issue and v2 has lock back and surrounding the call to dependent submodules?
> The behavior is the same as before patching.
> 
> In the long term, in my opinion, the atomics should happen also in dependent submodules to eliminate locks
> completely.
> 
>   > Also the response to the tearing issue explained by JP is not satisfying.
> 
> In other words, the claim is: "it's better to stall other cpus in spinlock plus disable IRQ every time in order to
> serve outdated snapshot instead of providing user to the freshest statistics much, much faster".

But they're not really "outdated" are they? Regardless of the wait, once
the lock is acquired they will get the latest snapshot.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ