Open Source and information security mailing list archives
Message-ID: <besg7pkhxa35fskaqcte2cplnkvr4nfpfivp6emc37ghkmdlmt@sdmuejz5u63d>
Date: Tue, 13 May 2025 11:09:30 -0700
From: Shakeel Butt <shakeel.butt@...ux.dev>
To: Vlastimil Babka <vbabka@...e.cz>
Cc: Andrew Morton <akpm@...ux-foundation.org>, 
	Johannes Weiner <hannes@...xchg.org>, Michal Hocko <mhocko@...nel.org>, 
	Roman Gushchin <roman.gushchin@...ux.dev>, Muchun Song <muchun.song@...ux.dev>, 
	Alexei Starovoitov <ast@...nel.org>, Sebastian Andrzej Siewior <bigeasy@...utronix.de>, 
	Harry Yoo <harry.yoo@...cle.com>, Yosry Ahmed <yosry.ahmed@...ux.dev>, bpf@...r.kernel.org, 
	linux-mm@...ck.org, cgroups@...r.kernel.org, linux-kernel@...r.kernel.org, 
	Meta kernel team <kernel-team@...a.com>, Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Subject: Re: [RFC PATCH 1/7] memcg: memcg_rstat_updated re-entrant safe
 against irqs

On Tue, May 13, 2025 at 12:22:28PM +0200, Vlastimil Babka wrote:
> On 5/13/25 05:13, Shakeel Butt wrote:
> > The function memcg_rstat_updated() is used to track the memcg stats
> > updates for optimizing the flushes. At the moment, it is not re-entrant
> > safe, so the callers disable irqs before calling it. However, to achieve
> > the goal of updating memcg stats without irqs, memcg_rstat_updated()
> > needs to be re-entrant safe against irqs.
> > 
> > This patch makes memcg_rstat_updated() re-entrant safe against irqs.
> > However, it uses atomic_* ops which, on x86, add a lock prefix to the
> > instructions. Since this is per-cpu data, the this_cpu_* ops are
> > preferred. However the percpu pointer is stored in struct mem_cgroup and
> > doing the upward traversal through struct mem_cgroup may cause two cache
> > misses as compared to traversing through struct memcg_vmstats_percpu
> > pointer.
> > 
> > NOTE: explore whether there is an alternative to atomic_* ops without
> > the lock prefix.
> 
> local_t might be what you want here
> https://docs.kernel.org/core-api/local_ops.html
> 
> Or maybe just add __percpu to parent like this?
> 
> struct memcg_vmstats_percpu {
> ...
>         struct memcg_vmstats_percpu __percpu *parent;
> ...
> };
> 
> Yes, it means each cpu's struct memcg_vmstats_percpu instance will
> actually store the same value (the percpu offset) instead of the
> cpu-specific parent pointer, which might seem wasteful. But AFAIK this_cpu_*
> is optimized enough thanks to the segment register usage, that it doesn't
> matter? It shouldn't cause any extra cache miss you worry about, IIUC?
> 
> With that I think you could refactor that code to use e.g.
> this_cpu_add_return() and this_cpu_xchg() on the stats_updates and obtain
> the parent "pointer" in a way that's also compatible with these operations.
> 

Thanks, I will try both of these and see which one looks better.
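
For reference, the "same parent value on every cpu" idea can be modelled
in userspace roughly as below. This is only a sketch under loud
assumptions, not the kernel code: pcpu_area[][], pending[], BATCH,
NO_PARENT, setup_groups() and stats_updated() are hypothetical names, a
plain array index stands in for the percpu offset, and C11 relaxed
atomics stand in for this_cpu_add_return() and this_cpu_xchg(). On x86
the C11 atomics are lock-prefixed, so the sketch shows the structure of
the update path only, not its cost.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

#define NR_CPUS   2
#define NR_GROUPS 3
#define NO_PARENT (-1)
#define BATCH     64   /* hypothetical flush threshold */

/* Hypothetical stand-in for struct memcg_vmstats_percpu. */
struct stats_pcpu {
	atomic_int stats_updates; /* per-cpu count of unflushed updates */
	int parent;               /* same value on every cpu, like a percpu offset */
};

/* One "percpu area" per cpu; the group index plays the role of the offset. */
static struct stats_pcpu pcpu_area[NR_CPUS][NR_GROUPS];
/* Per-group totals shared by all cpus. */
static atomic_long pending[NR_GROUPS];

static void setup_groups(void)
{
	/* Chain: group 2 -> group 1 -> group 0 (root). */
	static const int parents[NR_GROUPS] = { NO_PARENT, 0, 1 };

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		for (int g = 0; g < NR_GROUPS; g++) {
			atomic_init(&pcpu_area[cpu][g].stats_updates, 0);
			pcpu_area[cpu][g].parent = parents[g];
		}
	}
	for (int g = 0; g < NR_GROUPS; g++)
		atomic_init(&pending[g], 0);
}

/* Walk from group up to the root, touching only this cpu's instances. */
static void stats_updated(int cpu, int group, int val)
{
	int delta = abs(val);

	assert(cpu >= 0 && cpu < NR_CPUS);
	assert(group >= 0 && group < NR_GROUPS);

	for (int g = group; g != NO_PARENT; g = pcpu_area[cpu][g].parent) {
		struct stats_pcpu *statc = &pcpu_area[cpu][g];
		/* Stand-in for this_cpu_add_return(). */
		int updates = atomic_fetch_add_explicit(&statc->stats_updates,
							delta, memory_order_relaxed) + delta;

		if (updates < BATCH)
			continue;

		/* Stand-in for this_cpu_xchg(): claim the batch exactly once. */
		updates = atomic_exchange_explicit(&statc->stats_updates, 0,
						   memory_order_relaxed);
		if (updates)
			atomic_fetch_add_explicit(&pending[g], updates,
						  memory_order_relaxed);
	}
}
```

The exchange returns whatever has accumulated at that instant, so a
re-entrant caller that lands between the add and the exchange either has
its delta folded into this batch or starts the next one; nothing is
counted twice or lost.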

> That is unless we also want nmi safety, then we're back to the issue of the
> previous series...

Nah, just irq safety for now, and thanks a lot for the quick feedback and
review.
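
As a loose userspace analogy for "irq safety", a signal handler can play
the role of the interrupt: it runs on the same thread and updates the
same counter as the code it interrupts. The sketch below is an
assumption-laden illustration, not kernel code: account(), fake_irq()
and run_demo() are hypothetical names, and it assumes atomic_int is
lock-free on the target so that it may be used from a signal handler.
The signal is raised between updates, so it only shows the handler and
the main path sharing one counter; safety against interruption at an
arbitrary point comes from the read-modify-write being a single atomic
operation.

```c
#include <signal.h>
#include <stdatomic.h>

/* Counter shared by the normal path and the "interrupt" handler. */
static atomic_int stats_updates;

/* Single atomic read-modify-write, so a handler cannot split it. */
static void account(int val)
{
	atomic_fetch_add_explicit(&stats_updates, val, memory_order_relaxed);
}

/* Plays the role of an irq handler that re-enters the update path. */
static void fake_irq(int signo)
{
	/* Re-install in case the platform resets the handler on delivery. */
	signal(signo, fake_irq);
	account(1);
}

/* Returns the final counter value, or -1 if the handler cannot be set. */
static int run_demo(int iterations)
{
	atomic_store(&stats_updates, 0);
	if (signal(SIGUSR1, fake_irq) == SIG_ERR)
		return -1;

	for (int i = 0; i < iterations; i++) {
		account(1);      /* update from the normal path */
		raise(SIGUSR1);  /* "interrupt" arrives, handler updates too */
	}

	signal(SIGUSR1, SIG_DFL);
	return atomic_load(&stats_updates);
}
```

With one update from each side per iteration, run_demo(n) is expected
to return 2 * n.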
