Message-ID: <aBnZMBJ-OOEXvpUa@google.com>
Date: Tue, 6 May 2025 09:41:04 +0000
From: Yosry Ahmed <yosry.ahmed@...ux.dev>
To: Shakeel Butt <shakeel.butt@...ux.dev>
Cc: Tejun Heo <tj@...nel.org>, Andrew Morton <akpm@...ux-foundation.org>,
	Alexei Starovoitov <ast@...nel.org>,
	Johannes Weiner <hannes@...xchg.org>,
	Michal Hocko <mhocko@...nel.org>,
	Roman Gushchin <roman.gushchin@...ux.dev>,
	Muchun Song <muchun.song@...ux.dev>,
	Michal Koutný <mkoutny@...e.com>,
	Vlastimil Babka <vbabka@...e.cz>,
	Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
	JP Kobryn <inwardvessel@...il.com>, bpf@...r.kernel.org,
	linux-mm@...ck.org, cgroups@...r.kernel.org,
	linux-kernel@...r.kernel.org,
	Meta kernel team <kernel-team@...a.com>
Subject: Re: [RFC PATCH 3/3] cgroup: make css_rstat_updated nmi safe

On Thu, May 01, 2025 at 03:10:20PM -0700, Shakeel Butt wrote:
> On Wed, Apr 30, 2025 at 06:14:28AM -0700, Yosry Ahmed wrote:
> [...]
> > > +
> > > +	if (!_css_rstat_cpu_trylock(css, cpu, &flags)) {
> > 
> > 
> > IIUC this trylock will only fail if a BPF program runs in NMI context
> > and tries to update cgroup stats, interrupting a context that is already
> > holding the lock (i.e. updating or flushing stats).
> > 
> 
> Correct (though note that flushing side can be on a different CPU).
> 
> > How often does this happen in practice tho? Is it worth the complexity?
> 
> This is about correctness, so even a chance of occurrence needs the
> solution.

Right, my question was more about the need to special case NMIs, see
below.

> 
> > 
> > I wonder if it's better if we make css_rstat_updated() inherently
> > lockless instead.
> > 
> > What if css_rstat_updated() always just adds to a lockless tree,
> 
> Here I assume you meant lockless list instead of tree.

Yeah, in a sense. I meant using lockless lists to implement the rstat
tree instead of normal linked lists.

> 
> > and we
> > defer constructing the proper tree to the flushing side? This should
> > make updates generally faster and avoids locking or disabling interrupts
> > in the fast path. We essentially push more work to the flushing side.
> > 
> > We may be able to consolidate some of the code too if all the logic
> > manipulating the tree is on the flushing side.
> > 
> > WDYT? Am I missing something here?
> > 
> 
> Yes this can be done but I don't think we need to tie that to current
> series. I think we can start with lockless in the nmi context and then
> iteratively make css_rstat_updated() lockless for all contexts.

My question is basically whether it would be simpler to make it all
lockless than to special-case NMIs. With this patch we have two
different paths and a deferred list that we process at a later point. I
think it may be simpler if we just make it all lockless to begin with.
Then we would have a single path and no special deferred processing.
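
To make the idea concrete, here is a rough userspace C11 sketch of the
single-path scheme. It is not kernel code and not what the patch does:
struct node, struct pending_list, mark_updated() and flush() are made-up
names, and the "queued" flag is an assumption about how duplicate
insertions could be avoided. In the kernel the push and detach steps
would presumably map onto llist_add() and llist_del_all().

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-CPU css node; not the kernel's struct. */
struct node {
	int id;
	atomic_bool queued;	/* true while the node sits on the pending list */
	struct node *next;	/* link in the pending list */
};

/* Hypothetical per-CPU list of nodes updated since the last flush. */
struct pending_list {
	_Atomic(struct node *) first;
};

/*
 * Update side: no lock, no irq disabling, same path for every context.
 * Returns true if this call queued the node, false if it was already queued.
 */
static bool mark_updated(struct pending_list *pl, struct node *n)
{
	/* Only the caller that flips queued from false to true pushes. */
	if (atomic_exchange(&n->queued, true))
		return false;

	struct node *old = atomic_load(&pl->first);
	do {
		n->next = old;	/* on CAS failure, old is refreshed and we retry */
	} while (!atomic_compare_exchange_weak(&pl->first, &old, n));
	return true;
}

/*
 * Flush side: detach the whole list in one atomic step, then do the
 * heavier work. Returns the number of nodes that were pending.
 */
static int flush(struct pending_list *pl)
{
	struct node *n = atomic_exchange(&pl->first, NULL);
	int count = 0;

	while (n) {
		/* Read the link first: once queued is cleared, n may be re-pushed. */
		struct node *next = n->next;

		/* Clear before reading stats so a racing update re-queues the node. */
		atomic_store(&n->queued, false);
		/* A real flusher would build the updated tree and aggregate here. */
		count++;
		n = next;
	}
	return count;
}
```

The update side is one exchange plus one compare-and-swap loop, and an
updater that interrupts another one just makes the interrupted CAS
retry. All the ordering and tree construction work lands on the flusher.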

WDYT?
