linux-kernel - Re: [PATCH 6/9] rcu: rcu

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <gqgocomsljkn5sw2jvm432peesjuclfjwd2sasec3kg6tofk5t@ry7rkoo2auek>
Date: Mon, 26 Aug 2024 13:09:15 -0400
From: Kent Overstreet <kent.overstreet@...ux.dev>
To: "Paul E. McKenney" <paulmck@...nel.org>
Cc: rcu@...r.kernel.org, linux-kernel@...r.kernel.org, vbabka@...e.cz, 
	roman.gushchin@...ux.dev, 42.hyeyoo@...il.com, akpm@...ux-foundation.org, cl@...ux.com, 
	mhocko@...nel.org, urezki@...il.com, neeraj.upadhyay@...nel.org
Subject: Re: [PATCH 6/9] rcu: rcu_pending

On Mon, Aug 26, 2024 at 09:01:54AM GMT, Paul E. McKenney wrote:
> But get_state_synchronize_srcu() is still a function call.  But the
> offer of a function that checks multiple grace-period-state cookies in
> one call still stands.

It shouldn't be though, it's just reading a sequence number - I'd much
prefer if it could be at least a static inline.

Which you won't want to do, because it would mean exposing RCU private
data structures; hence my approach of exposing an RCU-only api for
getting a pointer to the sequence number.

> 
> > In my current version of the code, we call __get_state_synchronize_rcu()
> > after we've disabled interrupts; we know the rcu gp seq isn't going to
> > tick while we're in the critical path. But this doesn't apply if it's
> > for SRCU, and I don't want to add if (src) srcu_read_lock() branches to
> > it.
> 
> Actually, disabling interrupts does *not* prevent RCU's grace-period 
> sequence number from changing.  For example, the following really can
> happen:
> 
> o	RCU notices that the current grace period can end.
> 
> o	A CPU disables interrupts here.
> 
> o	A call to get_state_synchronize_rcu() returns a cookie
> 	corresponding to the end of the grace period following the
> 	current one.
> 
> o	RCU ends the current grace period, therefore updating the
> 	grace-period sequence number.
> 
> o	RCU starts a new grace period, therefore updating the
> 	grace-period sequence number once again.
> 
> o	RCU cannot complete this new grace period until that CPU
> 	re-enables interrupts, but it has already updated its grace-period
> 	sequence number not once, but twice.
> 
> So if your code knows that RCU's grace-period sequence number cannot
> change while a given CPU has interrupts disabled, that code has a bug.
> A low-probability bug, perhaps, but if your code is running on enough
> systems, it will make its presence known.

Ok, good to know

> 
> > Not at all essential, the races that result from this are harmless, but
> > if we e.g. decide it's worthwhile to only kick off a gp if it hasn't
> > ticked (i.e. only kick rcu if we're still on seq of the object being
> > enqueued) it'd be nice.
> 
> Why not call get_state_synchronize_rcu(), and ask for a new grace period
> if the value returned is different than that from the previous call?

We don't want to falsely think the object expires later than it actually
does, and have more accumulated work than we need to.

> > Funny, I had the same thoughts trying to read your code... :)
> 
> Amazing how much easier it is to generate new code than to understand
> existing code, isn't it?  ;-)

I would have much preferred if your existing code worked with SRCU. You
think I'm doing this for fun?

> > > Plus, unlike kfree_rcu() post-grace-period handling, call_rcu() callback
> > > functions usually access the memory block passed to them, which means
> > > that they are incurring that per-element cache miss in any case.
> > 
> > True. But this would allow us to prefetch those objects (several
> > iterations in advance).
> 
> I need to see a CPU on which this actually make a significant difference
> before adding this sort of complexity.

We would of course want benchmarks to show that this was worthwhile
before switching call_rcu(), since absent a performance improvement we'd
want to stick with the approach that doesn't allocate memory.

> > Just processing a few items? hmm, would we want to though, when
> > otherwise we'd be calling kfree_bulk()? I think icache locality means
> > we'd generally prefer not to.
> 
> You might not want to yet, but you eventually would want this.

Because?