linux-kernel - Re: [RFC PATCH v2 3/4] hp: Implement Hazard Pointers

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20241007104030.GB4879@noisy.programming.kicks-ass.net>
Date: Mon, 7 Oct 2024 12:40:30 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc: Boqun Feng <boqun.feng@...il.com>, linux-kernel@...r.kernel.org,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Nicholas Piggin <npiggin@...il.com>,
	Michael Ellerman <mpe@...erman.id.au>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
	"Paul E. McKenney" <paulmck@...nel.org>,
	Will Deacon <will@...nel.org>,
	Alan Stern <stern@...land.harvard.edu>,
	John Stultz <jstultz@...gle.com>,
	Neeraj Upadhyay <Neeraj.Upadhyay@....com>,
	Frederic Weisbecker <frederic@...nel.org>,
	Joel Fernandes <joel@...lfernandes.org>,
	Josh Triplett <josh@...htriplett.org>,
	Uladzislau Rezki <urezki@...il.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Lai Jiangshan <jiangshanlai@...il.com>,
	Zqiang <qiang.zhang1211@...il.com>, Ingo Molnar <mingo@...hat.com>,
	Waiman Long <longman@...hat.com>,
	Mark Rutland <mark.rutland@....com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Vlastimil Babka <vbabka@...e.cz>, maged.michael@...il.com,
	Mateusz Guzik <mjguzik@...il.com>,
	Jonas Oberhauser <jonas.oberhauser@...weicloud.com>,
	rcu@...r.kernel.org, linux-mm@...ck.org, lkmm@...ts.linux.dev
Subject: Re: [RFC PATCH v2 3/4] hp: Implement Hazard Pointers

On Sat, Oct 05, 2024 at 02:50:17PM -0400, Mathieu Desnoyers wrote:
> On 2024-10-05 18:04, Peter Zijlstra wrote:


> > > +/*
> > > + * hp_allocate: Allocate a hazard pointer.
> > > + *
> > > + * Allocate a hazard pointer slot for @addr. The object existence should
> > > + * be guaranteed by the caller. Expects to be called from preempt
> > > + * disable context.
> > > + *
> > > + * Returns a hazard pointer context.
> > 
> > So you made the WTF'o'meter crack, this here function does not allocate
> > nothing. Naming is bad. At best this is something like
> > try-set-hazard-pointer or somesuch.
> 
> I went with the naming from the 2004 paper from Maged Michael, but I
> agree it could be clearer.
> 
> I'm tempted to go for "hp_try_post()" and "hp_remove()", basically
> "posting" the intent to use a pointer (as in on a metaphorical billboard),
> and removing it when it's done.

For RCU we've taken to using the word: 'publish', no?


> > > +/*
> > > + * hp_dereference_allocate: Dereference and allocate a hazard pointer.
> > > + *
> > > + * Returns a hazard pointer context. Expects to be called from preempt
> > > + * disable context.
> > > + */
> > 
> > More terrible naming. Same as above, but additionally, I would expect a
> > 'dereference' to actually dereference the pointer and have a return
> > value of the dereferenced type.
> 
> hp_dereference_try_post() ?
> 
> > 
> > This function seems to double check and update the hp_ctx thing. I'm not
> > at all sure yet wtf this is doing -- and the total lack of comments
> > aren't helping.
> 
> The hp_ctx contains the outputs.
> 
> The function loads *addr_p to then try_post it into a HP slot. On success,
> it re-reads the *addr_p (with address dependency) and if it still matches,
> use that as output address pointer.
> 
> I'm planning to remove hp_ctx, and just have:
> 
> /*
>  * hp_try_post: Try to post a hazard pointer.
>  *
>  * Post a hazard pointer slot for @addr. The object existence should
>  * be guaranteed by the caller. Expects to be called from preempt
>  * disable context.
>  *
>  * Returns true if post succeeds, false otherwise.
>  */
> static inline
> bool hp_try_post(struct hp_domain *hp_domain, void *addr, struct hp_slot **_slot)
> [...]
> 
> /*
>  * hp_dereference_try_post: Dereference and try to post a hazard pointer.
>  *
>  * Returns a hazard pointer context. Expects to be called from preempt
>  * disable context.
>  */
> static inline
> void *__hp_dereference_try_post(struct hp_domain *hp_domain,
>                                 void * const * addr_p, struct hp_slot **_slot)
> [...]
> 
> #define hp_dereference_try_post(domain, p, slot_p)              \
>         ((__typeof__(*(p))) __hp_dereference_try_post(domain, (void * const *) p, slot_p))

This will compile, but do the wrong thing when p is a regular pointer, no?

> 
> /* Clear the hazard pointer in @slot. */
> static inline
> void hp_remove(struct hp_slot *slot)
> [...]

Differently weird, but better I suppose :-)


> > > +void hp_scan(struct hp_slot __percpu *percpu_slots, void *addr,
> > > +	     void (*retire_cb)(int cpu, struct hp_slot *slot, void *addr))
> > > +{
> > > +	int cpu;
> > > +
> > > +	/*
> > > +	 * Store A precedes hp_scan(): it unpublishes addr (sets it to
> > > +	 * NULL or to a different value), and thus hides it from hazard
> > > +	 * pointer readers.
> > > +	 */
> > > +
> > > +	if (!addr)
> > > +		return;
> > > +	/* Memory ordering: Store A before Load B. */
> > > +	smp_mb();
> > > +	/* Scan all CPUs slots. */
> > > +	for_each_possible_cpu(cpu) {
> > > +		struct hp_slot *slot = per_cpu_ptr(percpu_slots, cpu);
> > > +
> > > +		if (retire_cb && smp_load_acquire(&slot->addr) == addr)	/* Load B */
> > > +			retire_cb(cpu, slot, addr);
> > 
> > Is retirce_cb allowed to cmpxchg the thing?
> 
> It could, but we'd need to make sure the slot is not re-used by another
> hp_try_post() before the current user removes its own post. It would
> need to synchronize with the current HP user (e.g. with IPI).
> 
> I've actually renamed retire_cb to "on_match_cb".

Hmm, I think I see. Would it make sense to pass the expected addr to
hp_remove() and double check we don't NULL out something unexpected? --
maybe just for a DEBUG option.

I'm always seeing the NOHZ_FULL guys hating on this :-)