Message-ID: <613e258c-e74b-46dc-970b-de9c78672c76@efficios.com>
Date: Fri, 11 Oct 2024 11:15:23 -0400
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Boqun Feng <boqun.feng@...il.com>
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
paulmck <paulmck@...nel.org>, linux-kernel <linux-kernel@...r.kernel.org>,
Maged Michael <maged.michael@...il.com>
Subject: Re: [RFC] Synchronized Shared Pointers for the Linux kernel
On 2024-10-11 01:11, Boqun Feng wrote:
> On Thu, Oct 10, 2024 at 03:16:25PM -0400, Mathieu Desnoyers wrote:
>> Hi,
>>
>> I've created a new API (sharedptr.h) for the use-case of
>> providing existence object guarantees (e.g. for Rust)
>> when dereferencing pointers which can be concurrently updated.
>> I call this "Synchronized Shared Pointers".
>>
>> This should be an elegant solution to Greg's refcount
>> existence use-case as well.
>>
>> The current implementation can be found here:
>>
>> https://github.com/compudj/linux-dev/commit/64c3756b88776fe534629c70f6a1d27fad27e9ba
>>
>> Patch added inline below for feedback.
>>
>> Thanks!
>>
>> Mathieu
>>
[...]
>> + */
>> +static inline
>> +struct sharedptr sharedptr_copy_from_sync(const struct syncsharedptr *ssp)
>> +{
>> + struct sharedptr_node *spn, *hp;
>> + struct hazptr_slot *slot;
>> + struct sharedptr sp;
>> +
>> + preempt_disable();
>
> Disabling preemption acts as an RCU read-side critical section, so I
> guess the immediate question is why (or when) not use RCU ;-)

That's a very relevant question indeed! Why use hazard pointers rather
than RCU in this particular use-case?

You are right that I could wrap sharedptr_copy_from_sync() in
rcu_read_lock()/rcu_read_unlock() and pair this with a call_rcu()
in node_release, which would effectively replace hazard pointers
with RCU.

Please keep in mind that the current implementation of this API
is minimalist. I intend to extend it, and that is where the benefits
of hazard pointers over RCU should become clearer:

1) With hazard pointers, AFAIU we can modify sharedptr_delete()
and syncsharedptr_delete() to issue hazptr_scan() _before_
decrementing the reference count to 0, e.g.:

    int old, new;

    WRITE_ONCE(sp->spn, NULL);
    old = atomic_read(&spn->refcount->refs);
    do {
        new = old - 1;
        /* Wait for hazard pointers to drain before reaching 0. */
        if (!new)
            hazptr_scan(&hazptr_domain_sharedptr, spn, NULL);
    } while (!atomic_try_cmpxchg(&spn->refcount->refs, &old, new));
    if (!new)
        sharedptr_node_release(spn);

And therefore modify sharedptr_copy_from_sync() to use refcount_inc()
rather than refcount_inc_not_zero(), because the reference count
can never be 0 while a hazard pointer to the object exists.

This modification would make hazard pointers act as if they
*are* holding a reference on the object.
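
To make point 1) concrete, here is a rough userspace C11 model of the
idea. It is only a sketch under assumptions: struct node, hp_slot,
model_hazptr_scan(), model_copy_from_sync() and model_delete() are
hypothetical names, a single global slot stands in for the per-CPU
slots, and the "release" merely sets a flag. It is not the sharedptr.h
code from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; names are illustrative, not the kernel API. */
struct node {
	atomic_int refcount;
	bool released;	/* set instead of freeing, so the state stays observable */
};

/* A single hazard pointer slot stands in for the per-CPU slots. */
static _Atomic(struct node *) hp_slot;

/* Stand-in for hazptr_scan(): wait until no slot protects @n. */
static void model_hazptr_scan(struct node *n)
{
	while (atomic_load(&hp_slot) == n)
		;	/* a real implementation would relax the CPU or block */
}

/* Reader: publish a hazard pointer, then take a reference unconditionally. */
static struct node *model_copy_from_sync(_Atomic(struct node *) *src)
{
	struct node *n;

	for (;;) {
		n = atomic_load(src);
		atomic_store(&hp_slot, n);	/* storing NULL clears the slot */
		if (!n)
			return NULL;
		if (atomic_load(src) == n)	/* re-validate after publishing */
			break;
	}
	/* Plain increment: the count cannot reach 0 while the slot protects n. */
	atomic_fetch_add(&n->refcount, 1);
	atomic_store(&hp_slot, NULL);
	return n;
}

/* Writer: drain hazard pointers before the count is allowed to reach 0. */
static void model_delete(struct node *n)
{
	int old = atomic_load(&n->refcount);
	int next;

	do {
		next = old - 1;
		if (!next)
			model_hazptr_scan(n);
	} while (!atomic_compare_exchange_weak(&n->refcount, &old, next));
	if (!next)
		n->released = true;
}
```

The ordering is what matters here: the scan happens before the final
decrement is committed, so a reader that has already published its
hazard pointer always observes a nonzero count.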

This improvement brings us to a more important benefit:

2) If we increase the number of available hazptr slots per CPU
(e.g. to 8 or more), then we can start using hazard pointers as
a fast-path replacement for reference counters.

This will allow introducing a new type of sharedptr object, which
could be named "thread sharedptr", meant to hold a reference for a
relatively short period of time, on a single thread, where the thread
is still allowed to be preempted or block while holding the thread
sharedptr.

The per-cpu hazard pointer slots would point to per-thread sharedptr
structures, which would hold the protected hazard pointer slot.

When the application means to keep the reference for longer,
to store it into a data structure or pass it around to another
thread, then it should copy the thread sharedptr to a normal
sharedptr, which will make sure a reference is taken on the
object.

The thread sharedptr is tied to the thread using it. Because the
hazard pointer context would be in a well-defined per-thread area
(rather than just on the stack), we can do the following when
scanning for hazard pointers: force the thread sharedptr to promote
to a reference counter increment on the object, thus allowing the
hazard pointer scan to progress. This allows freeing up the per-cpu
slot immediately.

If all per-cpu hazard pointer slots are used, the thread sharedptr
would automatically fall back to a reference counter.
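
The slot fast path, the fall-back and the promotion done by the scan
can be modelled roughly as follows. This is a single-threaded sketch
that ignores memory ordering and concurrency entirely; struct
thread_sharedptr, tsp_acquire(), tsp_release() and scan_and_promote()
are hypothetical names chosen for illustration, and NR_SLOTS is kept
at 2 only so that the fall-back path is easy to reach.

```c
#include <assert.h>
#include <stddef.h>

#define NR_SLOTS 2	/* assumed small so the fall-back path is easy to hit */

struct node {
	int refcount;
};

/* Hypothetical per-thread handle: protected by a slot or by a reference. */
struct thread_sharedptr {
	struct node *node;
	int slot;	/* index into slots[], or -1 when holding a reference */
};

/* Model of one CPU's hazard pointer slots; each points at a handle. */
static struct thread_sharedptr *slots[NR_SLOTS];

static void tsp_acquire(struct thread_sharedptr *tsp, struct node *n)
{
	tsp->node = n;
	for (int i = 0; i < NR_SLOTS; i++) {
		if (!slots[i]) {
			slots[i] = tsp;	/* fast path: no refcount traffic */
			tsp->slot = i;
			return;
		}
	}
	n->refcount++;		/* all slots busy: fall back to a reference */
	tsp->slot = -1;
}

static void tsp_release(struct thread_sharedptr *tsp)
{
	if (tsp->slot >= 0)
		slots[tsp->slot] = NULL;	/* still slot-protected */
	else
		tsp->node->refcount--;		/* fell back or was promoted */
	tsp->node = NULL;
}

/* Scan: promote every slot protecting @n to a reference, freeing the slot. */
static int scan_and_promote(struct node *n)
{
	int promoted = 0;

	for (int i = 0; i < NR_SLOTS; i++) {
		struct thread_sharedptr *tsp = slots[i];

		if (tsp && tsp->node == n) {
			n->refcount++;
			tsp->slot = -1;	/* the handle now owns a reference */
			slots[i] = NULL;
			promoted++;
		}
	}
	return promoted;
}
```

Because the slot points at the per-thread handle rather than directly
at the object, the scanner can flip the handle to "holds a reference",
and the later release then does the right thing without the owning
thread having been involved in the promotion.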

We could even add a per-CPU timer which would track how old each
per-CPU hazard pointer slot is, and promote it to a reference counter
increment based on its age if needed.
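
A minimal sketch of such age-based promotion, again as a
single-threaded userspace model: struct slot, slot_claim() and
promote_aged_slots() are hypothetical, time is a plain tick counter
passed in by the caller, and notifying the owning thread sharedptr
that it now holds a reference is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>

#define NR_SLOTS 4	/* assumed slot count for this model */

struct node {
	int refcount;
};

struct slot {
	struct node *node;	/* NULL when the slot is free */
	unsigned long stamp;	/* tick at which the slot was claimed */
};

static struct slot slots[NR_SLOTS];

/* Claim a free slot for @n at time @now; returns its index, or -1 if full. */
static int slot_claim(struct node *n, unsigned long now)
{
	for (int i = 0; i < NR_SLOTS; i++) {
		if (!slots[i].node) {
			slots[i].node = n;
			slots[i].stamp = now;
			return i;
		}
	}
	return -1;
}

/* Timer callback model: promote slots held for longer than @max_age ticks. */
static int promote_aged_slots(unsigned long now, unsigned long max_age)
{
	int promoted = 0;

	for (int i = 0; i < NR_SLOTS; i++) {
		if (slots[i].node && now - slots[i].stamp > max_age) {
			slots[i].node->refcount++;	/* slot becomes a reference */
			slots[i].node = NULL;		/* slot is free again */
			promoted++;
		}
	}
	return promoted;
}
```

The point of the age threshold is that short-lived users never touch
the reference count at all, while long-lived ones stop occupying a
scarce per-CPU slot.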

This would effectively allow implementing a "per-thread shared pointer"
fast-path, which would scale better than just reference counters on large
multi-core systems.

Thanks,

Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com