Message-ID: <edbe45b3-0f85-48d2-8d4f-784781a8dee5@efficios.com>
Date: Fri, 19 Dec 2025 10:14:18 -0500
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Boqun Feng <boqun.feng@...il.com>
Cc: Joel Fernandes <joel@...lfernandes.org>,
"Paul E. McKenney" <paulmck@...nel.org>, linux-kernel@...r.kernel.org,
Nicholas Piggin <npiggin@...il.com>, Michael Ellerman <mpe@...erman.id.au>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Will Deacon <will@...nel.org>, Peter Zijlstra <peterz@...radead.org>,
Alan Stern <stern@...land.harvard.edu>, John Stultz <jstultz@...gle.com>,
Neeraj Upadhyay <Neeraj.Upadhyay@....com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Frederic Weisbecker <frederic@...nel.org>,
Josh Triplett <josh@...htriplett.org>, Uladzislau Rezki <urezki@...il.com>,
Steven Rostedt <rostedt@...dmis.org>, Lai Jiangshan
<jiangshanlai@...il.com>, Zqiang <qiang.zhang1211@...il.com>,
Ingo Molnar <mingo@...hat.com>, Waiman Long <longman@...hat.com>,
Mark Rutland <mark.rutland@....com>, Thomas Gleixner <tglx@...utronix.de>,
Vlastimil Babka <vbabka@...e.cz>, maged.michael@...il.com,
Mateusz Guzik <mjguzik@...il.com>,
Jonas Oberhauser <jonas.oberhauser@...weicloud.com>, rcu@...r.kernel.org,
linux-mm@...ck.org, lkmm@...ts.linux.dev
Subject: Re: [RFC PATCH v4 3/4] hazptr: Implement Hazard Pointers
On 2025-12-18 19:25, Boqun Feng wrote:
[...]
>> And if we pre-populate the 8 slots for each cpu, and thus force
>> fallback to overflow list:
>>
>> hazptr-smp-mb-8-fail 67.1 ns
>>
>
> Thank you! So involving locking seems to hurt performance more than
> per-CPU/per-task operations. This may suggest that enabling
> PREEMPT_HAZPTR by default has acceptable performance.
Indeed, I can fold it into the hazptr patch and remove the config
option then.
>
>> So breaking up the iteration into pieces is not just to handle
>> busy-waiting, but also to make sure we don't increase the
>> system latency by holding a raw spinlock (taken with rq lock
>> held) for more than the little time needed to iterate to the next
>> node.
>>
>
> I agree that it helps reduce the latency, but I feel like with a scan
> thread in the picture (and no need to busy-wait), we should use a
> forward-progress-guaranteed approach for the updater-side scan, which
> means we may need to explore other solutions for the latency (e.g. a
> fine-grained locked hash list for the overflow list) than the generation
> counter.
Guaranteeing forward progress of hazptr_synchronize even with a steady
stream of unrelated hazard pointer additions to and removals from the
overflow list is something we should aim for, with or without a scan
thread.
As is, my current generation scheme does not guarantee this. But we can
take inspiration from the liburcu RCU grace period "parity" concept [1]
and introduce a two-list scheme, where hazptr_synchronize flips the
current "list_add" head while it iterates on the other list. There
would be one generation counter for each of the two lists.
This would be protected by holding a global mutex across
hazptr_synchronize. hazptr_synchronize would need to iterate
on the two lists one after the other, carefully flipping the
current "addition list" head between the two iterations.
So the worst case that can happen in terms of retries caused by
generation counter increments is if list entries are deleted while
the list is being traversed by hazptr_synchronize. Because there
are no possible concurrent additions to that list, the worst case
is that the list becomes empty, which bounds the number of retries
to the number of list elements.
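In case it helps the discussion, here is a rough sketch of the shape I
have in mind. All names (hazptr_overflow, hazptr_scan_one_list,
hazptr_sync_mutex, ...) are made up for illustration, and the per-list
scan is only declared, not implemented; this is meant to show the
flip/scan ordering, not the actual implementation:

#include <linux/list.h>
#include <linux/mutex.h>
#include <linux/spinlock.h>

/* Hypothetical overflow state: two lists, one generation counter each. */
struct hazptr_overflow {
	struct list_head	head[2];	/* overflow lists */
	unsigned long		gen[2];		/* generation counter per list */
	unsigned int		add_idx;	/* list currently targeted by list_add() */
	raw_spinlock_t		lock;		/* protects heads, counters, add_idx */
};

static struct hazptr_overflow hazptr_overflow = {
	.head = { LIST_HEAD_INIT(hazptr_overflow.head[0]),
		  LIST_HEAD_INIT(hazptr_overflow.head[1]) },
	.lock = __RAW_SPIN_LOCK_UNLOCKED(hazptr_overflow.lock),
};

/* Serializes updaters so a single flip/scan sequence is in flight. */
static DEFINE_MUTEX(hazptr_sync_mutex);

/*
 * Scan one list, restarting from the head whenever gen[idx] changes.
 * Body elided: only removals can happen concurrently, so the number
 * of restarts is bounded by the number of elements initially present.
 */
static void hazptr_scan_one_list(struct hazptr_overflow *ov,
				 unsigned int idx, void *addr);

void hazptr_synchronize(void *addr)
{
	struct hazptr_overflow *ov = &hazptr_overflow;
	unsigned long flags;
	unsigned int old_idx;

	mutex_lock(&hazptr_sync_mutex);

	/* Redirect new overflow additions to the other list. */
	raw_spin_lock_irqsave(&ov->lock, flags);
	old_idx = ov->add_idx;
	ov->add_idx = old_idx ^ 1;
	raw_spin_unlock_irqrestore(&ov->lock, flags);

	/* Scan the list which no longer receives additions. */
	hazptr_scan_one_list(ov, old_idx, addr);

	/* Flip back, then scan the other list the same way. */
	raw_spin_lock_irqsave(&ov->lock, flags);
	ov->add_idx = old_idx;
	raw_spin_unlock_irqrestore(&ov->lock, flags);

	hazptr_scan_one_list(ov, old_idx ^ 1, addr);

	mutex_unlock(&hazptr_sync_mutex);
}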
Thoughts?
Thanks,
Mathieu
[1] https://github.com/urcu/userspace-rcu/blob/master/src/urcu.c#L409
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com