Message-ID: <6353feeb-c2ab-4ff6-9ea6-04ae5102641d@nvidia.com>
Date: Fri, 19 Dec 2025 10:42:56 -0500
From: Joel Fernandes <joelagnelf@...dia.com>
To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Boqun Feng <boqun.feng@...il.com>
Cc: Joel Fernandes <joel@...lfernandes.org>,
"Paul E. McKenney" <paulmck@...nel.org>, linux-kernel@...r.kernel.org,
Nicholas Piggin <npiggin@...il.com>, Michael Ellerman <mpe@...erman.id.au>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Will Deacon <will@...nel.org>, Peter Zijlstra <peterz@...radead.org>,
Alan Stern <stern@...land.harvard.edu>, John Stultz <jstultz@...gle.com>,
Neeraj Upadhyay <Neeraj.Upadhyay@....com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Frederic Weisbecker <frederic@...nel.org>,
Josh Triplett <josh@...htriplett.org>, Uladzislau Rezki <urezki@...il.com>,
Steven Rostedt <rostedt@...dmis.org>, Lai Jiangshan
<jiangshanlai@...il.com>, Zqiang <qiang.zhang1211@...il.com>,
Ingo Molnar <mingo@...hat.com>, Waiman Long <longman@...hat.com>,
Mark Rutland <mark.rutland@....com>, Thomas Gleixner <tglx@...utronix.de>,
Vlastimil Babka <vbabka@...e.cz>, maged.michael@...il.com,
Mateusz Guzik <mjguzik@...il.com>,
Jonas Oberhauser <jonas.oberhauser@...weicloud.com>, rcu@...r.kernel.org,
linux-mm@...ck.org, lkmm@...ts.linux.dev
Subject: Re: [RFC PATCH v4 3/4] hazptr: Implement Hazard Pointers
On 12/19/2025 10:14 AM, Mathieu Desnoyers wrote:
> On 2025-12-18 19:25, Boqun Feng wrote:
> [...]
>>> And if we pre-populate the 8 slots for each cpu, and thus force
>>> fallback to overflow list:
>>>
>>> hazptr-smp-mb-8-fail 67.1 ns
>>>
>>
>> Thank you! So involving locking seems to hurt performance more than
>> the per-CPU/per-task operations do. This may suggest that enabling
>> PREEMPT_HAZPTR by default has acceptable performance.
>
> Indeed, I can fold it into the hazptr patch and remove the config
> option then.
>
>>
>>> So breaking up the iteration into pieces is not just to handle
>>> busy-waiting, but also to make sure we don't increase the
>>> system latency by holding a raw spinlock (taken with rq lock
>>> held) for more than the little time needed to iterate to the next
>>> node.
>>>
>>
>> I agree that it helps reduce the latency, but I feel that with a scan
>> thread in the picture (so we don't need to busy-wait), the updater-side
>> scan should use an approach that guarantees forward progress, which
>> means we may need to explore solutions for the latency other than the
>> generation counter (e.g. a fine-grained locking hashlist for the
>> overflow list).
>
> Guaranteeing forward progress of synchronize even with a steady stream
> of unrelated hazard pointer additions to and removals from the overflow
> list is something we should aim for, with or without a scan thread.
>
> As is, my current generation scheme does not guarantee this. But we can
> take the liburcu RCU grace period "parity" concept as inspiration [1]
> and introduce a two-list scheme, having hazptr_synchronize flip the
> current "list_add" head while it iterates over the other list. There
> would be one generation counter for each of the two lists.
>
> This would be protected by holding a global mutex across
> hazptr_synchronize. hazptr_synchronize would need to iterate
> on the two lists one after the other, carefully flipping the
> current "addition list" head between the two iterations.
>
> So the worst case that can happen in terms of retries caused by
> generation counter increments is if list entries are deleted while
> the list is being traversed by hazptr_synchronize. Because there
> are no possible concurrent additions to that list, the worst case
> is that the list becomes empty, which bounds the number of retries
> to the number of list elements.
>
> Thoughts ?
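
If I am reading the two-list idea right, it would look roughly like the
sketch below. All names (hp_olists, hp_add_idx, struct hp_oslot,
hp_sync_mutex) are invented for illustration and are not from the patch,
and the scan is shown holding the raw lock across the whole walk for
brevity; your real code would iterate in pieces and use the per-list gen
to detect removals:

	#include <linux/compiler.h>
	#include <linux/list.h>
	#include <linux/mutex.h>
	#include <linux/processor.h>
	#include <linux/spinlock.h>

	struct hp_olist {
		struct list_head	head;
		unsigned long		gen;	/* bumped by removals */
		raw_spinlock_t		lock;
	};

	struct hp_oslot {
		void			*ptr;	/* protected object */
		struct list_head	node;
		struct hp_olist		*ol;	/* list this slot is on */
	};

	static struct hp_olist hp_olists[2] = {
		{ .head = LIST_HEAD_INIT(hp_olists[0].head),
		  .lock = __RAW_SPIN_LOCK_UNLOCKED(hp_olists[0].lock), },
		{ .head = LIST_HEAD_INIT(hp_olists[1].head),
		  .lock = __RAW_SPIN_LOCK_UNLOCKED(hp_olists[1].lock), },
	};
	static int hp_add_idx;			/* list receiving additions */
	static DEFINE_MUTEX(hp_sync_mutex);	/* serializes synchronize */

	/* Reader side: park a hazard pointer on the addition list. */
	static void hp_overflow_add(struct hp_oslot *slot, void *ptr)
	{
		struct hp_olist *ol = &hp_olists[READ_ONCE(hp_add_idx)];

		slot->ptr = ptr;
		slot->ol = ol;
		raw_spin_lock(&ol->lock);
		list_add(&slot->node, &ol->head);
		raw_spin_unlock(&ol->lock);
	}

	static void hp_overflow_del(struct hp_oslot *slot)
	{
		struct hp_olist *ol = slot->ol;

		raw_spin_lock(&ol->lock);
		list_del(&slot->node);
		ol->gen++;	/* lets a piecewise scan see the removal */
		raw_spin_unlock(&ol->lock);
	}

	/* Wait until no slot on @ol protects @ptr. */
	static void hp_scan_list(struct hp_olist *ol, void *ptr)
	{
		struct hp_oslot *slot;
		bool busy;

		do {
			busy = false;
			raw_spin_lock(&ol->lock);
			list_for_each_entry(slot, &ol->head, node) {
				if (READ_ONCE(slot->ptr) == ptr) {
					busy = true;
					break;
				}
			}
			raw_spin_unlock(&ol->lock);
			if (busy)
				cpu_relax();
		} while (busy);
	}

	void hazptr_synchronize(void *ptr)
	{
		int idx;

		mutex_lock(&hp_sync_mutex);
		idx = READ_ONCE(hp_add_idx);
		/* 1) Scan the list that is not receiving additions. */
		hp_scan_list(&hp_olists[!idx], ptr);
		/* 2) Flip: additions now go to the list just scanned. */
		WRITE_ONCE(hp_add_idx, !idx);
		/* 3) Scan the old addition list; it only sees removals. */
		hp_scan_list(&hp_olists[idx], ptr);
		mutex_unlock(&hp_sync_mutex);
	}

The global mutex makes the flip trivially safe, and because additions are
always directed away from the list being scanned, gen can only move due to
removals, which is where the bounded-retry argument comes from.
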
IMHO the overflow case is "special" and should not happen often; otherwise
things are "bad" anyway. I am not sure this kind of complexity will be worth
it unless we know HP forward progress is a real problem. Also, since HP
acquires will be short-lived, how likely are we to be unable to get past a
temporary shortage of slots?
Perhaps the forward-progress problem should be rephrased as follows: if a
reader hits an overflow slot, it should probably be able to get a non-overflow
slot soon, even if the hazard pointer slots are over-subscribed.
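
To make that concrete, a rough reader-side sketch (the slot count, the names,
and the hp_ref bookkeeping are all invented, and it reuses the hypothetical
overflow-list helpers from the sketch above):

	#include <linux/percpu.h>
	#include <linux/preempt.h>

	#define HP_SLOTS_PER_CPU 8

	struct hp_cpu_slots {
		void *slot[HP_SLOTS_PER_CPU];
	};
	static DEFINE_PER_CPU(struct hp_cpu_slots, hp_cpu_slots);

	struct hp_ref {
		void		**slot;		/* per-CPU slot, or NULL */
		struct hp_oslot	oslot;		/* overflow fallback */
	};

	/* Publish @ptr in a free per-CPU slot; return it or NULL.
	 * Caller has preemption disabled. */
	static void **hp_try_percpu(void *ptr)
	{
		struct hp_cpu_slots *s = this_cpu_ptr(&hp_cpu_slots);
		int i;

		for (i = 0; i < HP_SLOTS_PER_CPU; i++) {
			if (!READ_ONCE(s->slot[i])) {
				WRITE_ONCE(s->slot[i], ptr);
				return &s->slot[i];
			}
		}
		return NULL;
	}

	/* Acquire: the overflow list is used only while all of this
	 * CPU's slots are busy.  (The full barrier the real acquire
	 * needs between publishing the hazard and re-validating the
	 * pointer is omitted here.) */
	static void hp_acquire(struct hp_ref *ref, void *ptr)
	{
		preempt_disable();
		ref->slot = hp_try_percpu(ptr);
		preempt_enable();
		if (!ref->slot)
			hp_overflow_add(&ref->oslot, ptr);
	}

	/* Opportunistic upgrade, e.g. from the next acquire attempt:
	 * if still parked on the overflow list and a per-CPU slot has
	 * freed up, migrate so updaters rarely see a long overflow
	 * list. */
	static void hp_try_upgrade(struct hp_ref *ref)
	{
		void **slot;

		if (ref->slot)
			return;
		preempt_disable();
		slot = hp_try_percpu(ref->oslot.ptr);
		preempt_enable();
		if (!slot)
			return;
		/* Order the new publication before the removal so an
		 * updater always observes at least one hazard. */
		smp_mb();
		ref->slot = slot;
		hp_overflow_del(&ref->oslot);
	}

With an opportunistic upgrade like this, updaters should mostly find the
overflow lists empty, so the heavier two-list machinery would only matter
during a rare over-subscribed window.
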
Thanks.