Message-ID: <b045dc42-233f-4bb9-8619-6a688c05b7ae@nvidia.com>
Date: Thu, 8 Jan 2026 18:34:49 -0500
From: Joel Fernandes <joelagnelf@...dia.com>
To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
 Frederic Weisbecker <frederic@...nel.org>
Cc: Boqun Feng <boqun.feng@...il.com>, Joel Fernandes
 <joel@...lfernandes.org>, "Paul E. McKenney" <paulmck@...nel.org>,
 linux-kernel@...r.kernel.org, Nicholas Piggin <npiggin@...il.com>,
 Michael Ellerman <mpe@...erman.id.au>,
 Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
 Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
 Will Deacon <will@...nel.org>, Peter Zijlstra <peterz@...radead.org>,
 Alan Stern <stern@...land.harvard.edu>, John Stultz <jstultz@...gle.com>,
 Neeraj Upadhyay <Neeraj.Upadhyay@....com>,
 Linus Torvalds <torvalds@...ux-foundation.org>,
 Andrew Morton <akpm@...ux-foundation.org>,
 Josh Triplett <josh@...htriplett.org>, Uladzislau Rezki <urezki@...il.com>,
 Steven Rostedt <rostedt@...dmis.org>, Lai Jiangshan
 <jiangshanlai@...il.com>, Zqiang <qiang.zhang1211@...il.com>,
 Ingo Molnar <mingo@...hat.com>, Waiman Long <longman@...hat.com>,
 Mark Rutland <mark.rutland@....com>, Thomas Gleixner <tglx@...utronix.de>,
 Vlastimil Babka <vbabka@...e.cz>, maged.michael@...il.com,
 Mateusz Guzik <mjguzik@...il.com>,
 Jonas Oberhauser <jonas.oberhauser@...weicloud.com>, rcu@...r.kernel.org,
 linux-mm@...ck.org, lkmm@...ts.linux.dev
Subject: Re: [RFC PATCH v4 3/4] hazptr: Implement Hazard Pointers



On 1/8/2026 11:45 AM, Mathieu Desnoyers wrote:
> On 2026-01-08 11:34, Frederic Weisbecker wrote:
>> On Fri, Dec 19, 2025 at 09:22:19AM -0500, Mathieu Desnoyers wrote:
>>> On 2025-12-18 19:43, Boqun Feng wrote:
>>>> On Thu, Dec 18, 2025 at 12:35:18PM -0500, Mathieu Desnoyers wrote:
>>>> [...]
>>>>>> Could you utilize this[1] to see a
>>>>>> comparison of the reader-side performance against RCU/SRCU?
>>>>>
>>>>> Good point! Let's see.
>>>>>
>>>>> On an AMD 2x EPYC 9654 96-Core Processor with 192 cores,
>>>>> hyperthreading disabled,
>>>>> CONFIG_PREEMPT=y,
>>>>> CONFIG_PREEMPT_RCU=y,
>>>>> CONFIG_PREEMPT_HAZPTR=y.
>>>>>
>>>>> scale_type                 ns
>>>>> ------------------------------
>>>>> hazptr-smp-mb             13.1   <- this implementation
>>>>> hazptr-barrier            11.5   <- replace smp_mb() on acquire with
>>>>>                                     barrier(); requires IPIs on synchronize
>>>>> hazptr-smp-mb-hlist       12.7   <- replace per-task hp context and per-cpu
>>>>>                                     overflow lists with an hlist
>>>>> rcu                       17.0
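
(For anyone comparing the variants above: as I understand it, the difference is
confined to the reader-side acquire. Roughly, as a hand-written sketch rather
than the patch's actual code, with pp and hp_slot as placeholder names:

	/* Protect the object at *pp with a hazard pointer before using it. */
	do {
		p = READ_ONCE(*pp);
		WRITE_ONCE(*hp_slot, p);	/* publish the protection */
		smp_mb();	/* hazptr-smp-mb: order the publish against the
				 * re-check below; the hazptr-barrier variant
				 * demotes this to barrier() and has the
				 * synchronize side send IPIs instead */
	} while (p != READ_ONCE(*pp));	/* re-validate, retry if it changed */

so the ~1.6 ns delta between those two variants is essentially the cost of that
full barrier on this machine.)
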
>>>>
>>>> Hmm.. now looking back, how is it possible that hazptr is faster than
>>>> RCU on the reader-side? Because a grace period was happening and
>>>> triggered rcu_read_unlock_special()? This is actually more interesting.
>>> So I could be entirely misreading the code, but, we have:
>>>
>>> rcu_flavor_sched_clock_irq():
>>> [...]
>>>          /* If GP is oldish, ask for help from rcu_read_unlock_special(). */
>>>          if (rcu_preempt_depth() > 0 &&
>>>              __this_cpu_read(rcu_data.core_needs_qs) &&
>>>              __this_cpu_read(rcu_data.cpu_no_qs.b.norm) &&
>>>              !t->rcu_read_unlock_special.b.need_qs &&
>>>              time_after(jiffies, rcu_state.gp_start + HZ))
>>>                  t->rcu_read_unlock_special.b.need_qs = true;
>>>
>>> which means we set need_qs = true as a result from observing
>>> cpu_no_qs.b.norm == true.
>>>
>>> This is sufficient to trigger calls (plural) to rcu_read_unlock_special()
>>> from __rcu_read_unlock.
>>>
>>> But then if we look at rcu_preempt_deferred_qs_irqrestore()
>>> which we would expect to clear the rcu_read_unlock_special.b.need_qs
>>> state, we have this:
>>>
>>>          special = t->rcu_read_unlock_special;
>>>          if (!special.s && !rdp->cpu_no_qs.b.exp) {
>>>                  local_irq_restore(flags);
>>>                  return;
>>>          }
>>>          t->rcu_read_unlock_special.s = 0;
>>>
>>> which skips over clearing the state unless there is an expedited
>>> grace period required.
>>>
>>> So unless I'm missing something, we should _also_ clear that state
>>> when it's invoked after rcu_flavor_sched_clock_irq(), so that subsequent
>>> __rcu_read_unlock() calls don't all call into rcu_read_unlock_special().
>>>
>>> I'm adding a big warning about sleep deprivation and possibly
>>> misunderstanding the whole thing. What am I missing?
>>
>> As far as I can tell, this skips clearing the state if the state is
>> already cleared. Or am I even more sleep deprived than you? :o)
> 
> No, you are right. The (!x && !y) pattern confused me, but the
> code is correct. Good thing I've put a warning about sleep
> deprivation. ;-)
> 
> Sorry for the noise.

Right, I think this can happen when, after one rcu_flavor_sched_clock_irq() has
set special.b.need_qs, a subsequent rcu_flavor_sched_clock_irq() races with the
reader's rcu_read_unlock() and interrupts rcu_read_unlock_special() before it
can disable interrupts:

rcu_read_unlock()
 -> rcu_read_lock_nesting--;
  -> nesting == 0 and special is set.

   <interrupted by sched clock>
      -> rcu_flavor_sched_clock_irq()
         -> rcu_preempt_deferred_qs_irqrestore()
            -> clears rcu_read_unlock_special.s
   <interrupt returned>

     -> rcu_read_unlock_special()
       -> local_irq_save(flags);  // too late
          -> rcu_preempt_deferred_qs_irqrestore()
             -> early return (special already clear).
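
To make the window concrete, this is roughly how I picture the unlock path (a
simplified sketch from memory, not the verbatim kernel code; the nesting
handling in particular is abbreviated):

	void __rcu_read_unlock(void)
	{
		struct task_struct *t = current;

		barrier();	/* critical section before exit code */
		if (--t->rcu_read_lock_nesting == 0) {
			barrier();
			/*
			 * A sched-clock irq landing anywhere between here and
			 * the local_irq_save() inside rcu_read_unlock_special()
			 * can run rcu_preempt_deferred_qs_irqrestore() and
			 * clear ->rcu_read_unlock_special, so the call below
			 * only takes the early-return path.
			 */
			if (unlikely(READ_ONCE(t->rcu_read_unlock_special.s)))
				rcu_read_unlock_special(t);  /* irqs still on */
		}
	}

So as far as I can tell the slow path does get entered, but it just returns
early; nothing is lost, it only costs an extra function call on the unlock path.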

thanks,

 - Joel

