Message-ID: <2ca589bc-2a60-48ef-95a0-d9ee4a814dea@efficios.com>
Date: Sat, 28 Sep 2024 07:33:37 -0400
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Jonas Oberhauser <jonas.oberhauser@...weicloud.com>,
 Boqun Feng <boqun.feng@...il.com>, "Paul E. McKenney" <paulmck@...nel.org>
Cc: linux-kernel@...r.kernel.org, Will Deacon <will@...nel.org>,
 Peter Zijlstra <peterz@...radead.org>, Alan Stern
 <stern@...land.harvard.edu>, John Stultz <jstultz@...gle.com>,
 Linus Torvalds <torvalds@...ux-foundation.org>,
 Frederic Weisbecker <frederic@...nel.org>,
 Joel Fernandes <joel@...lfernandes.org>,
 Josh Triplett <josh@...htriplett.org>, Uladzislau Rezki <urezki@...il.com>,
 Steven Rostedt <rostedt@...dmis.org>, Lai Jiangshan
 <jiangshanlai@...il.com>, Zqiang <qiang.zhang1211@...il.com>,
 Ingo Molnar <mingo@...hat.com>, Waiman Long <longman@...hat.com>,
 Mark Rutland <mark.rutland@....com>, Thomas Gleixner <tglx@...utronix.de>,
 Vlastimil Babka <vbabka@...e.cz>, maged.michael@...il.com,
 Mateusz Guzik <mjguzik@...il.com>, rcu@...r.kernel.org, linux-mm@...ck.org,
 lkmm@...ts.linux.dev
Subject: Re: [RFC PATCH 1/1] hpref: Hazard Pointers with Reference Counter

On 2024-09-28 13:22, Jonas Oberhauser wrote:
> Two more questions below:
> 
> Am 9/21/2024 um 6:42 PM schrieb Mathieu Desnoyers:
>> +#define NR_PERCPU_SLOTS_BITS    3
> 
> Have you measured any advantage of this multi-slot version vs a version 
> with just one normal slot and one emergency slot?

No, I have not. That said, for what I will send as an RFC against the
Linux kernel, I am taking a minimalistic approach that goes even further
in the "simple" direction: there is just the one "emergency slot", irqs
are disabled around use of the HP slot, and promotion to refcount is
done before returning to the caller.
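
For illustration only, the simplified scheme described above could be
sketched in userspace C11 roughly as follows. This is not code from the
patch or from the planned kernel RFC: struct hp_node, hp_slot,
hp_get_ref(), hp_put_ref() and the fake_irq_*() helpers are hypothetical
names, a single global slot stands in for the per-CPU slot, and the irq
helpers are no-ops because userspace cannot disable interrupts.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical refcounted object; not the struct from the patch. */
struct hp_node {
	atomic_int refcount;
	int value;
};

/* Stand-in for the single per-CPU slot (one global here for brevity). */
static _Atomic(struct hp_node *) hp_slot;

/* Stand-ins for the kernel's irq disable/enable; no-ops in userspace. */
static void fake_irq_disable(void) { }
static void fake_irq_enable(void) { }

/*
 * Load *node_p, protect it with the slot just long enough to take a
 * reference, then release the slot before returning.
 * Returns NULL if *node_p is NULL.
 */
static struct hp_node *hp_get_ref(_Atomic(struct hp_node *) *node_p)
{
	struct hp_node *node;

	fake_irq_disable();
	for (;;) {
		node = atomic_load_explicit(node_p, memory_order_relaxed);
		if (!node)
			break;
		/* Publish the hazard pointer. */
		atomic_store(&hp_slot, node);
		/* Re-check: the pointer may have been replaced meanwhile. */
		if (atomic_load(node_p) == node) {
			/* Promote to a reference count while still protected. */
			atomic_fetch_add(&node->refcount, 1);
			break;
		}
	}
	/* The slot is always free again once the caller gets control. */
	atomic_store(&hp_slot, NULL);
	fake_irq_enable();
	return node;
}

/* Drop a reference obtained from hp_get_ref(). */
static void hp_put_ref(struct hp_node *node)
{
	int old = atomic_fetch_sub(&node->refcount, 1);

	assert(old > 0);
}
```

Because the slot is cleared before hp_get_ref() returns, a caller only
ever holds a reference count, never the slot itself, which is why a
single slot is enough in this variant.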

> With just one normal slot, the normal slot version would always be zero, 
> and there'd be no need to increment etc., which might make the common 
> case (no conflict) faster.

Having multiple slots allows preemption while a slot is held. It also
lets HP slot users hold a slot longer without promoting to a refcount
right away.

I even have a patch in my userspace prototype that dynamically adapts
the scan depth (increased by readers, decreased by synchronize, with
hysteresis). This decouples the number of allocated slots from the scan
depth. But I will keep that for later and focus on the simple case
first (single HP slot, only used with irqs off).
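
The adaptive scan depth patch is not shown in this thread. As a rough
sketch of what "increase by reader, decrease by synchronize, with
hysteresis" could look like, assuming a simple consecutive-idle-scan
counter as the hysteresis: NR_SLOTS, SHRINK_PATIENCE, scan_depth,
reader_used_slot() and scan_done() are all hypothetical, and the sketch
ignores the memory ordering a real implementation would need between
readers claiming deep slots and a concurrent shrink.

```c
#include <assert.h>
#include <stdatomic.h>

#define NR_SLOTS	8	/* hypothetical number of allocated slots */
#define SHRINK_PATIENCE	4	/* hypothetical hysteresis threshold */

/* Number of slots synchronize has to scan; always between 1 and NR_SLOTS. */
static atomic_uint scan_depth = 1;

/* Consecutive scans that found the deepest slots idle (scanner-only). */
static unsigned int idle_scans;

/* Reader side: after claiming slot idx, make sure scans reach it. */
static void reader_used_slot(unsigned int idx)
{
	unsigned int depth;

	assert(idx < NR_SLOTS);
	depth = atomic_load(&scan_depth);
	/* Only ever grow here; on CAS failure depth is reloaded. */
	while (depth < idx + 1 &&
	       !atomic_compare_exchange_weak(&scan_depth, &depth, idx + 1))
		;
}

/*
 * Synchronize side: called after a scan with the number of leading
 * slots that were seen in use. Shrink by one slot only after
 * SHRINK_PATIENCE consecutive scans left the deepest slot unused.
 */
static void scan_done(unsigned int used)
{
	unsigned int depth = atomic_load(&scan_depth);

	if (depth <= 1 || used >= depth) {
		idle_scans = 0;
		return;
	}
	if (++idle_scans < SHRINK_PATIENCE)
		return;
	idle_scans = 0;
	/* If a reader grew the depth meanwhile, this CAS fails harmlessly. */
	atomic_compare_exchange_strong(&scan_depth, &depth, depth - 1);
}
```

The point of the counter is that a single quiet scan does not shrink the
depth, so a bursty reader workload does not make the depth oscillate on
every synchronize.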

> 
> Either way I recommend stress testing with just one normal slot to 
> increase the chance of conflict (and hence triggering corner cases) 
> during stress testing.

Good point.

> 
>> +retry:
>> +    node = uatomic_load(node_p, CMM_RELAXED);
>> +    if (!node)
>> +        return false;
>> +    /* Use rseq to try setting current slot hp. Store B. */
>> +    if (rseq_load_cbne_store__ptr(RSEQ_MO_RELAXED, RSEQ_PERCPU_CPU_ID,
>> +                (intptr_t *) &slot->node, (intptr_t) NULL,
>> +                (intptr_t) node, cpu)) {
>> +        slot = &cpu_slots->slots[HPREF_EMERGENCY_SLOT];
>> +        use_refcount = true;
>> +        /*
>> +         * This may busy-wait for another reader using the
>> +         * emergency slot to transition to refcount.
>> +         */
>> +        caa_cpu_relax();
>> +        goto retry;
>> +    }
> 
> I'm not familiar with Linux' preemption model. Can this deadlock if a 
> low-interrupt-level thread is occupying the EMERGENCY slot and a 
> higher-interrupt-level thread is also trying to take it?

This is a userspace prototype. In that case it behaves like a userspace
spinlock: not great in terms of CPU usage, but the waiter should
eventually be unblocked, unless it has an RT priority that prevents the
emergency slot owner from making any progress.

On my TODO list, I have a bullet about integrating with sys_futex to
block on wait, wake up on slot release. I would then use the wait/wakeup
code based on sys_futex already present in liburcu.
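
To make the block-on-wait, wake-on-release idea concrete, here is a
minimal Linux-only sketch. It is not the liburcu wait/wakeup code
mentioned above; it is the classic three-state futex lock, applied to a
32-bit state word that would sit next to the slot (futexes operate on
32-bit words, not on the slot pointer itself). The names slot_state
values, futex_op(), slot_acquire() and slot_release() are hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

enum { SLOT_FREE = 0, SLOT_TAKEN = 1, SLOT_CONTENDED = 2 };

/* The futex syscall works on 32-bit words. */
static_assert(sizeof(atomic_uint) == 4, "futex word must be 32 bits");

/* Thin wrapper: glibc does not export a futex() function. */
static long futex_op(atomic_uint *uaddr, int op, uint32_t val)
{
	return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

/* Take the slot, sleeping in the kernel instead of busy-waiting. */
static void slot_acquire(atomic_uint *state)
{
	unsigned int c = SLOT_FREE;

	/* Fast path: uncontended, no syscall. */
	if (atomic_compare_exchange_strong(state, &c, SLOT_TAKEN))
		return;
	/* Slow path: mark the slot contended so the owner knows to wake us. */
	if (c != SLOT_CONTENDED)
		c = atomic_exchange(state, SLOT_CONTENDED);
	while (c != SLOT_FREE) {
		/* Sleeps only if *state is still SLOT_CONTENDED. */
		futex_op(state, FUTEX_WAIT_PRIVATE, SLOT_CONTENDED);
		c = atomic_exchange(state, SLOT_CONTENDED);
	}
}

/* Release the slot and wake one waiter if anyone was blocked. */
static void slot_release(atomic_uint *state)
{
	if (atomic_exchange(state, SLOT_FREE) == SLOT_CONTENDED)
		futex_op(state, FUTEX_WAKE_PRIVATE, 1);
}
```

In the uncontended case neither function makes a system call, so the
common path stays as cheap as the busy-wait version; only a waiter that
actually finds the slot held pays for the sleep and wakeup.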

Thanks!

Mathieu


> 
> 
> 
> 
> Best wishes,
>    jonas
> 

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

