linux-kernel - Re: [PATCH v2] locking/osq_lock: Optimize osq

Message-ID: <802f42ae-a49b-49e5-9cb3-53f26e57cbfb@redhat.com>
Date: Mon, 18 Mar 2024 10:58:11 -0400
From: Waiman Long <longman@...hat.com>
To: David Laight <David.Laight@...LAB.COM>, 'Guo Hui' <guohui@...ontech.com>,
 "peterz@...radead.org" <peterz@...radead.org>,
 "mingo@...hat.com" <mingo@...hat.com>, "will@...nel.org" <will@...nel.org>,
 "boqun.feng@...il.com" <boqun.feng@...il.com>
Cc: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2] locking/osq_lock: Optimize osq_lock performance using
 per-NUMA


On 3/18/24 05:47, David Laight wrote:
> From: Guo Hui
>> Sent: 18 March 2024 05:50
>>
>> Changes in version v1:
>> The queue is divided according to NUMA nodes,
>> but the tail of each NUMA node is still stored
>> in the structure optimistic_spin_queue.
> The description should be before any 'changes'.
> The changes between versions don't go into the commit message.
>
> Does this change affect a real workload, or just some benchmark?
>
> In reality you don't want a lot of threads waiting on a single
> lock (of any kind).
> So if a real workload is getting a long queue of waiters on
> an OSQ lock then the underlying code really needs fixing to
> 'not do that' (either by changing the way the lock is held
> or acquired).
That is true.
>
> The whole osq lock is actually quite strange.
> (I worked out how it all worked a while ago.)
> It is an ordered queue of threads waiting for the thread
> spinning on a mutex/rwlock to either obtain the mutex or
> to give up spinning and sleep.
> I suspect that the main benefit over spinning on the mutex
> itself is the fact that it is ordered.
> It also removes the 'herd of wildebeest' doing a cmpxchg - but
> one will win and the others go back to a non-locked poll.

The main benefit of spinning instead of sleeping is that it eliminates
the task wakeup latency. Think of it this way: optimistic spinning makes
a mutex behave more like a spinlock when none of the lock contenders is
going to sleep.

The osq_lock code exists to eliminate the lock cacheline bouncing and
contention that hurt performance when there are many spinners. The
ordering is nice from a fairness point of view, but it is not the main
motivation for osq_lock.
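
To illustrate the cacheline point, here is a minimal userspace sketch of an
MCS-style queue lock, assuming C11 atomics and pthreads. It is not the
kernel's osq_lock code: the names (qnode, qlock, q_lock, q_unlock, run_demo)
are made up for this example, sched_yield() stands in for cpu_relax(), and
the ability of a queued spinner to unqueue itself when it needs to
reschedule is left out. What it shows is that each waiter spins on a flag
in its own node, so the shared tail word is touched only once per
acquisition instead of being hammered by every spinner.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* One node per waiter; each waiter spins only on its own 'granted' flag. */
struct qnode {
    _Atomic(struct qnode *) next;
    atomic_bool granted;
};

/* The only shared word is the queue tail. */
struct qlock {
    _Atomic(struct qnode *) tail;
};

static void q_lock(struct qlock *l, struct qnode *me)
{
    atomic_store(&me->next, (struct qnode *)NULL);
    atomic_store(&me->granted, false);

    /* Single atomic swap on the shared tail to join the queue. */
    struct qnode *prev = atomic_exchange(&l->tail, me);
    if (prev == NULL)
        return;                 /* queue was empty: lock acquired */

    atomic_store(&prev->next, me);
    /* Local spinning: only our own node is polled. */
    while (!atomic_load(&me->granted))
        sched_yield();          /* stand-in for cpu_relax() */
}

static void q_unlock(struct qlock *l, struct qnode *me)
{
    struct qnode *next = atomic_load(&me->next);

    if (next == NULL) {
        /* No visible successor: try to mark the queue empty. */
        struct qnode *expected = me;
        if (atomic_compare_exchange_strong(&l->tail, &expected,
                                           (struct qnode *)NULL))
            return;
        /* A successor swapped the tail but has not linked itself yet. */
        while ((next = atomic_load(&me->next)) == NULL)
            sched_yield();
    }
    /* Hand over by writing to the successor's node only. */
    atomic_store(&next->granted, true);
}

#define NTHREADS 4
#define NITERS   10000

static struct qlock demo_lock;
static long demo_counter;       /* protected by demo_lock */

static void *worker(void *arg)
{
    struct qnode me;            /* per-thread queue node on the stack */
    (void)arg;
    for (int i = 0; i < NITERS; i++) {
        q_lock(&demo_lock, &me);
        demo_counter++;
        q_unlock(&demo_lock, &me);
    }
    return NULL;
}

/* Runs the workers and returns the final counter value. */
static long run_demo(void)
{
    pthread_t t[NTHREADS];

    demo_counter = 0;
    atomic_store(&demo_lock.tail, (struct qnode *)NULL);
    for (int i = 0; i < NTHREADS; i++) {
        int rc = pthread_create(&t[i], NULL, worker, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(t[i], NULL);
    return demo_counter;
}
```

Calling run_demo() from a main() should return NTHREADS * NITERS if mutual
exclusion holds. The FIFO ordering falls out of the queue structure; the
local spinning is the part that addresses the cacheline bouncing.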

>
> Are the gains you are seeing from the osq-lock code itself,
> or because the thread that ultimately holds the mutex is running
> on the same NUMA node as the previous thread that held the mutex?
>
> One thing I did notice is if the process holding the mutex
> sleeps there is no way to get all the osq spinners to
> sleep at once. They each obtain the osq-lock, realise the
> need to sleep, and release it in turn.
> That is going to take a while with a long queue.
That is true too.
>
> I didn't look at the mutex/rwlock code (I'm sure they
> could be a lot more common - a mutex is a rwlock that
> only has writers!) but if one thread detects that it
> needs to be pre-empted it takes itself out of the osq-lock
> and, presumably, sleeps on the mutex.
> Unless that stops any other threads being added to the osq-lock
> won't it get completely starved?

Both mutex and rwsem have a lock handoff mechanism that disables
optimistic spinning to avoid starvation of sleeping waiters.
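
As a rough illustration of the handoff idea, here is a toy, single-threaded
model in C. It is only a sketch under simplifying assumptions: the struct
and function names (toy_mutex, toy_spin_trylock, toy_request_handoff,
toy_unlock, toy_demo) are hypothetical, there is a single sleeping waiter,
and no real atomics or scheduling are involved. The kernel keeps a handoff
flag (MUTEX_FLAG_HANDOFF for mutexes) in the lock word itself; the model
only mirrors the effect that once handoff is requested, the unlocker passes
ownership straight to the waiter and spinners cannot steal the lock.

```c
#include <assert.h>
#include <stdbool.h>

enum { NO_OWNER = 0 };

struct toy_mutex {
    int  owner;         /* task id of the holder, NO_OWNER when free */
    bool handoff;       /* set by a sleeping waiter that keeps losing */
    int  first_waiter;  /* head of the sleep queue, NO_OWNER if none */
};

/* Optimistic spinner: may take a free lock unless a handoff is pending. */
static bool toy_spin_trylock(struct toy_mutex *m, int me)
{
    if (m->owner != NO_OWNER)
        return false;
    if (m->handoff && me != m->first_waiter)
        return false;           /* lock is reserved for the waiter */
    m->owner = me;
    return true;
}

/* The sleeping waiter asks for the next release to be handed to it. */
static void toy_request_handoff(struct toy_mutex *m)
{
    if (m->first_waiter != NO_OWNER)
        m->handoff = true;
}

/* With handoff pending, ownership moves directly to the waiter. */
static void toy_unlock(struct toy_mutex *m)
{
    if (m->handoff && m->first_waiter != NO_OWNER) {
        m->owner = m->first_waiter;
        m->first_waiter = NO_OWNER;
        m->handoff = false;
    } else {
        m->owner = NO_OWNER;
    }
}

/* Task 1 holds the lock, task 2 sleeps, tasks 3 and 4 spin. */
static int toy_demo(void)
{
    struct toy_mutex m = { .owner = 1, .handoff = false, .first_waiter = 2 };

    toy_unlock(&m);                         /* no handoff: lock is free */
    bool stolen = toy_spin_trylock(&m, 3);  /* spinner beats the sleeper */
    assert(stolen);

    toy_request_handoff(&m);                /* sleeper 2 has had enough */
    toy_unlock(&m);                         /* task 3 releases */
    assert(!toy_spin_trylock(&m, 4));       /* spinner cannot steal now */
    return m.owner;                         /* the former sleeper owns it */
}
```

In this model the spinner wins the first release, but after the handoff
request the second release goes to the sleeping waiter regardless of how
many spinners are polling.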

Cheers,
Longman