Message-ID: <660FFEAB399D5815+aa201e41-b902-3fa3-fb4a-333a166b9cd4@uniontech.com>
Date: Tue, 19 Mar 2024 09:46:25 +0800
From: Guo Hui <guohui@...ontech.com>
To: David Laight <David.Laight@...LAB.COM>,
"peterz@...radead.org" <peterz@...radead.org>,
"mingo@...hat.com" <mingo@...hat.com>, "will@...nel.org" <will@...nel.org>,
"longman@...hat.com" <longman@...hat.com>,
"boqun.feng@...il.com" <boqun.feng@...il.com>
Cc: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2] locking/osq_lock: Optimize osq_lock performance using per-NUMA

On 3/18/24 5:47 PM, David Laight wrote:
> From: Guo Hui
>> Sent: 18 March 2024 05:50
>>
>> Changes in version v1:
>> The queue is divided according to NUMA nodes,
>> but the tail of each NUMA node is still stored
>> in the structure optimistic_spin_queue.
> The description should be before any 'changes'.
> The changes between versions don't go into the commit message.
>
> Does this change affect a real workload, or just some benchmark?
>
> In reality you don't want a lot of threads waiting on a single
> lock (of any kind).
> So if a real workload is getting a long queue of waiters on
> an OSQ lock then the underlying code really needs fixing to
> 'not do that' (either by changing the way the lock is held
> or acquired).
>
> The whole osq lock is actually quite strange.
> (I worked out how it all worked a while ago.)
> It is an ordered queue of threads waiting for the thread
> spinning on a mutex/rwlock to either obtain the mutex or
> to give up spinning and sleep.
> I suspect that the main benefit over spinning on the mutex
> itself is the fact that it is ordered.
> It also removes the 'herd of wildebeest' doing a cmpxchg - but
> one will win and the others go back to a non-locked poll.
>
> Are the gains you are seeing from the osq-lock code itself,
> or because the thread that ultimately holds the mutex is running
> on the same NUMA node as the previous thread that held the mutex?

The gains come from the latter: the thread that ultimately holds the mutex
is running on the same NUMA node as the previous thread that held the mutex.
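
To make the per-NUMA queue split concrete, here is a minimal user-space
sketch, not the patch itself. It assumes one tail per node and borrows the
mainline convention of encoding a CPU as cpu + 1 so that 0 means "no
waiter". The names sketch_numa_osq, sketch_enqueue and SKETCH_MAX_NODES are
hypothetical and exist only for illustration.

```c
#include <stdatomic.h>

#define SKETCH_MAX_NODES 4  /* hypothetical node count, illustration only */
#define SKETCH_UNLOCKED  0  /* empty queue, analogous to OSQ_UNLOCKED_VAL */

/* Hypothetical layout: one queue tail per NUMA node. */
struct sketch_numa_osq {
	atomic_int tail[SKETCH_MAX_NODES];
};

/* Encode a CPU number so that 0 can mean "no waiter". */
static inline int sketch_encode_cpu(int cpu)
{
	return cpu + 1;
}

/*
 * Append @cpu to the queue of @node by swapping in its encoded value.
 * Returns the encoded previous tail, or SKETCH_UNLOCKED if the node's
 * queue was empty and the caller is now at its head.
 */
static inline int sketch_enqueue(struct sketch_numa_osq *q, int node, int cpu)
{
	return atomic_exchange(&q->tail[node], sketch_encode_cpu(cpu));
}
```

With this layout a waiter only queues behind waiters on its own node, so
the spinner that wins next tends to be on the same node as the previous
holder. How the real patch arbitrates between nodes is not shown here.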