[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <afddc02d-fa96-4793-bc60-6c81629d8d5b@redhat.com>
Date: Sun, 15 Sep 2024 17:06:10 -0400
From: Waiman Long <longman@...hat.com>
To: Uros Bizjak <ubizjak@...il.com>, yongli-oc <yongli-oc@...oxin.com>,
peterz@...radead.org, mingo@...hat.com, will@...nel.org, boqun.feng@...il.com
Cc: linux-kernel@...r.kernel.org, yongli@...oxin.com, louisqi@...oxin.com,
cobechen@...oxin.com, jiangbowang@...oxin.com
Subject: Re: [PATCH 3/4] locking/osq_lock: From x_osq_lock/unlock to
numa-aware lock/unlock.
On 9/15/24 14:44, Uros Bizjak wrote:
>
>
> On 14. 09. 24 10:53, yongli-oc wrote:
>> According to the contention level, switches from x_osq_lock to
>> numa-aware osq_lock.
>> The numa-aware lock is a two level osq_lock.
>> The Makefile for dynamic numa-aware osq lock.
>>
>> Signed-off-by: yongli-oc <yongli-oc@...oxin.com>
>> ---
>> kernel/locking/Makefile | 1 +
>> kernel/locking/numa.h | 98 ++++++++
>> kernel/locking/numa_osq.h | 32 +++
>> kernel/locking/x_osq_lock.c | 332 +++++++++++++++++++++++++++
>> kernel/locking/zx_numa_osq.c | 433 +++++++++++++++++++++++++++++++++++
>> 5 files changed, 896 insertions(+)
>> create mode 100644 kernel/locking/numa.h
>> create mode 100644 kernel/locking/numa_osq.h
>> create mode 100644 kernel/locking/x_osq_lock.c
>> create mode 100644 kernel/locking/zx_numa_osq.c
>
> ...
>
>> + if (lock->numa_enable == OSQLOCKSTOPPING && old ==
>> OSQ_UNLOCKED_VAL)
>> + old = OSQ_LOCKED_VAL;
>> +
>> + for (;;) {
>> + if (READ_ONCE(lock->tail16) == curr &&
>> + cmpxchg(&lock->tail16, curr, old) == curr) {
>
> I would like to ask if there is any benefit to read the location two
> times? cmpxchg() reads the location and skips the update when curr is
> different from the value at the location by itself. Using
> try_cmpxchg() can produce even more optimized code, since on x86 arch
> CMPXCHG also sets ZF flag when operand 2 is equal to the value at
> location (and update happens), and this flag can be used in a
> conditional jump.
The major reason is for doing a read first before cmpxchg() is to avoid
the overhead of an atomic operation in case the current task isn't the
tail. We usually optimize for the case with a lot of incoming lockers
where the chance of a match isn't particularly high.
Cheers,
Longman
Powered by blists - more mailing lists