Message-ID: <7147ea4e429326e76723fd788e44b6f4@linux.ibm.com>
Date: Wed, 30 Jul 2025 16:16:36 +0530
From: samir <samir@...ux.ibm.com>
To: "Nysal Jan K.A." <nysal@...ux.ibm.com>
Cc: Madhavan Srinivasan <maddy@...ux.ibm.com>,
 Michael Ellerman <mpe@...erman.id.au>,
 Nicholas Piggin <npiggin@...il.com>,
 Christophe Leroy <christophe.leroy@...roup.eu>,
 linuxppc-dev@...ts.ozlabs.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] powerpc/qspinlock: Add spinlock contention tracepoint
On 2025-07-25 13:44, Nysal Jan K.A. wrote:
> Add a lock contention tracepoint in the queued spinlock slowpath.
> Also add the __lockfunc annotation so that in_lock_functions()
> works as expected.
>
> Signed-off-by: Nysal Jan K.A. <nysal@...ux.ibm.com>
> ---
> arch/powerpc/lib/qspinlock.c | 13 ++++++++-----
> 1 file changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c
> index bcc7e4dff8c3..622e7f45c2ce 100644
> --- a/arch/powerpc/lib/qspinlock.c
> +++ b/arch/powerpc/lib/qspinlock.c
> @@ -9,6 +9,7 @@
>  #include <linux/sched/clock.h>
>  #include <asm/qspinlock.h>
>  #include <asm/paravirt.h>
> +#include <trace/events/lock.h>
>  
>  #define MAX_NODES 4
>  
> @@ -708,8 +709,9 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b
>  	qnodesp->count--;
>  }
>  
> -void queued_spin_lock_slowpath(struct qspinlock *lock)
> +void __lockfunc queued_spin_lock_slowpath(struct qspinlock *lock)
>  {
> +	trace_contention_begin(lock, LCB_F_SPIN);
>  	/*
>  	 * This looks funny, but it induces the compiler to inline both
>  	 * sides of the branch rather than share code as when the condition
> @@ -718,16 +720,17 @@ void queued_spin_lock_slowpath(struct qspinlock *lock)
>  	if (IS_ENABLED(CONFIG_PARAVIRT_SPINLOCKS) && is_shared_processor()) {
>  		if (try_to_steal_lock(lock, true)) {
>  			spec_barrier();
> -			return;
> +		} else {
> +			queued_spin_lock_mcs_queue(lock, true);
>  		}
> -		queued_spin_lock_mcs_queue(lock, true);
>  	} else {
>  		if (try_to_steal_lock(lock, false)) {
>  			spec_barrier();
> -			return;
> +		} else {
> +			queued_spin_lock_mcs_queue(lock, false);
>  		}
> -		queued_spin_lock_mcs_queue(lock, false);
>  	}
> +	trace_contention_end(lock, 0);
>  }
>  EXPORT_SYMBOL(queued_spin_lock_slowpath);
Hello,
I have verified the patch with the latest upstream Linux kernel, and
here are my findings:
———Kernel Version———
6.16.0-rc7-160000.11-default+
———perf --version———
perf version 6.16.rc7.g5f33ebd2018c
To test this patch, I used the Lockstorm benchmark, which rigorously
exercises spinlocks from kernel space.
Benchmark repository: https://github.com/lop-devops/lockstorm
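For context, a contention generator of this kind essentially boils down
to a handful of kthreads hammering one shared spinlock so that most
acquisitions fall into the contended slowpath. A minimal sketch of the
idea (illustrative only, not the actual lockstorm code; the names here
are made up):

  #include <linux/module.h>
  #include <linux/kthread.h>
  #include <linux/spinlock.h>
  #include <linux/delay.h>
  #include <linux/err.h>

  static DEFINE_SPINLOCK(storm_lock);
  static struct task_struct *workers[8];

  /* Each worker repeatedly takes the shared lock and holds it
   * briefly, so concurrent workers pile up in the slowpath. */
  static int storm_fn(void *unused)
  {
          while (!kthread_should_stop()) {
                  spin_lock(&storm_lock);
                  ndelay(100);
                  spin_unlock(&storm_lock);
                  cond_resched();
          }
          return 0;
  }

  static int __init storm_init(void)
  {
          int i;

          for (i = 0; i < ARRAY_SIZE(workers); i++)
                  workers[i] = kthread_run(storm_fn, NULL, "storm/%d", i);
          return 0;
  }

  static void __exit storm_exit(void)
  {
          int i;

          for (i = 0; i < ARRAY_SIZE(workers); i++)
                  if (!IS_ERR_OR_NULL(workers[i]))
                          kthread_stop(workers[i]);
  }

  module_init(storm_init);
  module_exit(storm_exit);
  MODULE_LICENSE("GPL");

Running such a module under perf lock record is what produces the
contention events analyzed below.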
To capture all events related to the Lockstorm benchmark, I used the
following command:
cmd: perf lock record -a insmod lockstorm.ko
After generating the perf.data, I analyzed the results using:
cmd: perf lock contention -a -i perf.data
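For reference, perf lock contention works off the lock:contention_begin
and lock:contention_end tracepoints that this patch now emits around the
powerpc queued-spinlock slowpath; the early returns are restructured
into else branches precisely so that the end event fires on every exit
path. The presence of the events can be checked with:
cmd: perf list 'lock:contention*'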
————Logs————
 contended   total wait    max wait    avg wait       type   caller

   6187241      12.50 m     2.30 ms   121.22 us   spinlock   kthread+0x160
        78      8.23 ms   209.87 us   105.47 us   rwlock:W   do_exit+0x378
        71      7.97 ms   208.07 us   112.24 us   spinlock   do_exit+0x378
        68      4.18 ms   210.04 us    61.43 us   rwlock:W   release_task+0xe0
        63      3.96 ms   204.02 us    62.90 us   spinlock   release_task+0xe0
       115    477.15 us    19.69 us     4.15 us   spinlock   rcu_report_qs_rdp+0x40
       250    437.34 us     5.34 us     1.75 us   spinlock   raw_spin_rq_lock_nested+0x24
        32    156.32 us    13.56 us     4.88 us   spinlock   cgroup_exit+0x34
        19     88.12 us    12.20 us     4.64 us   spinlock   exit_fs+0x44
        12     23.25 us     3.09 us     1.94 us   spinlock   lock_hrtimer_base+0x4c
         1     18.83 us    18.83 us    18.83 us    rwsem:R   btrfs_tree_read_lock_nested+0x38
         1     17.84 us    17.84 us    17.84 us    rwsem:W   btrfs_tree_lock_nested+0x38
        10     15.75 us     5.72 us     1.58 us   spinlock   raw_spin_rq_lock_nested+0x24
         5     15.08 us     5.59 us     3.02 us   spinlock   mix_interrupt_randomness+0xb4
         2     12.78 us     9.50 us     4.26 us   spinlock   raw_spin_rq_lock_nested+0x24
         1     11.13 us    11.13 us    11.13 us   spinlock   __queue_work+0x338
         3     10.79 us     7.04 us     3.60 us   spinlock   raw_spin_rq_lock_nested+0x24
         3      8.17 us     4.58 us     2.72 us   spinlock   raw_spin_rq_lock_nested+0x24
         3      7.99 us     3.13 us     2.66 us   spinlock   lock_hrtimer_base+0x4c
         2      6.66 us     4.57 us     3.33 us   spinlock   free_pcppages_bulk+0x50
         3      5.34 us     2.19 us     1.78 us   spinlock   ibmvscsi_handle_crq+0x1e4
         2      3.71 us     2.32 us     1.85 us   spinlock   __hrtimer_run_queues+0x1b8
         2      2.98 us     2.19 us     1.49 us   spinlock   raw_spin_rq_lock_nested+0x24
         1      2.85 us     2.85 us     2.85 us   spinlock   raw_spin_rq_lock_nested+0x24
         2      2.15 us     1.09 us     1.07 us   spinlock   raw_spin_rq_lock_nested+0x24
         2      2.06 us     1.06 us     1.03 us   spinlock   raw_spin_rq_lock_nested+0x24
         1      1.69 us     1.69 us     1.69 us   spinlock   raw_spin_rq_lock_nested+0x24
         1      1.53 us     1.53 us     1.53 us   spinlock   __queue_work+0xd8
         1      1.27 us     1.27 us     1.27 us   spinlock   pull_rt_task+0xa0
         1      1.16 us     1.16 us     1.16 us   spinlock   raw_spin_rq_lock_nested+0x24
         1       740 ns      740 ns      740 ns   spinlock   add_device_randomness+0x5c
         1       566 ns      566 ns      566 ns   spinlock   raw_spin_rq_lock_nested+0x24
From the results, we were able to observe lock contention being
reported specifically for spinlocks, with the heaviest entry
(kthread+0x160) coming from the benchmark's worker threads.
The patch works as expected.
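One side note on the __lockfunc part of the change: as I understand it,
the annotation places the slowpath in the .spinlock.text section, which
is exactly the address range in_lock_functions() tests. Roughly
(paraphrased from include/linux/spinlock.h and kernel/locking/spinlock.c,
not copied verbatim):

  /* __lockfunc places the function in the .spinlock.text section */
  #define __lockfunc __section(".spinlock.text")

  notrace int in_lock_functions(unsigned long addr)
  {
          /* bounds of .spinlock.text, provided by the linker script */
          extern char __lock_text_start[], __lock_text_end[];

          return addr >= (unsigned long)__lock_text_start &&
                 addr <  (unsigned long)__lock_text_end;
  }

Without the annotation, callers of in_lock_functions() (e.g. profiling
code that skips over lock internals) would not recognize addresses
inside this slowpath.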
Thank you for the patch!
Tested-by: Samir Mulani <samir@...ux.ibm.com>