Message-ID: <7147ea4e429326e76723fd788e44b6f4@linux.ibm.com>
Date: Wed, 30 Jul 2025 16:16:36 +0530
From: samir <samir@...ux.ibm.com>
To: "Nysal Jan K.A." <nysal@...ux.ibm.com>
Cc: Madhavan Srinivasan <maddy@...ux.ibm.com>,
        Michael Ellerman <mpe@...erman.id.au>,
        Nicholas Piggin <npiggin@...il.com>,
        Christophe Leroy <christophe.leroy@...roup.eu>,
        linuxppc-dev@...ts.ozlabs.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] powerpc/qspinlock: Add spinlock contention tracepoint

On 2025-07-25 13:44, Nysal Jan K.A. wrote:
> Add a lock contention tracepoint in the queued spinlock slowpath.
> Also add the __lockfunc annotation so that in_lock_functions()
> works as expected.
> 
> Signed-off-by: Nysal Jan K.A. <nysal@...ux.ibm.com>
> ---
>  arch/powerpc/lib/qspinlock.c | 13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c
> index bcc7e4dff8c3..622e7f45c2ce 100644
> --- a/arch/powerpc/lib/qspinlock.c
> +++ b/arch/powerpc/lib/qspinlock.c
> @@ -9,6 +9,7 @@
>  #include <linux/sched/clock.h>
>  #include <asm/qspinlock.h>
>  #include <asm/paravirt.h>
> +#include <trace/events/lock.h>
> 
>  #define MAX_NODES	4
> 
> @@ -708,8 +709,9 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b
>  	qnodesp->count--;
>  }
> 
> -void queued_spin_lock_slowpath(struct qspinlock *lock)
> +void __lockfunc queued_spin_lock_slowpath(struct qspinlock *lock)
>  {
> +	trace_contention_begin(lock, LCB_F_SPIN);
>  	/*
>  	 * This looks funny, but it induces the compiler to inline both
>  	 * sides of the branch rather than share code as when the condition
> @@ -718,16 +720,17 @@ void queued_spin_lock_slowpath(struct qspinlock *lock)
>  	if (IS_ENABLED(CONFIG_PARAVIRT_SPINLOCKS) && is_shared_processor()) {
>  		if (try_to_steal_lock(lock, true)) {
>  			spec_barrier();
> -			return;
> +		} else {
> +			queued_spin_lock_mcs_queue(lock, true);
>  		}
> -		queued_spin_lock_mcs_queue(lock, true);
>  	} else {
>  		if (try_to_steal_lock(lock, false)) {
>  			spec_barrier();
> -			return;
> +		} else {
> +			queued_spin_lock_mcs_queue(lock, false);
>  		}
> -		queued_spin_lock_mcs_queue(lock, false);
>  	}
> +	trace_contention_end(lock, 0);
>  }
>  EXPORT_SYMBOL(queued_spin_lock_slowpath);

Hello,

I have verified the patch with the latest upstream Linux kernel, and 
here are my findings:

———Kernel Version———
6.16.0-rc7-160000.11-default+
———perf --version———
perf version 6.16.rc7.g5f33ebd2018c

To test this patch, I used the Lockstorm benchmark, which rigorously 
exercises spinlocks from kernel space.

Benchmark repository: https://github.com/lop-devops/lockstorm
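
For anyone not familiar with it, the core idea of such a stress test is simply a set of kernel threads hammering one shared spinlock so that the slowpath (and therefore the new tracepoint) fires constantly. Below is a minimal, illustrative sketch of such a stress module; it is not the Lockstorm source, and all names in it (contend_demo, worker_fn, NR_WORKERS, ...) are made up for this example:

/*
 * contend_demo.c - minimal spinlock contention stress module.
 * Illustrative sketch only, NOT the Lockstorm source.
 */
#include <linux/module.h>
#include <linux/kthread.h>
#include <linux/spinlock.h>
#include <linux/sched.h>
#include <linux/delay.h>
#include <linux/err.h>

#define NR_WORKERS	8
#define NR_ITERS	1000000UL

static DEFINE_SPINLOCK(demo_lock);
static struct task_struct *workers[NR_WORKERS];
static unsigned long shared_counter;

static int worker_fn(void *data)
{
	unsigned long i;

	/* Hammer one shared lock so the slowpath is hit constantly. */
	for (i = 0; i < NR_ITERS && !kthread_should_stop(); i++) {
		spin_lock(&demo_lock);
		shared_counter++;		/* short critical section */
		spin_unlock(&demo_lock);
		if (!(i % 1024))
			cond_resched();		/* be nice to the scheduler */
	}

	/* Park until module unload asks us to stop. */
	while (!kthread_should_stop())
		msleep(10);
	return 0;
}

static int __init demo_init(void)
{
	int i;

	for (i = 0; i < NR_WORKERS; i++) {
		workers[i] = kthread_run(worker_fn, NULL, "contend/%d", i);
		if (IS_ERR(workers[i])) {
			int err = PTR_ERR(workers[i]);

			workers[i] = NULL;
			while (--i >= 0)
				kthread_stop(workers[i]);
			return err;
		}
	}
	return 0;
}

static void __exit demo_exit(void)
{
	int i;

	for (i = 0; i < NR_WORKERS; i++)
		if (workers[i])
			kthread_stop(workers[i]);
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");

Building something like that against the running kernel and loading it while recording should produce heavy contention on a single spinlock, which is exactly the path this patch instruments.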
To capture all lock events while the Lockstorm benchmark runs, I used the following command:
cmd: perf lock record -a insmod lockstorm.ko
Once perf.data was generated, I analyzed the results with:
cmd: perf lock contention -a -i perf.data
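Since the patch emits the generic contention tracepoints, they can also be recorded directly; assuming they appear under the usual event names on this kernel (lock:contention_begin / lock:contention_end), something like the following should work as well:
cmd: perf record -e lock:contention_begin -e lock:contention_end -a insmod lockstorm.ko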

————Logs————
contended   total wait     max wait     avg wait         type   caller

    6187241     12.50 m       2.30 ms    121.22 us     spinlock   kthread+0x160
         78      8.23 ms    209.87 us    105.47 us     rwlock:W   do_exit+0x378
         71      7.97 ms    208.07 us    112.24 us     spinlock   do_exit+0x378
         68      4.18 ms    210.04 us     61.43 us     rwlock:W   release_task+0xe0
         63      3.96 ms    204.02 us     62.90 us     spinlock   release_task+0xe0
        115    477.15 us     19.69 us      4.15 us     spinlock   rcu_report_qs_rdp+0x40
        250    437.34 us      5.34 us      1.75 us     spinlock   raw_spin_rq_lock_nested+0x24
         32    156.32 us     13.56 us      4.88 us     spinlock   cgroup_exit+0x34
         19     88.12 us     12.20 us      4.64 us     spinlock   exit_fs+0x44
         12     23.25 us      3.09 us      1.94 us     spinlock   lock_hrtimer_base+0x4c
          1     18.83 us     18.83 us     18.83 us      rwsem:R   btrfs_tree_read_lock_nested+0x38
          1     17.84 us     17.84 us     17.84 us      rwsem:W   btrfs_tree_lock_nested+0x38
         10     15.75 us      5.72 us      1.58 us     spinlock   raw_spin_rq_lock_nested+0x24
          5     15.08 us      5.59 us      3.02 us     spinlock   mix_interrupt_randomness+0xb4
          2     12.78 us      9.50 us      4.26 us     spinlock   raw_spin_rq_lock_nested+0x24
          1     11.13 us     11.13 us     11.13 us     spinlock   __queue_work+0x338
          3     10.79 us      7.04 us      3.60 us     spinlock   raw_spin_rq_lock_nested+0x24
          3      8.17 us      4.58 us      2.72 us     spinlock   raw_spin_rq_lock_nested+0x24
          3      7.99 us      3.13 us      2.66 us     spinlock   lock_hrtimer_base+0x4c
          2      6.66 us      4.57 us      3.33 us     spinlock   free_pcppages_bulk+0x50
          3      5.34 us      2.19 us      1.78 us     spinlock   ibmvscsi_handle_crq+0x1e4
          2      3.71 us      2.32 us      1.85 us     spinlock   __hrtimer_run_queues+0x1b8
          2      2.98 us      2.19 us      1.49 us     spinlock   raw_spin_rq_lock_nested+0x24
          1      2.85 us      2.85 us      2.85 us     spinlock   raw_spin_rq_lock_nested+0x24
          2      2.15 us      1.09 us      1.07 us     spinlock   raw_spin_rq_lock_nested+0x24
          2      2.06 us      1.06 us      1.03 us     spinlock   raw_spin_rq_lock_nested+0x24
          1      1.69 us      1.69 us      1.69 us     spinlock   raw_spin_rq_lock_nested+0x24
          1      1.53 us      1.53 us      1.53 us     spinlock   __queue_work+0xd8
          1      1.27 us      1.27 us      1.27 us     spinlock   pull_rt_task+0xa0
          1      1.16 us      1.16 us      1.16 us     spinlock   raw_spin_rq_lock_nested+0x24
          1       740 ns       740 ns       740 ns     spinlock   add_device_randomness+0x5c
          1       566 ns       566 ns       566 ns     spinlock   raw_spin_rq_lock_nested+0x24

From the results, we were able to observe lock contention specifically on spinlocks.

The patch works as expected.
Thank you for the patch!

Tested-by: Samir Mulani <samir@...ux.ibm.com>
