Message-ID: <491D6B4EAD0A714894D8AD22F4BDE043B15DCF@SCYBEXDAG03.amd.com>
Date:	Tue, 17 Apr 2012 09:36:09 +0000
From:	"Chen, Dennis (SRDC SW)" <Dennis1.Chen@....com>
To:	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
CC:	Ingo Molnar <mingo@...nel.org>,
	"paulmck@...ux.vnet.ibm.com" <paulmck@...ux.vnet.ibm.com>,
	"peterz@...radead.org" <peterz@...radead.org>,
	Paul Mackerras <paulus@...ba.org>,
	Arnaldo Carvalho de Melo <acme@...stprotocols.net>
Subject: A quick view of the performance benchmark for semaphore-like and
 mutex

As a quick and rough test, I applied the change below to the mutex code, which disables optimistic spinning and makes the mutex behave almost the same as a semaphore:

--- /home/dennis/Linux/linux-3.3.2-sem/kernel/mutex.c   2012-04-17 14:59:49.823177615 +0800
+++ ./mutex.c   2012-04-17 17:00:12.963059284 +0800
@@ -140,6 +140,7 @@ __mutex_lock_common(struct mutex *lock,
        preempt_disable();
        mutex_acquire_nest(&lock->dep_map, subclass, 0, nest_lock, ip);
 
+#if 0
 #ifdef CONFIG_MUTEX_SPIN_ON_OWNER
        /*
         * Optimistic spinning.
@@ -195,6 +196,7 @@ __mutex_lock_common(struct mutex *lock,
                arch_mutex_cpu_relax();
        }
 #endif
+#endif
        spin_lock_mutex(&lock->wait_lock, flags);
 
        debug_mutex_lock_common(lock, &waiter);


#perf record -a perf bench locking mutex -p 8 -t 3000

The benchmark result BEFORE (mutex)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
round 1:
Total duration     39868 s   536095 us

real: 15.89   s
user: 0.00   
sys:  0.31  

Events: 64K cycles
 20.18%           perf  [kernel.kallsyms]                  [k] __mutex_lock_slowpath                                                      
  8.41%           perf  [kernel.kallsyms]                  [k] _raw_spin_lock                                                              
  8.00%           perf  [kernel.kallsyms]                  [k] mutex_unlock                                                               
  5.29%           perf  [kernel.kallsyms]                  [k] mutex_lock                                                                  
  2.88%           perf  [kernel.kallsyms]                  [k] link_path_walk                                                              
  2.56%           perf  [kernel.kallsyms]                  [k] __mutex_unlock_slowpath                                                     
  2.31%           perf  [kernel.kallsyms]                  [k] mutex_spin_on_owner                                                         
  2.29%           perf  [kernel.kallsyms]                  [k] _raw_spin_lock_irqsave                                                      
  1.68%           perf  [kernel.kallsyms]                  [k] __d_lookup                                                                  
  1.33%           perf  [kernel.kallsyms]                  [k] dput                                                                        
  1.33%           perf  [kernel.kallsyms]                  [k] clear_page_c                                                               
  1.06%           perf  [kernel.kallsyms]                  [k] __strncpy_from_user                                                         
  1.04%           perf  [kernel.kallsyms]                  [k] do_lookup                        
  ...
-------------------------------------------------------------------------------------
round 2:
Total duration     39748 s   176410 us

real: 15.92   s
user: 0.00   
sys:  0.32

Events: 63K cycles
 19.68%           perf  [kernel.kallsyms]                  [k] __mutex_lock_slowpath                                                      
  8.53%           perf  [kernel.kallsyms]                  [k] _raw_spin_lock                                                              
  7.74%           perf  [kernel.kallsyms]                  [k] mutex_unlock                                                               
  5.09%           perf  [kernel.kallsyms]                  [k] mutex_lock                                                                  
  3.06%           perf  [kernel.kallsyms]                  [k] link_path_walk                                                              
  2.54%           perf  [kernel.kallsyms]                  [k] __mutex_unlock_slowpath                                                     
  2.31%           perf  [kernel.kallsyms]                  [k] mutex_spin_on_owner                                                         
  2.30%           perf  [kernel.kallsyms]                  [k] _raw_spin_lock_irqsave                                                      
  1.76%           perf  [kernel.kallsyms]                  [k] __d_lookup                                                                  
  1.46%           perf  [kernel.kallsyms]                  [k] clear_page_c                                                               
  1.31%           perf  [kernel.kallsyms]                  [k] dput                                                                        
  1.10%           perf  [kernel.kallsyms]                  [k] __strncpy_from_user                                                         
  1.08%           perf  [kernel.kallsyms]                  [k] do_lookup  
  ...
-------------------------------------------------------------------------------------
round 3:
Total duration     40047 s   394364 us

real: 15.59   s
user: 0.00   
sys:  0.30   

Events: 58K cycles
 19.18%           perf  [kernel.kallsyms]                  [k] __mutex_lock_slowpath                                                      
  8.68%           perf  [kernel.kallsyms]                  [k] _raw_spin_lock                                                              
  7.80%           perf  [kernel.kallsyms]                  [k] mutex_unlock                                                               
  5.24%           perf  [kernel.kallsyms]                  [k] mutex_lock                                                                  
  3.22%           perf  [kernel.kallsyms]                  [k] link_path_walk                                                              
  2.57%           perf  [kernel.kallsyms]                  [k] __mutex_unlock_slowpath                                                     
  2.38%           perf  [kernel.kallsyms]                  [k] _raw_spin_lock_irqsave                                                      
  2.13%           perf  [kernel.kallsyms]                  [k] mutex_spin_on_owner                                                         
  1.79%           perf  [kernel.kallsyms]                  [k] __d_lookup                                                                  
  1.54%           perf  [kernel.kallsyms]                  [k] clear_page_c                                                               
  1.34%           perf  [kernel.kallsyms]                  [k] dput                                                                        
  1.12%           perf  [kernel.kallsyms]                  [k] do_lookup                                                                  
  1.04%           perf  [kernel.kallsyms]                  [k] __strncpy_from_user                                                         
  1.02%           perf  [kernel.kallsyms]                  [k] system_call                                                                 
  1.02%           perf  [kernel.kallsyms]                  [k] get_page_from_freelist
  ...

The benchmark result AFTER (optimistic spinning removed from the mutex)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
round 1:
Total duration     66319 s   868892 us

 real: 23.16   s
 user: 0.00   
 sys:  0.29  

Events: 81K cycles
  6.30%           perf  [kernel.kallsyms]                  [k] _raw_spin_lock                                                              
  3.13%           perf  [kernel.kallsyms]                  [k] mutex_unlock                                                                
  3.09%           perf  [kernel.kallsyms]                  [k] mutex_lock                                                                  
  3.07%           perf  [kernel.kallsyms]                  [k] link_path_walk                                                              
  2.66%        swapper  [kernel.kallsyms]                  [k] intel_idle                                                                  
  2.21%           perf  [kernel.kallsyms]                  [k] __d_lookup                                                                  
  1.80%           perf  [kernel.kallsyms]                  [k] clear_page_c                                                                
  1.58%           perf  [kernel.kallsyms]                  [k] system_call                                                                 
  1.56%           perf  [kernel.kallsyms]                  [k] __strncpy_from_user                                                         
  1.53%           perf  [kernel.kallsyms]                  [k] do_lookup                                                                   
  1.47%           perf  [kernel.kallsyms]                  [k] dput                                                                        
  1.43%           perf  [kernel.kallsyms]                  [k] get_page_from_freelist                                                      
  1.28%           perf  libc-2.13.so                       [.] 0xa99f6                                                                     
  1.19%        swapper  [kernel.kallsyms]                  [k] _raw_spin_lock_irqsave                                                      
  1.15%           perf  [kernel.kallsyms]                  [k] vfsmount_lock_local_lock                                                    
  1.12%           perf  [kernel.kallsyms]                  [k] kfree        
  ...   
-------------------------------------------------------------------------------------
round 2:
Total duration     67448 s   392232 us

 real: 23.21   s
 user: 0.00   
 sys:  0.29

Events: 82K cycles
  6.23%             perf  [kernel.kallsyms]                  [k] _raw_spin_lock                                                            
  3.23%             perf  [kernel.kallsyms]                  [k] mutex_unlock                                                              
  3.10%             perf  [kernel.kallsyms]                  [k] mutex_lock                                                                
  3.10%             perf  [kernel.kallsyms]                  [k] link_path_walk                                                            
  2.59%          swapper  [kernel.kallsyms]                  [k] intel_idle                                                                
  2.18%             perf  [kernel.kallsyms]                  [k] __d_lookup                                                                
  1.88%             perf  [kernel.kallsyms]                  [k] clear_page_c                                                              
  1.60%             perf  [kernel.kallsyms]                  [k] __strncpy_from_user                                                       
  1.50%             perf  [kernel.kallsyms]                  [k] system_call                                                               
  1.48%             perf  [kernel.kallsyms]                  [k] dput                                                                      
  1.44%             perf  [kernel.kallsyms]                  [k] do_lookup                                                                 
  1.33%             perf  [kernel.kallsyms]                  [k] get_page_from_freelist                                                    
  1.29%             perf  libc-2.13.so                       [.] 0x82715                                                                   
  1.19%          swapper  [kernel.kallsyms]                  [k] _raw_spin_lock_irqsave                                                    
  1.11%             perf  [kernel.kallsyms]                  [k] kfree                                                                     
  1.10%             perf  [kernel.kallsyms]                  [k] vfsmount_lock_local_lock                                                  
  1.01%             perf  [kernel.kallsyms]                  [k] __alloc_pages_nodemask
  ...
-------------------------------------------------------------------------------------
round 3:
Total duration     66468 s   532417 us

 real: 23.35   s
 user: 0.00   
 sys:  0.28

Events: 87K cycles
  6.30%             perf  [kernel.kallsyms]                  [k] _raw_spin_lock                                                            
  3.09%             perf  [kernel.kallsyms]                  [k] mutex_unlock                                                              
  2.98%             perf  [kernel.kallsyms]                  [k] link_path_walk                                                            
  2.98%             perf  [kernel.kallsyms]                  [k] mutex_lock                                                                
  2.70%          swapper  [kernel.kallsyms]                  [k] intel_idle                                                                
  2.25%             perf  [kernel.kallsyms]                  [k] __d_lookup                                                                
  1.92%             perf  [kernel.kallsyms]                  [k] clear_page_c                                                              
  1.56%             perf  [kernel.kallsyms]                  [k] __strncpy_from_user                                                       
  1.47%             perf  [kernel.kallsyms]                  [k] dput                                                                      
  1.47%             perf  [kernel.kallsyms]                  [k] system_call                                                               
  1.42%             perf  [kernel.kallsyms]                  [k] do_lookup                                                                 
  1.35%             perf  [kernel.kallsyms]                  [k] get_page_from_freelist                                                    
  1.32%             perf  libc-2.13.so                       [.] 0x12902e                                                                  
  1.32%          swapper  [kernel.kallsyms]                  [k] _raw_spin_lock_irqsave                                                    
  1.10%             perf  [kernel.kallsyms]                  [k] vfsmount_lock_local_lock                                                  
  1.02%             perf  [kernel.kallsyms]                  [k] kfree                                                                     
  1.00%             perf  [kernel.kallsyms]                  [k] __alloc_pages_nodemask

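Interesting! The semaphore-like version is about 7.4 s slower than the mutex in real time (roughly 23.2 s versus 15.8 s per round). Also, the number of cycles events reported by perf differs (81K to 87K versus 58K to 64K).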
