Message-ID: <491D6B4EAD0A714894D8AD22F4BDE043B15911@SCYBEXDAG03.amd.com>
Date:	Mon, 16 Apr 2012 14:10:30 +0000
From:	"Chen, Dennis (SRDC SW)" <Dennis1.Chen@....com>
To:	Ingo Molnar <mingo@...nel.org>
CC:	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"paulmck@...ux.vnet.ibm.com" <paulmck@...ux.vnet.ibm.com>,
	"peterz@...radead.org" <peterz@...radead.org>,
	Paul Mackerras <paulus@...ba.org>,
	Arnaldo Carvalho de Melo <acme@...stprotocols.net>
Subject: RE: [PATCH 0/2] tools perf: Add a new benchmark tool for
 semaphore/mutex

On Mon, Apr 16, 2012 at 5:24 PM, Ingo Molnar <mingo@...nel.org> wrote:
>
> * Chen, Dennis (SRDC SW) <Dennis1.Chen@....com> wrote:
>
>> <PATCH PREFACE>
>> -------------------
>> This patch series adds a new performance benchmark tool for semaphores and mutexes.
>> The new tool forks the number of tasks specified on the command line and binds them
>> evenly across the CPUs in the system. The command to launch the tool looks like:
>> '# perf bench locking mutex -p 8 -t 400 -c'
>>
>> The above command creates 400 tasks on an 8-CPU system, so each CPU gets 50 tasks.
>> Once a task has been created, it reads all the files and directories in '/sys/module'.
>> sysfs is RAM-based, and its read operations for both directories and files are very sensitive
>> to the mutex lock; '/sys/module' also has almost no dependencies on external devices.
>>
>> We can use this tool with the 'perf record' command to find the hot spots in the code, or with
>> 'perf top -g' to get live info. For example, below is a test case run on an Intel i7-2600 box
>> (the -c option reports CPU cycles; I don't use it in this test case):
>>
>> # perf record -a perf bench locking mutex -p 8 -t 4000
>> # Running locking/mutex benchmark...
>>  ...
>>  [13894 ]/6  duration        23 s   609392 us
>>  [13996 ]/4  duration        23 s   599418 us
>>  [14056 ]/0  duration        23 s   595710 us
>>  [13715 ]/3  duration        23 s   621719 us
>>  [13390 ]/6  duration        23 s   644020 us
>>  [13696 ]/0  duration        23 s   623101 us
>>  [14334 ]/6  duration        23 s   580262 us
>>  [14343 ]/7  duration        23 s   578702 us
>>  [14283 ]/3  duration        23 s   583007 us
>>  -----------------------------------
>>  Total duration     79353 s   943945 us
>>
>>  real: 23.84   s
>>  user: 0.00
>>  sys:  0.45
>>
>> # perf report
>> ===================================================================================
>> ...
>> # perf version : 3.3.2
>> # arch : x86_64
>> # nrcpus online : 8
>> # nrcpus avail : 8
>> # cpudesc : Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
>> # total memory : 3966460 kB
>> # cmdline : /usr/bin/perf record -a perf bench locking mutex -p 8 -t 4000
>>
>> # Events: 131K cycles
>> #
>> # Overhead          Command                      Shared Object                                 Symbol
>> # ........  ...............  .................................  .....................................
>> #
>>     22.12%           perf  [kernel.kallsyms]                  [k] __mutex_lock_slowpath
>>      8.27%           perf  [kernel.kallsyms]                  [k] _raw_spin_lock
>>      6.16%           perf  [kernel.kallsyms]                  [k] mutex_unlock
>>      5.22%           perf  [kernel.kallsyms]                  [k] mutex_spin_on_owner
>>      4.94%           perf  [kernel.kallsyms]                  [k] sysfs_refresh_inode
>>      4.82%           perf  [kernel.kallsyms]                  [k] mutex_lock
>>      2.67%           perf  [kernel.kallsyms]                  [k] __mutex_unlock_slowpath
>>      2.61%           perf  [kernel.kallsyms]                  [k] link_path_walk
>>      2.42%           perf  [kernel.kallsyms]                  [k] _raw_spin_lock_irqsave
>>      1.61%           perf  [kernel.kallsyms]                  [k] __d_lookup
>>      1.18%           perf  [kernel.kallsyms]                  [k] clear_page_c
>>      1.16%           perf  [kernel.kallsyms]                  [k] dput
>>      0.97%           perf  [kernel.kallsyms]                  [k] do_lookup
>>      0.93%        swapper  [kernel.kallsyms]                  [k] intel_idle
>>      0.87%           perf  [kernel.kallsyms]                  [k] get_page_from_freelist
>>      0.85%           perf  [kernel.kallsyms]                  [k] __strncpy_from_user
>>      0.81%           perf  [kernel.kallsyms]                  [k] system_call
>>      0.78%           perf  libc-2.13.so                       [.] 0x84ef0
>>      0.71%           perf  [kernel.kallsyms]                  [k] vfsmount_lock_local_lock
>>      0.68%           perf  [kernel.kallsyms]                  [k] sysfs_dentry_revalidate
>>      0.62%           perf  [kernel.kallsyms]                  [k] try_to_wake_up
>>      0.62%           perf  [kernel.kallsyms]                  [k] kfree
>>      0.60%           perf  [kernel.kallsyms]                  [k] kmem_cache_alloc
>> ............................................................................................
>>
>
> Nice! Would be nice to lift some of this information over into
> the changelogs, to address my complaints in the previous mail.

Thanks for the suggestion! I will resubmit the series as a single patch and include the above info
in the changelog to address that issue.
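
For anyone reading along without the patches at hand, the task placement described in
the preface can be sketched roughly as below. This is only an illustrative sketch, not
the code from the series: the function names (cpu_for_task, bind_self_to_cpu, read_tree,
run_tasks) are made up for this example, and the round-robin placement is an assumption
about how "equally" might be implemented. It only uses standard calls (fork,
sched_setaffinity, opendir/readdir, read).

```c
#define _GNU_SOURCE		/* for cpu_set_t, CPU_SET and sched_setaffinity */
#include <sched.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <dirent.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Assumed placement policy: task i runs on CPU (i % ncpus). */
int cpu_for_task(int task, int ncpus)
{
	return task % ncpus;
}

/* Pin the calling process to a single CPU; returns 0 on success. */
int bind_self_to_cpu(int cpu)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	return sched_setaffinity(0, sizeof(set), &set);
}

/*
 * Recursively read every regular file below 'path' and return the number
 * of bytes read. Symlinks are not followed (lstat), and unreadable
 * entries are silently skipped.
 */
long read_tree(const char *path)
{
	DIR *dir = opendir(path);
	struct dirent *de;
	long total = 0;

	if (!dir)
		return 0;

	while ((de = readdir(dir)) != NULL) {
		char child[PATH_MAX];
		char buf[4096];
		struct stat st;
		ssize_t n;
		int fd;

		if (!strcmp(de->d_name, ".") || !strcmp(de->d_name, ".."))
			continue;
		snprintf(child, sizeof(child), "%s/%s", path, de->d_name);
		if (lstat(child, &st) < 0)
			continue;
		if (S_ISDIR(st.st_mode)) {
			total += read_tree(child);
		} else if (S_ISREG(st.st_mode)) {
			fd = open(child, O_RDONLY);
			if (fd < 0)
				continue;
			while ((n = read(fd, buf, sizeof(buf))) > 0)
				total += n;
			close(fd);
		}
	}
	closedir(dir);
	return total;
}

/*
 * Fork 'ntasks' children, pin each one to its CPU and let it walk 'root'.
 * Returns the number of children that exited with status 0.
 */
int run_tasks(int ntasks, int ncpus, const char *root)
{
	int i, status, ok = 0;

	for (i = 0; i < ntasks; i++) {
		pid_t pid = fork();

		if (pid < 0)
			break;		/* fork failed: stop creating tasks */
		if (pid == 0) {
			bind_self_to_cpu(cpu_for_task(i, ncpus));
			read_tree(root);
			_exit(0);
		}
	}
	while (wait(&status) > 0)
		if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
			ok++;
	return ok;
}
```

Under these assumptions, something like run_tasks(400, 8, "/sys/module") would correspond
to the '-p 8 -t 400' example from the preface, with 50 tasks pinned to each CPU. Timing and
the cycle counting behind '-c' are left out of the sketch.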

>> We can see that 4000 tasks running on 8 CPUs simultaneously create very heavy
>> contention on the mutex lock, so lots of tasks enter the slow path of the mutex lock.
>> I am very curious how things would go if we switched the mutex to a semaphore in this case.
>> My next plan
>
> Seems like an unfinished sentence.

Oh, I mean my next plan is to do some performance analysis of the two primitives with this tool.

> Thanks,
>
>        Ingo
