Date:	Tue, 3 Apr 2012 09:52:45 +0200
From:	Ingo Molnar <mingo@...nel.org>
To:	"Chen, Dennis (SRDC SW)" <Dennis1.Chen@....com>
Cc:	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"mingo@...hat.com" <mingo@...hat.com>
Subject: Re: semaphore and mutex in current Linux kernel (3.2.2)


* Chen, Dennis (SRDC SW) <Dennis1.Chen@....com> wrote:

> > Well, a way to reproduce that would be to find a mutex_lock()
> > intense workload ('perf top -g', etc.), then change the
> > underlying mutex back to a semaphore, and measure the
> > performance of the two primitives.
>
> Why not use the 'test-mutex' tool from the document?
>
> I guess it's a private tool of yours - if you can share it 
> with me, I can help run a new round of performance checks 
> of the two primitives on the latest kernel... make a deal?

I think I posted it back then - but IIRC it was really just an 
open-coded mutex fastpath executed in user-space by a couple of 
threads. To do that today you'd have to create it anew: just 
copy the current mutex fastpath to user-space and measure it.
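
Very roughly, such a user-space replica could look like the 
sketch below. It uses the classic futex-based mutex from Ulrich 
Drepper's "Futexes Are Tricky" paper as a stand-in for the 
kernel's fastpath - a single atomic op when uncontended, a futex 
syscall in the slowpath - so it's illustrative, not the kernel's 
actual code. Timed single-threaded it gives the uncontended 
fastpath cost:

#define _GNU_SOURCE
#include <linux/futex.h>
#include <stdatomic.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* 0 == unlocked, 1 == locked, 2 == locked with waiters */
static atomic_int m;

static void mutex_lock(void)
{
        int c = 0;

        /* fastpath: a single CAS when the lock is uncontended */
        if (atomic_compare_exchange_strong(&m, &c, 1))
                return;

        /* slowpath: mark the lock contended, sleep in the kernel */
        if (c != 2)
                c = atomic_exchange(&m, 2);
        while (c != 0) {
                syscall(SYS_futex, &m, FUTEX_WAIT, 2, NULL, NULL, 0);
                c = atomic_exchange(&m, 2);
        }
}

static void mutex_unlock(void)
{
        /* fastpath: a single atomic op when there are no waiters */
        if (atomic_fetch_sub(&m, 1) != 1) {
                atomic_store(&m, 0);
                syscall(SYS_futex, &m, FUTEX_WAKE, 1, NULL, NULL, 0);
        }
}

int main(void)
{
        enum { ITERS = 10 * 1000 * 1000 };
        struct timespec a, b;

        clock_gettime(CLOCK_MONOTONIC, &a);
        for (int i = 0; i < ITERS; i++) {
                mutex_lock();
                mutex_unlock();
        }
        clock_gettime(CLOCK_MONOTONIC, &b);

        printf("%.2f ns per uncontended lock+unlock\n",
               ((b.tv_sec - a.tv_sec) * 1e9 +
                (b.tv_nsec - a.tv_nsec)) / ITERS);
        return 0;
}

Build with 'gcc -O2'; it is Linux-only due to the raw futex 
syscall.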

I'm not sure what the point of comparative measurements against 
semaphores would be: we no longer have per-architecture 
optimized semaphores - the legacy semaphores were switched to a 
generic implementation and are being phased out.

Mutexes have various advantages (such as lockdep coverage and in 
general tighter semantics that make their usage more robust), 
and we aren't going back to semaphores.

What would make a ton of sense is to create a 'perf bench' 
module that uses the kernel's mutex code and measures it in 
user-space. 'perf bench mem' already does a simplified form of 
that: it measures the kernel's memcpy and memset 
routines:

$ perf bench mem memcpy -r help
# Running mem/memcpy benchmark...
Unknown routine:help
Available routines...
	default ... Default memcpy() provided by glibc
	x86-64-unrolled ... unrolled memcpy() in arch/x86/lib/memcpy_64.S
	x86-64-movsq ... movsq-based memcpy() in arch/x86/lib/memcpy_64.S
	x86-64-movsb ... movsb-based memcpy() in arch/x86/lib/memcpy_64.S

$ perf bench mem memcpy -r x86-64-movsq
# Running mem/memcpy benchmark...
# Copying 1MB Bytes ...

       2.229595 GB/Sec
      10.850694 GB/Sec (with prefault)

$ perf bench mem memcpy -r x86-64-movsb
# Running mem/memcpy benchmark...
# Copying 1MB Bytes ...

       2.055921 GB/Sec
       2.447525 GB/Sec (with prefault)

So what could be done is to add something like:

  perf bench locking mutex

which would, similarly to tools/perf/bench/mem-memcpy-x86-64-asm.S 
et al, inline the mutex routines, build user-space glue around 
them and allow them to be measured.
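
The glue itself would be small. Here is a sketch of the harness 
side, with pthread_mutex_t standing in where the inlined kernel 
routines would go (one thread exercises the uncontended path, 
more threads the contended one) - the shape is illustrative, not 
a proposed patch:

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define ITERS (1000 * 1000)

/* placeholder: a real 'perf bench locking mutex' would inline
 * the kernel's mutex routines here instead of pthread_mutex_t */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
        for (int i = 0; i < ITERS; i++) {
                pthread_mutex_lock(&lock);
                pthread_mutex_unlock(&lock);
        }
        return NULL;
}

int main(int argc, char **argv)
{
        int i, nthreads = argc > 1 ? atoi(argv[1]) : 1;
        pthread_t tid[64];
        struct timespec a, b;

        if (nthreads < 1 || nthreads > 64)
                return 1;

        clock_gettime(CLOCK_MONOTONIC, &a);
        for (i = 0; i < nthreads; i++)
                pthread_create(&tid[i], NULL, worker, NULL);
        for (i = 0; i < nthreads; i++)
                pthread_join(tid[i], NULL);
        clock_gettime(CLOCK_MONOTONIC, &b);

        printf("%d thread(s): %.2f ns per lock+unlock pair\n",
               nthreads,
               ((b.tv_sec - a.tv_sec) * 1e9 +
                (b.tv_nsec - a.tv_nsec)) /
               ((double)nthreads * ITERS));
        return 0;
}

Built with 'gcc -O2 -pthread', running it with 1 and then N 
threads gives the uncontended vs. contended cost per pair.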

Such a new feature could then be used to improve mutex 
performance in the future. Likewise:

  perf bench locking spinlock

could be used to do the same for spinlocks - measuring them 
contended and uncontended, cache-cold and cache-hot, etc.
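
A user-space stand-in for that could be a ticket lock - the same 
grab-a-ticket, spin-until-served scheme the x86 arch spinlock 
uses - timed once single-threaded and once with a few threads. 
Again just an illustrative sketch, not the kernel's code:

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <time.h>

#define ITERS (1000 * 1000)

/* ticket lock: grab a ticket, spin until it is being served */
static atomic_uint next_ticket, now_serving;

static void spin_lock(void)
{
        unsigned int me = atomic_fetch_add(&next_ticket, 1);

        while (atomic_load(&now_serving) != me)
                ; /* the kernel would cpu_relax() here */
}

static void spin_unlock(void)
{
        atomic_fetch_add(&now_serving, 1);
}

static void *worker(void *arg)
{
        for (int i = 0; i < ITERS; i++) {
                spin_lock();
                spin_unlock();
        }
        return NULL;
}

static double measure(int nthreads)
{
        pthread_t tid[8];
        struct timespec a, b;
        int i;

        clock_gettime(CLOCK_MONOTONIC, &a);
        for (i = 0; i < nthreads; i++)
                pthread_create(&tid[i], NULL, worker, NULL);
        for (i = 0; i < nthreads; i++)
                pthread_join(tid[i], NULL);
        clock_gettime(CLOCK_MONOTONIC, &b);

        return ((b.tv_sec - a.tv_sec) * 1e9 +
                (b.tv_nsec - a.tv_nsec)) /
               ((double)nthreads * ITERS);
}

int main(void)
{
        printf("uncontended: %7.2f ns per lock+unlock\n", measure(1));
        printf("contended:   %7.2f ns per lock+unlock\n", measure(4));
        return 0;
}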

This would then be used by all future generations of kernel 
developers to improve these locking primitives - avoiding the 
kind of obsolescence and bitrot that hit test-mutex :-)

So if you'd be interested in writing that brand new benchmarking 
feature and need help, let the perf people know.

Thanks,

	Ingo
