linux-kernel - Re: [PATCH RFC] x86: avoid atomic operation in test_and_set_bit

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <201103251734.55239.knikanth@suse.de>
Date:	Fri, 25 Mar 2011 17:34:54 +0530
From:	Nikanth Karthikesan <knikanth@...e.de>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Jan Beulich <JBeulich@...ell.com>, Jack Steiner <steiner@....com>,
	Borislav Petkov <bp@...64.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Nick Piggin <npiggin@...nel.dk>,
	"x86@...nel.org" <x86@...nel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Arnaldo Carvalho de Melo <acme@...hat.com>,
	Ingo Molnar <mingo@...hat.com>, tee@....com,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"H. Peter Anvin" <hpa@...or.com>
Subject: Re: [PATCH RFC] x86: avoid atomic operation in test_and_set_bit_lock if possible

On Friday, March 25, 2011 04:40:13 pm Ingo Molnar wrote:
> * Jan Beulich <JBeulich@...ell.com> wrote:
> > >>> On 24.03.11 at 18:19, Ingo Molnar <mingo@...e.hu> wrote:
> > > * Jan Beulich <JBeulich@...ell.com> wrote:
> > >> Are you certain? Iirc the lock prefix implies minimally a read-for-
> > >> ownership (if CPUs are really smart enough to optimize away the
> > >> write - I wonder whether that would be correct at all when it
> > >> comes to locked operations), which means a cacheline can still be
> > >> bouncing heavily.
> > > 
> > > Yeah. On what workload was this?
> > > 
> > > Generally you use test_and_set_bit() if you expect it to be 'owned' by
> > > whoever calls it, and released by someone else.
> > > 
> > > It would be really useful to run perf top on an affected box and see
> > > which kernel function causes this. It might be better to add a
> > > test_bit() to the affected codepath - instead of bloating all
> > > test_and_set_bit() users.
> > 
> > Indeed, I agree with you and Linus in this aspect.
> > 
> > > Note that the patch can also cause overhead: the test_bit() can miss
> > > the cache, it will bring in the cacheline shared, and the subsequent
> > > test_and_set() call will then dirty the cacheline - so the CPU might
> > > miss again and has to wait for other CPUs to first flush this
> > > cacheline.
> > > 
> > > So we really need more details here.
> > 
> > The problem was observed with __lock_page() (in a variant not
> > upstream for reasons not known to me), and prefixing e.g.
> > trylock_page() with an extra PageLocked() check yielded the
> > below quoted improvements.
> 
> The page lock flag is indeed one of those (rather rare) exceptions to
> typical object locking patterns. So in that particular case adding the
> PageLocked() test to trylock_page() would be the right approach to
> improving performance.
> 
> In the common case this change actively hurts for various reasons:
> 
>  - can turn a cache miss into two cache misses
>  - adds an often unnecessary branch instruction
>  - adds often unnecessary bloat
>  - leaks a barrier
> 

Yes, I think I am observing these ill-effects when testing the code copied to 
user-space.

Thanks
Nikanth
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/