linux-kernel - Re: [PATCH RFC] x86: avoid atomic operation in test_and_set_bit

Message-ID: <20110324171924.GC2414@elte.hu>
Date:	Thu, 24 Mar 2011 18:19:24 +0100
From:	Ingo Molnar <mingo@...e.hu>
To:	Jan Beulich <JBeulich@...ell.com>
Cc:	Borislav Petkov <bp@...64.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Nick Piggin <npiggin@...nel.dk>,
	"x86@...nel.org" <x86@...nel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Ingo Molnar <mingo@...hat.com>, Jack Steiner <steiner@....com>,
	tee@....com, Nikanth Karthikesan <knikanth@...e.de>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"H. Peter Anvin" <hpa@...or.com>,
	Arnaldo Carvalho de Melo <acme@...hat.com>
Subject: Re: [PATCH RFC] x86: avoid atomic operation in test_and_set_bit_lock
 if possible


* Jan Beulich <JBeulich@...ell.com> wrote:

> >>> On 24.03.11 at 15:52, Borislav Petkov <bp@...64.org> wrote:
> 
> (haven't seen Ingo's original reply, so responding here)
> 
> > On Thu, Mar 24, 2011 at 04:56:47AM -0400, Ingo Molnar wrote:
> >> 
> >> * Nikanth Karthikesan <knikanth@...e.de> wrote:
> >> 
> >> > On x86_64 SMP with lots of CPU atomic instructions which assert the LOCK #
> >> > signal can stall other CPUs. And as the number of cores increase this
> >> > penalty scales proportionately. So it is best to try and avoid atomic
> >> > instructions wherever possible. test_and_set_bit_lock() can avoid using
> >> > LOCK_PREFIX if it finds the bit set already.
> >> > 
> >> > Signed-off-by: Nikanth Karthikesan <knikanth@...e.de>
> > 
> > [..]
> > 
> >> > + * test_and_set_bit_lock - Set a bit and return its old value for lock
> >> > + * @nr: Bit to set
> >> > + * @addr: Address to count from
> >> > + *
> >> > + * This is the same as test_and_set_bit on x86. But atomic operation is
> >> > + * avoided, if the bit was already set.
> >> > + */
> >> > +static __always_inline int
> >> > +test_and_set_bit_lock(int nr, volatile unsigned long *addr)
> >> > +{
> >> > +#ifdef CONFIG_SMP
> >> > +	barrier();
> >> > +	if (test_bit(nr, addr))
> >> > +		return 1;
> >> > +#endif
> >> > +	return test_and_set_bit(nr, addr);
> >> > +}
> >>
> >> On modern x86 CPUs there's no "#LOCK signal" anymore - it's replaced
> >> by a M[O]ESI cache coherency bus. I'd expect modern x86 CPUs to be
> >> pretty fast when the cacheline is local and the bit is set already.
> 
> Are you certain? Iirc the lock prefix implies minimally a read-for-
> ownership (if CPUs are really smart enough to optimize away the
> write - I wonder whether that would be correct at all when it
> comes to locked operations), which means a cacheline can still be
> bouncing heavily.

Yeah. On what workload was this?

Generally you use test_and_set_bit() if you expect it to be 'owned' by whoever 
calls it, and released by someone else.

It would be really useful to run perf top on an affected box and see which 
kernel function causes this. It might be better to add a test_bit() to the 
affected codepath - instead of bloating all test_and_set_bit() users.
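
A rough userspace sketch of that idea, for illustration only: it uses C11 
atomics rather than the kernel bitops, and every sketch_* name is hypothetical 
(not from the patch or the kernel). Only the one hot call site does the cheap 
read first; the generic test-and-set helper stays unchanged.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Stand-in for test_bit(): a plain load, no locked read-modify-write. */
static inline bool sketch_test_bit(unsigned int nr, _Atomic unsigned long *addr)
{
	unsigned long word = atomic_load_explicit(&addr[nr / BITS_PER_LONG],
						  memory_order_relaxed);

	return (word >> (nr % BITS_PER_LONG)) & 1UL;
}

/* Stand-in for test_and_set_bit(): always a locked RMW; returns the old bit. */
static inline bool sketch_test_and_set_bit(unsigned int nr,
					   _Atomic unsigned long *addr)
{
	unsigned long mask = 1UL << (nr % BITS_PER_LONG);
	unsigned long old = atomic_fetch_or_explicit(&addr[nr / BITS_PER_LONG],
						     mask, memory_order_acquire);

	return (old & mask) != 0;
}

/*
 * Hypothetical contended call site: skip the locked operation when the
 * bit is already visibly set. Returns true if we acquired the bit.
 */
static inline bool sketch_hot_path_trylock(unsigned int nr,
					   _Atomic unsigned long *addr)
{
	if (sketch_test_bit(nr, addr))
		return false;

	return !sketch_test_and_set_bit(nr, addr);
}
```

Whether the extra read is a win at a given call site would still have to be 
measured there.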

Note that the patch can also cause overhead: the test_bit() can miss the cache 
and bring in the cacheline in shared state, and the subsequent 
test_and_set_bit() call will then dirty the cacheline - so the CPU might miss 
again and have to wait for other CPUs to first flush this cacheline.

So we really need more details here.

Thanks,

	Ingo