linux-kernel - Re: [PATCH RFC] x86: avoid atomic operation in test_and_set_bit


Message-ID: <20110325164944.GA19854@sgi.com>
Date:	Fri, 25 Mar 2011 11:49:44 -0500
From:	Jack Steiner <steiner@....com>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Jan Beulich <JBeulich@...ell.com>, Ingo Molnar <mingo@...e.hu>,
	Borislav Petkov <bp@...64.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Nick Piggin <npiggin@...nel.dk>,
	"x86@...nel.org" <x86@...nel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Arnaldo Carvalho de Melo <acme@...hat.com>,
	Ingo Molnar <mingo@...hat.com>, tee@....com,
	Nikanth Karthikesan <knikanth@...e.de>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"H. Peter Anvin" <hpa@...or.com>
Subject: Re: [PATCH RFC] x86: avoid atomic operation in
	test_and_set_bit_lock if possible

On Fri, Mar 25, 2011 at 09:29:34AM -0700, Linus Torvalds wrote:
> On Fri, Mar 25, 2011 at 3:06 AM, Jan Beulich <JBeulich@...ell.com> wrote:
> >
> > The problem was observed with __lock_page() (in a variant not
> > upstream for reasons not known to me), and prefixing e.g.
> > trylock_page() with an extra PageLocked() check yielded the
> > below quoted improvements.
> 
> Ok. __lock_page() _definitely_ should do the test_bit() thing first,
> because it's normally called from lock_page() that has already tested
> the bit.
> 
> But it already seems to do that, so I'm wondering what your variant is.
> 
> I'm also a bit surprised that lock_page() is that hot (unless your
> __lock_page() variant is simply too broken and ends up spinning?).
> Maybe we have some path that takes the page lock unnecessarily? What's
> the load?

We see the problem primarily when launching very large MPI applications.
The master process rapidly forks a large number of processes (one per CPU).
Each of them faults in a large number of text pages.

The text pages are resident in the page cache. No I/O is involved, but
the page lock quickly becomes a very hot, contended cacheline.
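
For illustration only, here is a minimal userspace sketch of the "test the bit
first" idea discussed in this thread, written with C11 atomics rather than the
kernel's bitops. The names trylock_bit_checked(), unlock_bit() and
PG_LOCKED_BIT are hypothetical and are not the kernel's implementation. The
point is that a plain load of an already-set bit can be satisfied from a shared
cacheline, whereas an unconditional atomic read-modify-write requests the line
exclusively even when the attempt is bound to fail.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit index, chosen for illustration only. */
#define PG_LOCKED_BIT 0UL

/* Try to take the lock bit; returns true if this caller acquired it. */
static bool trylock_bit_checked(atomic_ulong *flags)
{
    const unsigned long mask = 1UL << PG_LOCKED_BIT;

    /* Cheap check first: if the bit is already set, fail without
     * issuing an atomic read-modify-write on the contended line. */
    if (atomic_load_explicit(flags, memory_order_relaxed) & mask)
        return false;

    /* The bit looked clear, so do the real atomic test-and-set.
     * Another thread may still win the race; the old value tells us. */
    return !(atomic_fetch_or_explicit(flags, mask, memory_order_acquire) & mask);
}

/* Release the lock bit with release ordering. */
static void unlock_bit(atomic_ulong *flags)
{
    atomic_fetch_and_explicit(flags, ~(1UL << PG_LOCKED_BIT),
                              memory_order_release);
}
```

A caller that fails the cheap check would normally sleep or back off rather
than retry in a tight loop; that part is left out of the sketch.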

Note also that this is observed in a 2.6.32 distro kernel that has a
different implementation of __lock_page. I think a similar problem
exists in the upstream kernel but have not had a chance to investigate.

We also see a similar problem during boot when a large number of udevd
processes are created.


--- jack
