linux-kernel - Re: gcc inlining heuristics was Re: [PATCH -v7][RFC]: mutex: implement adaptive spinning

Message-ID: <20090121095418.GG15750@one.firstfloor.org>
Date:	Wed, 21 Jan 2009 10:54:18 +0100
From:	Andi Kleen <andi@...stfloor.org>
To:	Nick Piggin <npiggin@...e.de>
Cc:	Andi Kleen <andi@...stfloor.org>, Ingo Molnar <mingo@...e.hu>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	David Woodhouse <dwmw2@...radead.org>,
	Bernd Schmidt <bernds_cb1@...nline.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Harvey Harrison <harvey.harrison@...il.com>,
	"H. Peter Anvin" <hpa@...or.com>,
	Chris Mason <chris.mason@...cle.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	paulmck@...ux.vnet.ibm.com, Gregory Haskins <ghaskins@...ell.com>,
	Matthew Wilcox <matthew@....cx>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	linux-fsdevel <linux-fsdevel@...r.kernel.org>,
	linux-btrfs <linux-btrfs@...r.kernel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Peter Morreale <pmorreale@...ell.com>,
	Sven Dietrich <SDietrich@...ell.com>, jh@...e.cz
Subject: Re: gcc inlining heuristics was Re: [PATCH -v7][RFC]: mutex: implement adaptive spinning

> The point is that the compiler is then free to do it. If things
> slow down after the compiler gets *more* information, then that
> is a problem with the compiler heuristics rather than the
> information we give it.

The point was that -Os typically disables it then.
(Not always; compiler heuristics are far from perfect.)

> 
>  
> > Then x86s tend to have very very fast L1 caches and
> > if something is not in L1 on reads then the cost of fetching
> > something for a read dwarfs the few cycles you can typically
> > get out of this.
> 
> Well most architectures have L1 caches of several cycles. And
> L1 miss typically means going to L2 which in some cases the
> compiler is expected to attempt to cover as much as possible
> (eg in-order architectures).

L2 cache is so much slower that scheduling a few more
instructions doesn't help much.

> stall, so you still want to get loads out early if possible.
> 
> Even a lot of OOOE CPUs I think won't have the best alias
> analysis, so all else being equal, it wouldn't hurt them to
> move loads earlier.

Hmm, but if the load is nearby it won't matter if a 
store is in the middle, because the CPU will just execute
over it.

The real big win is if you do some computation in between,
but at least for typical list manipulation there isn't
really any.
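
As a minimal sketch of the aliasing question discussed above (the
function names are hypothetical and not taken from the kernel): without
restrict the compiler has to assume the store may hit the location being
read and must reload it afterwards; with restrict it is allowed to reuse
the earlier load or move the second load ahead of the store. Whether
that buys anything measurable on a given CPU is exactly what is being
debated here.

```c
/* Hypothetical helpers for illustration only; not kernel code. */

/* dst may alias src, so *src has to be reloaded after the store. */
int sum_after_store(int *dst, const int *src)
{
    int a = *src;   /* first load */
    *dst = 1;       /* store in the middle */
    int b = *src;   /* second load: cannot be merged with the first */
    return a + b;
}

/* restrict promises that dst and src do not alias, so the compiler
 * may reuse the first load or hoist the second one above the store.
 * Calling this with overlapping pointers is undefined behaviour. */
int sum_after_store_restrict(int *restrict dst, const int *restrict src)
{
    int a = *src;
    *dst = 1;
    int b = *src;   /* may be folded into 'a' by the compiler */
    return a + b;
}
```

With distinct pointers both versions return the same value; only the
non-restrict version may legally be called with dst == src.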

> > Also at least x86 gcc normally doesn't do scheduling 
> > beyond basic blocks, so any if () shuts it up.
> 
> I don't think any of this is a reason not to use restrict, though.
> But... there are so many places we could add it to the kernel, and
> probably so few where it makes much difference. Maybe it should be
> able to help some critical core code, though.

Frankly I think it would be another unlikely().
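
For readers unfamiliar with the comparison: unlikely() is the kernel's
branch-prediction hint, built on GCC's __builtin_expect. The sketch
below uses a simplified form of the macros and a hypothetical function
(checked_div is not a kernel API). Like restrict, it is an annotation
that can be added in a great many places while changing the generated
code only in some of them.

```c
/* Simplified form of the kernel's branch hint macros. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical example: the error path is marked as the cold one. */
int checked_div(int a, int b, int *out)
{
    if (unlikely(b == 0))
        return -1;      /* hinted as the rarely taken branch */
    *out = a / b;
    return 0;
}
```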

-Andi

-- 
ak@...ux.intel.com -- Speaking for myself only.