[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.LFD.2.00.0901090759420.6528@localhost.localdomain>
Date: Fri, 9 Jan 2009 08:09:07 -0800 (PST)
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Ingo Molnar <mingo@...e.hu>
cc: "H. Peter Anvin" <hpa@...or.com>, Andi Kleen <andi@...stfloor.org>,
Chris Mason <chris.mason@...cle.com>,
Peter Zijlstra <peterz@...radead.org>,
Steven Rostedt <rostedt@...dmis.org>,
paulmck@...ux.vnet.ibm.com, Gregory Haskins <ghaskins@...ell.com>,
Matthew Wilcox <matthew@....cx>,
Andrew Morton <akpm@...ux-foundation.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
linux-btrfs <linux-btrfs@...r.kernel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Nick Piggin <npiggin@...e.de>,
Peter Morreale <pmorreale@...ell.com>,
Sven Dietrich <SDietrich@...ell.com>
Subject: Re: [PATCH -v7][RFC]: mutex: implement adaptive spinning
On Fri, 9 Jan 2009, Ingo Molnar wrote:
>
> -static inline int constant_test_bit(int nr, const volatile unsigned long *addr)
> +static __asm_inline int
> +constant_test_bit(int nr, const volatile unsigned long *addr)
> {
> return ((1UL << (nr % BITS_PER_LONG)) &
> (((unsigned long *)addr)[nr / BITS_PER_LONG])) != 0;
Thios makes absolutely no sense.
It's called "__always_inline", not __asm_inline.
Why add a new nonsensical annotations like that?
Also, the very fact that gcc gets that function wrong WHEN 'nr' IS
CONSTANT (which is when it is called) just shows what kind of crap gcc is!
Ingo, the fact is, I care about size, but I care about debuggability and
sanity more. I don't care one _whit_ about 3% size differences, if they
are insane and cause idiotic per-compiler differences.
And you haven't done any interesting analysis per-file etc. It shoul be
almost _trivial_ to do CONFIG_OPTIMIZE_INLINING on/off tests for the whole
tree, and then comparing sizes of individual object files, and see if we
find some obvious _bug_ where we just inline too much.
In fact, we shouldn't even do that - we should try to find a mode where
gcc simply refuses to inline at all, and compare that to one where it
_only_ inlines the things we ask it to. Because that's the more relevant
test. The problem with gcc inlining is actually two-fold:
- gcc doesn't inline things we ask for
Here the sub-problem is that we ask for this too much, but see above on
how to figure -that- out!
- gcc _does_ inline things that we haven't marked at all, causing too
much stack-space to be used, and causing debugging problems.
And here the problem is that gcc should damn well not do that, at least
not as aggressively as it does!
IT DOES NOT MATTER if something is called in just one place and inlining
makes things smaller! If it's not a clear performance win (and it almost
never is, unless the function is really small), the inlining of especially
functions that aren't even hot in the cache is ONLY a negative thing.
Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists