[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20090111201427.GP26290@one.firstfloor.org>
Date: Sun, 11 Jan 2009 21:14:27 +0100
From: Andi Kleen <andi@...stfloor.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Andi Kleen <andi@...stfloor.org>,
David Woodhouse <dwmw2@...radead.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Ingo Molnar <mingo@...e.hu>,
Harvey Harrison <harvey.harrison@...il.com>,
"H. Peter Anvin" <hpa@...or.com>,
Chris Mason <chris.mason@...cle.com>,
Peter Zijlstra <peterz@...radead.org>,
Steven Rostedt <rostedt@...dmis.org>,
paulmck@...ux.vnet.ibm.com, Gregory Haskins <ghaskins@...ell.com>,
Matthew Wilcox <matthew@....cx>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
linux-btrfs <linux-btrfs@...r.kernel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Nick Piggin <npiggin@...e.de>,
Peter Morreale <pmorreale@...ell.com>,
Sven Dietrich <SDietrich@...ell.com>, jh@...e.cz
Subject: gcc inlining heuristics was Re: [PATCH -v7][RFC]: mutex: implement adaptive spinning
On Sun, Jan 11, 2009 at 11:25:32AM -0800, Linus Torvalds wrote:
>
>
> On Sun, 11 Jan 2009, Andi Kleen wrote:
> >
> > The proposal was to use -fno-inline-functions-called-once (but
> > the resulting numbers were not promising)
>
> Well, the _optimal_ situation would be to not need it, because gcc does a
> good job without it. That implies trying to find a better balance between
> "worth it" and "causes problems".
>
> Rigth now, it does sound like gcc simply doesn't try to balance AT ALL, or
> balances only when we add some very version-specific random options (ie
> the stack-usage one).
The gcc 4.3 inliner takes stack growth into account by default (without
any special options). I experimented a bit with it when that
was introduced and found the default thresholds are too large for the kernel
and don't change the checkstack.pl picture much.
I asked back then and was told --param large-stack-frame
is expected to be a reasonable stable --param (as much as these can
be) and I did a patch to lower it, but I couldn't get myself
to actually submit it [if you really want it I can send it].
But of course that only helps for gcc 4.3+, older gccs would need
a different workaround.
On the other hand (my personal opinion, not shared by everyone) is
that the ioctl switch stack issue is mostly only a problem with 4K
stacks and in the rare cases when I still run 32bit kernels
I never set that option because I consider it russian roulette
(because there undoutedly dangerous dynamic stack growth cases that
checkstack.pl doesn't flag)
> And even those options don't actually make much
> sense - yes, they "balance" things, but they don't do it in a sensible
> manner.
>
> For example: stack usage is undeniably a problem (we've hit it over and
> over again), but it's not about "stack must not be larger than X bytes".
>
> If the call is done unconditionally, then inlining _one_ function will
> grow the static stack usage of the function we inline into, but it will
> _not_ grow the dynamic stack usage one whit - so deciding to not inline
> because of stack usage is pointless.
Don't think the current inliner takes that into account from
a quick look at the sources, although it probably could.
Maybe Honza can comment.
But even if it did it could only do that for a single file,
but if the function is in a different file gcc doesn't
have the information (unless you run with David's --combine hack).
This means the kernel developers have to do it anyways.
On the other hand I'm not sure it's that big a problem. Just someone
should run make checkstack occasionally and add noinlines to large
offenders.
-Andi
[keep quote for Honza's benefit]
>
> See? So "stop inlining when you hit a stack limit" IS THE WRONG THING TO
> DO TOO! Because it just means that the compiler continues to do bad
> inlining decisions until it hits some magical limit - but since the
> problem isn't the static stack size of any _single_ function, but the
> combined stack size of a dynamic chain of them, that's totally idiotic.
> You still grew the dynamic stack, and you have no way of knowing by how
> much - the limit on the static one simply has zero bearing what-so-ever on
> the dynamic one.
>
> So no, "limit static stack usage" is not a good option, because it stops
> inlining when it doesn't matter (single unconditional call), and doesn't
> stop inlining when it might (lots of sequential calls, in a deep chain).
>
> The other alternative is to let gcc do what it does, but
>
> (a) remove lots of unnecessary 'inline's. And we should likely do this
> regardless of any "-fno-inline-functions-called-once" issues.
>
> (b) add lots of 'noinline's to avoid all the cases where gcc screws up so
> badly that it's either a debugging disaster or an actual correctness
> issue.
>
> The problem with (b) is that it's a lot of hard thinking, and debugging
> disasters always happen in code that you didn't realize would be a problem
> (because if you had, it simply wouldn't be the debugging issue it is).
>
> Linus
>
--
ak@...ux.intel.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists