Message-ID: <alpine.LFD.2.00.0901121034170.6528@localhost.localdomain>
Date: Mon, 12 Jan 2009 11:02:17 -0800 (PST)
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Bernd Schmidt <bernds_cb1@...nline.de>
cc: Andi Kleen <andi@...stfloor.org>,
David Woodhouse <dwmw2@...radead.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Ingo Molnar <mingo@...e.hu>,
Harvey Harrison <harvey.harrison@...il.com>,
"H. Peter Anvin" <hpa@...or.com>,
Chris Mason <chris.mason@...cle.com>,
Peter Zijlstra <peterz@...radead.org>,
Steven Rostedt <rostedt@...dmis.org>,
paulmck@...ux.vnet.ibm.com, Gregory Haskins <ghaskins@...ell.com>,
Matthew Wilcox <matthew@....cx>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
linux-btrfs <linux-btrfs@...r.kernel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Nick Piggin <npiggin@...e.de>,
Peter Morreale <pmorreale@...ell.com>,
Sven Dietrich <SDietrich@...ell.com>, jh@...e.cz
Subject: Re: gcc inlining heuristics was Re: [PATCH -v7][RFC]: mutex: implement
adaptive spinning
On Mon, 12 Jan 2009, Bernd Schmidt wrote:
>
> Something at the back of my mind said "aliasing".
>
> $ gcc linus.c -O2 -S ; grep subl linus.s
> subl $1624, %esp
> $ gcc linus.c -O2 -S -fno-strict-aliasing; grep subl linus.s
> subl $824, %esp
>
> That's with 4.3.2.
Interesting.
Nonsensical, but interesting.
Since they have no overlap in lifetime, confusing this with aliasing is
really really broken (if the functions _hadn't_ been inlined, you'd have
gotten the same address for the two variables anyway! So anybody who
thinks that they need different addresses because they are different types
is really really fundamentally confused!).
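A minimal sketch of the non-inlined case described above, assuming GCC's `noinline` function attribute. The names `big_a`, `big_b`, `touch`, `use_a`, `use_b` and `caller` are hypothetical and not part of the original test case. Each callee gets its own frame, which is released on return, so the caller's peak stack use is bounded by the larger callee rather than by the sum. Whether the two locals land on exactly the same address depends on the compiler and ABI.

```c
struct big_a {
	int a;
	unsigned long array[200];
};

struct big_b {
	int b;
	unsigned long array[100];
};

/* Hypothetical stand-in for an external consumer: stores into the
 * object so the compiler cannot discard it, then returns flag. */
static int touch(int flag, void *p)
{
	*(volatile int *)p = flag;
	return flag;
}

/* noinline keeps each large local in its callee's own frame. */
__attribute__((noinline))
static int use_a(int flag)
{
	struct big_a a;
	return touch(flag, &a);
}

__attribute__((noinline))
static int use_b(int flag)
{
	struct big_b b;
	return touch(flag, &b);
}

/* The lifetimes of 'a' and 'b' never overlap: use_a() has returned
 * and dropped its frame before use_b() sets up its own. */
int caller(int flag)
{
	use_a(flag);
	return use_b(flag);
}
```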
But your numbers are unambiguous, and I can see the effect of that
compiler flag myself.
The good news is that the kernel obviously already uses
-fno-strict-aliasing for other reasons, so we should see this effect
already, _despite_ it making no sense. And the stack usage still causes
problems.
Oh, and I see why. This test-case shows it clearly.
Note how the max stack usage _should_ be "struct b" + "struct c". Note how
it isn't (it's "struct a" + "struct b/c").
So what seems to be going on is that gcc is able to do some per-slot
sharing, but if you have one function with a single large entity, and
another with a couple of different ones, gcc can't do any smart
allocation.
Put another way: gcc doesn't create a "union of the set of different stack
usages" (which would be optimal given a single frame, and generate the
stack layout of just the maximum possible size), it creates a "set of
unions of different stack usages" (which can be optimal in the trivial
cases, but not nearly optimal in practical cases).
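The two layouts can be modelled with ordinary C types to make the size difference concrete. This is only a sketch: the type and function names are hypothetical, and pairing "struct a" with "struct b" in one shared slot is an assumption about which slots gcc merges, chosen to match the "struct a" + "struct b/c" total described above. It says nothing about how gcc actually allocates slots internally.

```c
#include <assert.h>
#include <stddef.h>

struct a { int a; unsigned long array[200]; };
struct b { int b; unsigned long array[100]; };
struct c { int c; unsigned long array[100]; };

/* "Union of the set of stack usages": all of fn1's locals overlap
 * all of fn2's locals, so the frame is as large as the bigger set. */
union frame_union_of_sets {
	struct a fn1_locals;
	struct {
		struct b b;
		struct c c;
	} fn2_locals;
};

/* "Set of unions": sharing happens slot by slot. Here 'a' shares a
 * slot with 'b' (assumed pairing) and 'c' gets a slot of its own. */
struct frame_set_of_unions {
	union {
		struct a a;
		struct b b;
	} shared_slot;
	struct c own_slot;
};

size_t union_of_sets_size(void)
{
	return sizeof(union frame_union_of_sets);
}

size_t set_of_unions_size(void)
{
	return sizeof(struct frame_set_of_unions);
}

/* Sanity check of the model: per-slot sharing costs a + c, the
 * whole-frame union costs only b + c. */
void check_frame_model(void)
{
	assert(union_of_sets_size() == sizeof(struct b) + sizeof(struct c));
	assert(set_of_unions_size() == sizeof(struct a) + sizeof(struct c));
}
```

With a 4-byte int and a 4-byte unsigned long, the model gives 808 bytes for the whole-frame union against 1208 bytes for per-slot sharing, before any frame overhead the compiler adds.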
That explains the ioctl behavior - the structure use is usually pretty
complicated (ie it's almost never about just _one_ large stack slot, but
the ioctl cases tend to do random stuff with multiple slots).
So it doesn't add up to some horrible maximum of all sizes, but it also
doesn't end up coalescing stack usage very well.
Linus
---
struct a {
	int a;
	unsigned long array[200];
};

struct b {
	int b;
	unsigned long array[100];
};

struct c {
	int c;
	unsigned long array[100];
};

extern int fn3(int, void *);
extern int fn4(int, void *);

static inline __attribute__ ((always_inline))
int fn1(int flag)
{
	struct a a;
	return fn3(flag, &a);
}

static inline __attribute__ ((always_inline))
int fn2(int flag)
{
	struct b b;
	struct c c;
	return fn4(flag, &b) + fn4(flag, &c);
}

int fn(int flag)
{
	fn1(flag);
	if (flag & 1)
		return 0;
	return fn2(flag);
}