lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.LFD.2.00.0901111458290.6528@localhost.localdomain>
Date:	Sun, 11 Jan 2009 15:05:53 -0800 (PST)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Andi Kleen <andi@...stfloor.org>
cc:	David Woodhouse <dwmw2@...radead.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Ingo Molnar <mingo@...e.hu>,
	Harvey Harrison <harvey.harrison@...il.com>,
	"H. Peter Anvin" <hpa@...or.com>,
	Chris Mason <chris.mason@...cle.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	paulmck@...ux.vnet.ibm.com, Gregory Haskins <ghaskins@...ell.com>,
	Matthew Wilcox <matthew@....cx>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	linux-fsdevel <linux-fsdevel@...r.kernel.org>,
	linux-btrfs <linux-btrfs@...r.kernel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Nick Piggin <npiggin@...e.de>,
	Peter Morreale <pmorreale@...ell.com>,
	Sven Dietrich <SDietrich@...ell.com>, jh@...e.cz
Subject: Re: gcc inlining heuristics was Re: [PATCH -v7][RFC]: mutex: implement
 adaptive spinning



On Sun, 11 Jan 2009, Linus Torvalds wrote:
> On Sun, 11 Jan 2009, Andi Kleen wrote:
> > 
> > Was -- i think that got fixed in gcc. But again only in newer versions.
> 
> I doubt it. People have said that about a million times, it has never 
> gotten fixed, and I've never seen any actual proof.

In fact, I just double-checked.

Try this:

	struct a {
		unsigned long array[200];
		int a;
	};

	struct b {
		int b;
		unsigned long array[200];
	};

	extern int fn3(int, void *);
	extern int fn4(int, void *);

	static inline __attribute__((always_inline)) int fn1(int flag)
	{
		struct a a;
		return fn3(flag, &a);
	}

	static inline __attribute__((always_inline)) int fn2(int flag)
	{
		struct b b;
		return fn4(flag, &b);
	}

	int fn(int flag)
	{
		if (flag & 1)
			return fn1(flag);
		return fn2(flag);
	}

(yeah, I made sure it would inline with "always_inline" just so that the 
issue wouldn't be hidden by any "avoid stack frames" flags).

Gcc creates a big stack frame that contains _both_ 'a' and 'b', and does 
not merge the allocations together even though they clearly have no 
overlap in usage. Both 'a' and 'b' get 201 long-words (1608 bytes) of 
stack, causing the inlined version to have 3kB+ of stack, even though the 
non-inlined one would never use more than half of it.

So please stop claiming this is fixed. It's not fixed, never has been, and 
quite frankly, probably never will be because the lifetime analysis is 
hard enough (ie once you inline and there is any complex usage, CSE etc 
will quite possibly mix up the lifetimes - the above is clearly not any 
_realistic_ example).

So even if the above trivial case could be fixed, I suspect a more complex 
real-life case would still keep the allocations separate. Because merging 
the allocations and re-using the same stack for both really is pretty 
non-trivial, and the best solution is to simply not inline.

(And yeah, the above is such an extreme case that gcc seems to realize 
that it makes no sense to inline because the stack frame is _so_ big. I 
don't know what the default stack frame limit is, but it's apparently 
smaller than 1.5kB ;)

			Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ