Message-ID: <20120820101044.GE16230@one.firstfloor.org>
Date: Mon, 20 Aug 2012 12:10:44 +0200
From: Andi Kleen <andi@...stfloor.org>
To: Ingo Molnar <mingo@...nel.org>
Cc: Andi Kleen <andi@...stfloor.org>, linux-kernel@...r.kernel.org,
x86@...nel.org, mmarek@...e.cz, linux-kbuild@...r.kernel.org,
JBeulich@...e.com, akpm@...ux-foundation.org,
Linus Torvalds <torvalds@...ux-foundation.org>,
"H. Peter Anvin" <hpa@...or.com>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: RFC: Link Time Optimization support for the kernel
On Mon, Aug 20, 2012 at 09:48:35AM +0200, Ingo Molnar wrote:
>
> * Andi Kleen <andi@...stfloor.org> wrote:
>
> > This rather large patchkit enables gcc Link Time Optimization (LTO)
> > support for the kernel.
> >
> > With LTO gcc will do whole program optimizations for
> > the whole kernel and each module. This increases compile time,
> > but can generate faster code.
>
> By how much does it increase compile time?
All numbers are preliminary at this point. I am currently missing some
code quality and compile time improvements that LTO could deliver, because
I had to work around some issues that are fixable.
Compile time:
Compilation slowdown depends on the size of the largest binary. I see
between a 50% and a 4x increase in build time. The 4x case is mainly for
allyesconfig (so unlikely); a normal distro build, which is mostly modular,
or a defconfig-like build is closer to the 50%.
Currently I have to disable slim LTO, which essentially means everything
is compiled twice. Once that's fixed it should compile faster for
the normal case too (although it will still be slower than non-LTO).
A lot of the overhead on the larger builds also comes from some specific
gcc code that I'm working with the gcc developers to improve.
So the 4x extreme case will hopefully go down.
The large builds also currently suffer from too much memory
consumption. That will hopefully improve too, as gcc improves.
I wouldn't expect anyone to use it for day-to-day kernel hacking
(I understand that a 50% slowdown is annoying for that). It's more like a
"release build" mode.
The performance is currently also missing some improvements due
to workarounds.
Performance:
Hackbench goes about 5% faster, so the scheduler benefits. Kbuild
does not change much. Various network benchmarks over loopback
go faster too (best case seen so far: 18%+), so the network stack seems to
benefit. A lot of micro benchmarks go faster, sometimes by larger margins.
There are some minor regressions.
A lot of benchmarking on larger workloads is still outstanding.
But I believe the existing numbers are promising. Things will still
change; it's still early.
I would welcome any benchmarking from other people.
I also expect gcc to do more LTO optimizations in the future, so we'll
hopefully see more gains over time. Essentially it gives more
power to the compiler.
Long term it would also help the kernel source organization. For example,
with LTO there's no reason to have gigantic includes with large inlines,
because cross-file inlining works in an efficient way without reparsing.
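
As a minimal sketch of the cross-file inlining case (all names here are
hypothetical and not taken from the kernel tree), consider a small helper
that would normally be moved into a header as a static inline so that
callers in other files can inline it. With LTO it could stay an ordinary
function in its own .c file:

```c
/* Illustration only: function and file names are hypothetical. */
#include <stddef.h>

/* --- would live in clamp_add.c --- */
/*
 * Small helper with external linkage. Without LTO, a caller in another
 * .c file has to emit a real call; the usual workaround is to turn it
 * into a "static inline" in a header that every user re-parses.
 */
int add_clamped(int acc, int value, int limit)
{
    int sum = acc + value;

    return sum > limit ? limit : sum;
}

/* --- would live in caller.c, seeing only a prototype of add_clamped --- */
/* Sums the array, saturating the running total at 'limit'. */
int sum_clamped(const int *values, size_t count, int limit)
{
    int acc = 0;

    for (size_t i = 0; i < count; i++)
        acc = add_clamped(acc, values[i], limit);

    return acc;
}
```

Assuming the two halves are split into separate files, a build along the
lines of `gcc -O2 -flto clamp_add.c caller.c main.c` gives gcc the chance
to inline add_clamped into the loop at link time. Whether it actually does
so depends on the compiler's inlining heuristics.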
In theory (but that's not realized today) the automatic repartitioning of
compilation units could improve compile time with lots of small files.
-Andi