Message-ID: <CAMj1kXGMBaD5sVaMyJMKBnMhQ2jD1GE1CXdVKi+KPciv0x5RcQ@mail.gmail.com>
Date: Thu, 26 Sep 2024 12:07:01 +0200
From: Ard Biesheuvel <ardb@...nel.org>
To: Andi Kleen <ak@...ux.intel.com>
Cc: Ard Biesheuvel <ardb+git@...gle.com>, linux-kernel@...r.kernel.org, x86@...nel.org,
"H. Peter Anvin" <hpa@...or.com>, Andy Lutomirski <luto@...nel.org>, Peter Zijlstra <peterz@...radead.org>,
Uros Bizjak <ubizjak@...il.com>, Dennis Zhou <dennis@...nel.org>, Tejun Heo <tj@...nel.org>,
Christoph Lameter <cl@...ux.com>, Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Paolo Bonzini <pbonzini@...hat.com>, Vitaly Kuznetsov <vkuznets@...hat.com>,
Juergen Gross <jgross@...e.com>, Boris Ostrovsky <boris.ostrovsky@...cle.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>, Arnd Bergmann <arnd@...db.de>,
Masahiro Yamada <masahiroy@...nel.org>, Kees Cook <kees@...nel.org>,
Nathan Chancellor <nathan@...nel.org>, Keith Packard <keithp@...thp.com>,
Justin Stitt <justinstitt@...gle.com>, Josh Poimboeuf <jpoimboe@...nel.org>,
Arnaldo Carvalho de Melo <acme@...nel.org>, Namhyung Kim <namhyung@...nel.org>, hjl.tools@...il.com,
hubicka@....cz
Subject: Re: [RFC PATCH 25/28] x86: Use PIE codegen for the core kernel
On Thu, 26 Sept 2024 at 10:48, Andi Kleen <ak@...ux.intel.com> wrote:
>
>
> On Wed, Sep 25, 2024 at 11:23:39PM +0200, Ard Biesheuvel wrote:
> > > What matters is what it does to general performance.
> > >
> > > Traditionally even on x86-64 PIC/E has a cost and the kernel model
> > > was intended to avoid that.
> > >
> >
> > Is the x86_64 kernel C model specified anywhere, to your knowledge?
>
> The basics are in the ABI. Maybe some of the details of TLS / stack
> protector are missing (I guess that could be fixed, adding HJ)
>
> Some of the motivation was also in early papers like
> https://www.ucw.cz/~hubicka/papers/amd64/amd64.html
>
> I'm copying Honza Hubicka who did the original work.
>
Thanks.

So the psABI has

  Kernel code model:
  The kernel of an operating system is usually rather small but runs in
  the negative half of the address space.

plus an explanation of the ranges of symbolic references.

The problem here is that it is inherently a position dependent code
model, where 'the virtual address of code executed is known at link
time' as per the psABI.

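
As an illustration of the range argument: the point of keeping kernel
symbols at the very top of the address space is that a link-time address
can be encoded as a sign-extended 32-bit immediate or displacement. The
Python sketch below only models that arithmetic; the two example
addresses are assumptions chosen for illustration, not values taken from
the psABI or from this series.

```python
MASK64 = (1 << 64) - 1

def sign_extend32(value):
    """Sign-extend the low 32 bits of value to 64 bits."""
    low = value & 0xFFFFFFFF
    if low & 0x80000000:
        low |= 0xFFFFFFFF00000000
    return low & MASK64

def fits_sign_extended_imm32(addr):
    """True if addr survives truncation to 32 bits plus sign extension."""
    return sign_extend32(addr) == (addr & MASK64)

# Hypothetical addresses, for illustration only.
KERNEL_TEXT = 0xFFFFFFFF81000000  # inside the top 2 GiB
DIRECT_MAP = 0xFFFF888000000000   # negative half, but outside the top 2 GiB

print(fits_sign_extended_imm32(KERNEL_TEXT))
print(fits_sign_extended_imm32(DIRECT_MAP))
```

An address in the top 2 GiB passes the check; an address elsewhere in
the negative half does not, which is why the model ties the code to a
fixed, known virtual range.
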
We currently violate that assumption in two different ways:
- the decompressor applies KASLR by consuming the static ELF relocations
  (which are intended for consumption by the linker, not the loader) to
  relocate the executable to its randomized virtual address; those
  static relocations can go out of sync with the actual code when
  relaxations are applied;
- some of the startup code is now written in C, but is called via the
  1:1 mapping; absolute symbol references don't work in that context,
  and we rely on faith and a whole pile of hacks to ensure that this
  does not break.
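
Both cases above come down to the same arithmetic, which a toy model can
show. This is a sketch under stated assumptions: the base addresses and
offsets are hypothetical, and the functions model address computation
only, not real instruction encodings or the kernel's actual boot code.

```python
LINK_BASE = 0xFFFFFFFF80000000  # hypothetical link-time virtual base
SYM_OFFSET = 0x2000             # hypothetical offset of a variable in the image
CODE_OFFSET = 0x1000            # hypothetical offset of the referencing code

def actual_address(run_base):
    """Where the variable really lives when the image runs at run_base."""
    return run_base + SYM_OFFSET

def absolute_ref(run_base):
    """Absolute reference: the address is fixed at link time, so run_base
    is ignored unless something patches the reference afterwards."""
    return LINK_BASE + SYM_OFFSET

def pc_relative_ref(run_base):
    """PC-relative reference: current code address plus a displacement
    that was fixed at link time."""
    displacement = SYM_OFFSET - CODE_OFFSET
    return (run_base + CODE_OFFSET) + displacement

IDENTITY_BASE = 0x01000000              # hypothetical 1:1 (physical) mapping
KASLR_BASE = LINK_BASE + 0x0A000000     # hypothetical randomized base

for base in (LINK_BASE, IDENTITY_BASE, KASLR_BASE):
    print(hex(base),
          absolute_ref(base) == actual_address(base),
          pc_relative_ref(base) == actual_address(base))
```

In this model the absolute reference is only right when the code runs at
its link-time address, while the PC-relative one is right at all three
bases without any patching.
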
> >
> > > From my perspective this patch kit doesn't fix a real problem,
> > > it's all risk of performance regression with no gain.
> > >
> >
> > It's all in the cover letter and the commit logs so I won't rehash it
> > here, but I understand that your priorities may be different from
> > mine.
>
> It sounded fairly nebulous to me. If Linux wanted to support a third tool chain
> and it didn't support the kernel model yet it would be somehow easier.
> Apart from the kernel model likely being one of the minor issues
> in such an endeavour, I don't see a third tool chain other than gcc and llvm
> anywhere on the horizon?
>
I was referring to Rust, which is LLVM based but will also have a GCC
based alternative (gccrs) in the future. I am aware that these will
most likely reuse most of the existing backends, where these concerns
have the most impact, but they still need wider consideration than
they used to.

So the tl;dr for the rationale behind this series is that it is better
to use a code model and a relocation model that
a) match the reality of how our code operates, and
b) deviate as little as possible from how code is generally
   constructed for user space.

On top of that, it would be better to use relocation metadata that is
intended for consumption by a runtime loader, rather than rolling our
own based on a relocation format that is intended for consumption by
the build time linker.
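
For readers unfamiliar with loader-oriented relocation metadata, the
sketch below models the general idea, loosely based on ELF
R_X86_64_RELATIVE entries, where the value stored is the load base plus
an addend. It is an assumption-laden toy and not the format or code this
series uses: the image, the relocation table and the load base are all
hypothetical.

```python
import struct

MASK64 = (1 << 64) - 1

def apply_relative_relocs(image, relocs, load_base):
    """For each (offset, addend) entry, store load_base + addend as a
    little-endian 64-bit value at that offset in the image."""
    for offset, addend in relocs:
        value = (load_base + addend) & MASK64
        struct.pack_into("<Q", image, offset, value)

# Toy image: 32 bytes with one pointer slot at offset 8 that should
# point at offset 0x18 within the same image.
image = bytearray(32)
relocs = [(8, 0x18)]               # hypothetical relocation table
load_base = 0xFFFFFFFF9A000000     # hypothetical randomized base

apply_relative_relocs(image, relocs, load_base)
(slot,) = struct.unpack_from("<Q", image, 8)
print(hex(slot))
```

Each entry is self-describing: the loader needs only the base address
and the table, and never has to interpret the instructions around the
slot, which is what makes this kind of metadata robust against linker
relaxations.
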
> >
> > I'll provide some numbers about the impact on code size. Are there any
> > other performance related aspects that you think might be impacted by
> > the use of position independent code generation?
>
> Code size isn't a sufficient metric either.
>
> Linux sometimes goes to great lengths for small gains, for example
> there was a huge effort to avoid frame pointers, even though it's a
> small percentage delta. PIC could well be larger than frame pointers.
>
> You need to run it with some real workloads, e.g. some of the kernel
> oriented workloads in 0day or phoronix, and see if there are
> performance regressions.
>
> Unfortunately, for an intrusive change like this, the impact might also
> vary between different CPUs, so it may need some more coverage.
>
I'll spend some time looking into this in more detail (and get some
help internally at Google to measure the real-world impact).