Message-ID: <20181231072112.21051-1-namit@vmware.com>
Date: Sun, 30 Dec 2018 23:21:06 -0800
From: Nadav Amit <namit@...are.com>
To: Ingo Molnar <mingo@...hat.com>, Andy Lutomirski <luto@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Josh Poimboeuf <jpoimboe@...hat.com>,
Edward Cree <ecree@...arflare.com>
CC: "H . Peter Anvin" <hpa@...or.com>,
Thomas Gleixner <tglx@...utronix.de>,
LKML <linux-kernel@...r.kernel.org>,
Nadav Amit <nadav.amit@...il.com>, X86 ML <x86@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Borislav Petkov <bp@...en8.de>,
David Woodhouse <dwmw@...zon.co.uk>,
Nadav Amit <namit@...are.com>
Subject: [RFC v2 0/6] x86: dynamic indirect branch promotion

This is a revised version of optpolines (formerly named relpolines) for
dynamic indirect branch promotion, intended to reduce retpoline overheads
[1].

This version addresses some of the concerns that were raised before.
Accordingly, the code was slightly simplified, and patching is now done
using the regular int3/breakpoint mechanism.

Support for outline optpolines with multiple targets was added. I do not
think the way I implemented it is the correct one. In my original
(private) version, if there were more targets than the outline block
could hold, the outline block was removed completely. However, I think
this is more or less how Josh wanted it to be.
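
As a rough illustration of the control flow only (not the code in this
series), a promoted call site with an outline block can be pictured as
below. The handler functions, the path-tracking enum, and the block size
of two extra targets are all assumptions made for the sketch. Portable C
cannot patch call sites at run time, so the learned targets are
hard-coded, and the plain indirect call at the end stands in for the
retpoline.

```c
typedef int (*handler_t)(int);

/* Hypothetical call targets, used for illustration only. */
static int handler_a(int x) { return x + 1; }
static int handler_b(int x) { return x * 2; }
static int handler_c(int x) { return x - 3; }
static int handler_d(int x) { return -x; }

/* Records which path the last call took, so the sketch can be checked. */
enum path { PATH_INLINE, PATH_OUTLINE, PATH_FALLBACK };
static enum path last_path;

/*
 * Outline block: extra compare-and-direct-call pairs kept out of line.
 * Two slots is an assumed capacity, not a value taken from the patches.
 */
static int outline_block(handler_t fn, int arg)
{
	if (fn == handler_b) {
		last_path = PATH_OUTLINE;
		return handler_b(arg);	/* direct call */
	}
	if (fn == handler_c) {
		last_path = PATH_OUTLINE;
		return handler_c(arg);	/* direct call */
	}
	last_path = PATH_FALLBACK;
	return fn(arg);			/* stands in for the retpoline */
}

/* Promoted call site: the most likely target is compared inline first. */
static int promoted_call(handler_t fn, int arg)
{
	if (fn == handler_a) {
		last_path = PATH_INLINE;
		return handler_a(arg);	/* direct call */
	}
	return outline_block(fn, arg);
}
```

With these assumed handlers, promoted_call(handler_a, 1) takes the inline
path, handler_b and handler_c are served by the outline block, and
handler_d falls through to the indirect call.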

The code modifications are now done using a gcc-plugin. This makes it
easy to ignore code from init and other code sections. I think it should
also allow us to add opt-in/opt-out support for each branch, for example
by marking function pointers using address-space attributes.

All of these changes required dropping some optimizations to keep the
code simple. I have not yet rerun the benchmarks.

So I might not have addressed all the open issues, but it is rather hard
to finish the implementation, since some high-level decisions that are
still open affect the way in which optimizations should be done.
Specifically:
- Is it going to be the only indirect branch promotion mechanism? If so,
  it should probably also provide an interface similar to Josh's
  "static-calls" with annotations.
- Should it also be used when retpolines are disabled (in the config)?
  This does complicate the implementation a bit (RFC v1 supported it).
- Is it going to be opt-in or opt-out? If it is an opt-out mechanism,
  memory and performance optimizations need to be more aggressive.
- Do we use periodic learning or not? Josh suggested reconfiguring the
  branches whenever a new target is found. However, I do not know at
  this time how to do that efficiently, without making learning much
  more expensive.
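
To make the learning question more concrete, here is a minimal sketch of
the periodic variant: targets seen at one call site are counted during a
learning window, and the most frequent one becomes the promotion
candidate. The structure names, the table size, and the policy of
dropping samples once the table is full are all assumptions for
illustration; this is not the learning code in the patches.

```c
#include <stddef.h>

#define MAX_SAMPLES 4	/* hypothetical per-call-site table size */

struct target_sample {
	const void *target;	/* observed branch target */
	unsigned long hits;	/* times it was seen in this window */
};

struct learn_table {
	struct target_sample slots[MAX_SAMPLES];
	size_t used;
};

/* Record one observed target; new targets are dropped once the table is full. */
static void learn_record(struct learn_table *t, const void *target)
{
	size_t i;

	for (i = 0; i < t->used; i++) {
		if (t->slots[i].target == target) {
			t->slots[i].hits++;
			return;
		}
	}
	if (t->used < MAX_SAMPLES) {
		t->slots[t->used].target = target;
		t->slots[t->used].hits = 1;
		t->used++;
	}
}

/* Return the most frequently seen target, or NULL if nothing was recorded. */
static const void *learn_pick(const struct learn_table *t)
{
	const void *best = NULL;
	unsigned long best_hits = 0;
	size_t i;

	for (i = 0; i < t->used; i++) {
		if (t->slots[i].hits > best_hits) {
			best_hits = t->slots[i].hits;
			best = t->slots[i].target;
		}
	}
	return best;
}
```

Reconfiguring on every new target would instead mean acting inside
learn_record() as soon as an unknown target shows up, which is where the
extra cost mentioned above would come from.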
[1] https://lore.kernel.org/patchwork/cover/1001332/
Nadav Amit (6):
  x86: introduce kernel restartable sequence
  objtool: ignore instructions
  x86: patch indirect branch promotion
  x86: interface for accessing indirect branch locations
  x86: learning and patching indirect branch targets
  x86: outline optpoline

 arch/x86/Kconfig                             |    4 +
 arch/x86/entry/entry_64.S                    |   16 +-
 arch/x86/include/asm/nospec-branch.h         |   83 ++
 arch/x86/include/asm/sections.h              |    2 +
 arch/x86/kernel/Makefile                     |    1 +
 arch/x86/kernel/asm-offsets.c                |    9 +
 arch/x86/kernel/nospec-branch.c              | 1293 ++++++++++++++++++
 arch/x86/kernel/traps.c                      |    7 +
 arch/x86/kernel/vmlinux.lds.S                |    7 +
 arch/x86/lib/retpoline.S                     |   83 ++
 include/linux/cpuhotplug.h                   |    1 +
 include/linux/module.h                       |    9 +
 kernel/module.c                              |    8 +
 scripts/Makefile.gcc-plugins                 |    3 +
 scripts/gcc-plugins/x86_call_markup_plugin.c |  329 +++++
 tools/objtool/check.c                        |   21 +-
 16 files changed, 1872 insertions(+), 4 deletions(-)
 create mode 100644 arch/x86/kernel/nospec-branch.c
 create mode 100644 scripts/gcc-plugins/x86_call_markup_plugin.c
--
2.17.1