Message-Id: <20190718174110.4635-1-namit@vmware.com>
Date: Thu, 18 Jul 2019 10:41:03 -0700
From: Nadav Amit <namit@...are.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Andy Lutomirski <luto@...nel.org>, x86@...nel.org,
linux-kernel@...r.kernel.org,
Dave Hansen <dave.hansen@...ux.intel.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Nadav Amit <namit@...are.com>
Subject: [RFC 0/7] x86/percpu: Use segment qualifiers
GCC 6+ supports segment qualifiers. Using them makes several
optimizations possible:
1. Avoid unnecessary instructions when an operation is performed on a
per-cpu value that is read or written, and instead let the compiler
emit instructions that access the per-cpu value directly.
2. Make this_cpu_ptr() more efficient and allow its value to be cached,
since preemption must be disabled when this_cpu_ptr() is used.
3. Provide a better alternative to this_cpu_read_stable() that caches
values more efficiently, using an alias attribute to a const variable.
4. Allow the compiler to perform other optimizations (e.g. CSE).
5. Use rip-relative addressing in per_cpu_read_stable(), which makes it
PIE-ready.
"size" and Peter's compare do not seem to reflect the code-size
reduction correctly. Summing the code sizes reported by nm for a
defconfig build shows a minor reduction, from 11349763 to 11339840 (0.09%).
Nadav Amit (7):
  compiler: Report x86 segment support
  x86/percpu: Use compiler segment prefix qualifier
  x86/percpu: Use C for percpu accesses when possible
  x86: Fix possible caching of current_task
  percpu: Assume preemption is disabled on per_cpu_ptr()
  x86/percpu: Optimized arch_raw_cpu_ptr()
  x86/current: Aggressive caching of current
arch/x86/include/asm/current.h | 30 +++
arch/x86/include/asm/fpu/internal.h | 7 +-
arch/x86/include/asm/percpu.h | 293 +++++++++++++++++++------
arch/x86/include/asm/preempt.h | 3 +-
arch/x86/include/asm/resctrl_sched.h | 14 +-
arch/x86/kernel/cpu/Makefile | 1 +
arch/x86/kernel/cpu/common.c | 7 +-
arch/x86/kernel/cpu/current.c | 16 ++
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 4 +-
arch/x86/kernel/process_32.c | 4 +-
arch/x86/kernel/process_64.c | 4 +-
include/asm-generic/percpu.h | 12 +
include/linux/compiler-gcc.h | 4 +
include/linux/compiler.h | 2 +-
include/linux/percpu-defs.h | 33 ++-
15 files changed, 346 insertions(+), 88 deletions(-)
create mode 100644 arch/x86/kernel/cpu/current.c
--
2.17.1