Message-Id: <20190718174110.4635-1-namit@vmware.com>
Date:   Thu, 18 Jul 2019 10:41:03 -0700
From:   Nadav Amit <namit@...are.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Andy Lutomirski <luto@...nel.org>, x86@...nel.org,
        linux-kernel@...r.kernel.org,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Nadav Amit <namit@...are.com>
Subject: [RFC 0/7] x86/percpu: Use segment qualifiers 

GCC 6+ supports segment qualifiers. Using them enables several
optimizations:

1. Avoid unnecessary instructions when an operation reads or writes a
per-cpu value, and instead let the compiler emit instructions that
access the per-cpu value directly.

2. Make this_cpu_ptr() more efficient and allow its value to be cached,
since preemption must be disabled when this_cpu_ptr() is used.

3. Provide a better alternative to this_cpu_read_stable() that caches
values more efficiently by using an alias attribute to a const variable.

4. Allow the compiler to perform other optimizations (e.g. CSE).

5. Use rip-relative addressing in per_cpu_read_stable(), which makes it
PIE-ready.

"size" and Peter's compare do not seem to show the code size reduction
correctly. Summing the code size according to nm on defconfig shows a
minor reduction from 11349763 to 11339840 bytes (0.09%).
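
The percentage can be reproduced with a small helper. This is only an
illustration of the arithmetic; the function name reduction_percent()
is hypothetical:

```c
#include <assert.h>

/* Relative size reduction from `before` to `after`, in percent. */
double reduction_percent(unsigned long before, unsigned long after)
{
	assert(before != 0 && after <= before);
	return 100.0 * (double)(before - after) / (double)before;
}
```

With the figures above, reduction_percent(11349763, 11339840)
corresponds to 9923 bytes saved, which rounds to 0.09%.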

Nadav Amit (7):
  compiler: Report x86 segment support
  x86/percpu: Use compiler segment prefix qualifier
  x86/percpu: Use C for percpu accesses when possible
  x86: Fix possible caching of current_task
  percpu: Assume preemption is disabled on per_cpu_ptr()
  x86/percpu: Optimized arch_raw_cpu_ptr()
  x86/current: Aggressive caching of current

 arch/x86/include/asm/current.h         |  30 +++
 arch/x86/include/asm/fpu/internal.h    |   7 +-
 arch/x86/include/asm/percpu.h          | 293 +++++++++++++++++++------
 arch/x86/include/asm/preempt.h         |   3 +-
 arch/x86/include/asm/resctrl_sched.h   |  14 +-
 arch/x86/kernel/cpu/Makefile           |   1 +
 arch/x86/kernel/cpu/common.c           |   7 +-
 arch/x86/kernel/cpu/current.c          |  16 ++
 arch/x86/kernel/cpu/resctrl/rdtgroup.c |   4 +-
 arch/x86/kernel/process_32.c           |   4 +-
 arch/x86/kernel/process_64.c           |   4 +-
 include/asm-generic/percpu.h           |  12 +
 include/linux/compiler-gcc.h           |   4 +
 include/linux/compiler.h               |   2 +-
 include/linux/percpu-defs.h            |  33 ++-
 15 files changed, 346 insertions(+), 88 deletions(-)
 create mode 100644 arch/x86/kernel/cpu/current.c

-- 
2.17.1
