Message-ID: <20250227184502.10288-1-chang.seok.bae@intel.com>
Date: Thu, 27 Feb 2025 10:44:45 -0800
From: "Chang S. Bae" <chang.seok.bae@...el.com>
To: linux-kernel@...r.kernel.org
Cc: x86@...nel.org,
	tglx@...utronix.de,
	mingo@...hat.com,
	bp@...en8.de,
	dave.hansen@...ux.intel.com,
	chang.seok.bae@...el.com
Subject: [PATCH RFC v1 00/11] x86: Support Intel Advanced Performance Extensions

Hi all,

This patch series introduces support for Intel's Advanced Performance
Extensions (APX). The goal is to collect early feedback on the approach
taken for APX. Below is a brief overview of the feature and key
considerations.

== Introduction ==

APX introduces a new set of general-purpose registers designed to improve
performance. Currently, these registers are expected to be used primarily
by userspace applications, with no intended use in kernel mode. More
details on its use cases can be found in the published documentation [1].

== Points to Consider ==

In terms of kernel support, some aspects need attention:

* New Register State

  APX register state is managed by the XSAVE instruction set; its state
  component number is 19, following XTILEDATA (18).

* XSAVE Buffer Offset

  - In the compacted format (used for the in-kernel buffer), the APX
    state appears at a later position in the buffer.

  - In the non-compacted format (used for signal, ptrace, and KVM ABIs),
    APX is assigned a lower offset, occupying the space previously
    reserved for the deprecated MPX state.

    This should not introduce any ABI change. In the extended register
    state area, each state's size and offset are determined dynamically
    via CPUID (see the sketch after this list).

* Kernel Assumptions

  The kernel generally assumes that higher-numbered components have
  higher offsets, as discussed in the following section. Although MPX
  feature usage support was removed [2], its state components (#3 and #4)
  remain supported.
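
For illustration, here is a minimal user-space sketch (not part of the
series) that reads the new component's size and non-compacted offset
straight from CPUID leaf 0xD; the component number (19) is taken from
the description above, everything else is generic:

  /* Build with: gcc -O2 -o apx_cpuid apx_cpuid.c (x86 only). */
  #include <stdio.h>
  #include <cpuid.h>

  #define XFEATURE_APX 19  /* state component number, after XTILEDATA (18) */

  int main(void)
  {
          unsigned int size, offset, ecx, edx;

          /*
           * CPUID.(EAX=0xD, ECX=19): EAX = size in bytes, EBX = offset
           * in the non-compacted XSAVE buffer; both read as zero when
           * the component is not enumerated.
           */
          __cpuid_count(0x0d, XFEATURE_APX, size, offset, ecx, edx);

          printf("APX xstate: size=%u bytes, uabi offset=%u\n", size, offset);
          return 0;
  }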

== Areas for Adjustment ==

With that in mind, here are the key areas that need to be addressed to
handle the APX state properly. While these are believed to cover the
main issues at this stage, there may still be oversights in other areas
waiting to be discovered.

1. Feature Conflict

   While a valid CPU should not expose both MPX and APX, a broken or
   misconfigured CPU could erroneously expose support for both. This
   hard conflict should be avoided up front (a minimal sketch of such a
   check follows this list).

2. XSAVE Format Conversion

   APX introduces an offset anomaly that requires special handling when
   converting between the compacted and non-compacted layouts. The
   kernel relies on XSAVE instructions for context switching and signal
   handling, but the xstate copy functions must translate between the
   two formats while writing memory strictly sequentially, as enforced
   by struct membuf (a simplified copy sketch follows this list).

3. XSAVE Size Calculation

   The kernel calculates XSAVE buffer sizes based on the assumption that
   the highest-numbered feature appears at the end of the buffer. If APX
   is the highest bit set in the feature mask, the existing logic in
   xstate_calculate_size() miscalculates the buffer size (a sizing
   sketch follows this list).

4. Offset Sanity Check

   The kernel's boot-time sanity check in setup_xstate_cache() assumes
   offsets increase with feature numbers. The APX state will conflict
   with this assumption.
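
To make item 1 concrete, here is a minimal sketch of the kind of early
check that could enforce the exclusion (kernel context; the function
name and the choice of which feature to drop are illustrative and not
necessarily what PATCH9 ends up doing, and X86_FEATURE_APX is the bit
added later in this series):

  static void __init check_mpx_apx_conflict(void)
  {
          /*
           * A sane CPU never enumerates both. If a broken or
           * misconfigured one does, drop the deprecated MPX side so
           * the xstate layout stays unambiguous.
           */
          if (boot_cpu_has(X86_FEATURE_MPX) && boot_cpu_has(X86_FEATURE_APX)) {
                  pr_warn("x86/fpu: MPX and APX both enumerated, ignoring MPX\n");
                  setup_clear_cpu_cap(X86_FEATURE_MPX);
          }
  }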
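
For item 2, a simplified sketch of the sequential-write constraint.
struct membuf, membuf_write() and membuf_zero() are the existing kernel
helpers; for_each_xfeature_in_uabi_order() is the assumed order-table
macro sketched under Option 2 below, and xstate_uabi_offset(),
xstate_uabi_size() and xstate_addr() are placeholders for this
illustration only:

  static void copy_xstate_to_uabi_sketch(struct membuf to,
                                         struct fpstate *fps, u64 xfeatures)
  {
          unsigned int i, nr, pos = 576;  /* legacy area + header already copied */

          /*
           * The destination can only be written front to back, so walk
           * the extended components in increasing uabi-offset order,
           * not in increasing component-number order.
           */
          for_each_xfeature_in_uabi_order(i, nr) {
                  unsigned int offset = xstate_uabi_offset(nr);   /* placeholder */
                  unsigned int size   = xstate_uabi_size(nr);     /* placeholder */

                  membuf_zero(&to, offset - pos);                 /* pad the gap */

                  if (xfeatures & BIT_ULL(nr))
                          membuf_write(&to, xstate_addr(fps, nr), size);
                  else
                          membuf_zero(&to, size);                 /* unused: zeros */

                  pos = offset + size;
          }
  }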
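
And for item 3, a user-space illustration of the sizing rule (the
kernel's xstate_calculate_size() differs in detail): once offsets can be
out of order, the non-compacted size has to be the maximum of
offset + size over all enabled components, not the end of the
highest-numbered one:

  #include <stdio.h>
  #include <cpuid.h>

  int main(void)
  {
          unsigned int features, ebx, ecx, edx, nr;
          unsigned int max_end = 576;     /* legacy area (512) + XSAVE header (64) */

          /* CPUID.(EAX=0xD, ECX=0): EAX = supported user xfeatures [31:0] */
          __cpuid_count(0x0d, 0, features, ebx, ecx, edx);

          for (nr = 2; nr < 32; nr++) {   /* extended components only */
                  unsigned int size, offset;

                  if (!(features & (1u << nr)))
                          continue;

                  /* EAX = size, EBX = non-compacted offset of component nr */
                  __cpuid_count(0x0d, nr, size, offset, ecx, edx);
                  if (offset + size > max_end)
                          max_end = offset + size;
          }

          printf("non-compacted XSAVE size: %u bytes\n", max_end);
          return 0;
  }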

== Approaches ==

Two approaches are worth considering:

Option 1: Treat APX as a one-off exception

  Initially, I thought this approach would result in fewer changes to the
  xstate code. However, it makes the code less comprehensible (or more
  complicated) and introduces a risk: if another feature similar to APX
  comes along, adding yet another exception would be inefficient and
  messy.

Option 2: Handle out-of-order offsets in general

  Rather than treating APX as an exception, this approach adapts the
  kernel to accommodate out-of-order offsets. Introducing a feature order
  table and an accompanying traversal macro encapsulates the ordering
  logic cleanly and simplifies the related code. It also makes the kernel
  more resilient to future features with non-sequential offsets.
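
For illustration, here is a sketch of what Option 2 could look like
(kernel context; the identifiers and the exact table contents are
assumptions, not the series' actual code, and XFEATURE_APX is the
component defined later in this series): a table of extended user
components sorted by their non-compacted offset, plus a macro that
hides the traversal:

  /*
   * Extended user components in increasing non-compacted-offset order.
   * The ordering is an example: APX (19) reuses the offset range of the
   * deprecated MPX components (3, 4), so it sorts ahead of AVX-512,
   * PKRU and AMX despite having the highest component number.
   */
  static const u8 xfeature_uabi_order[] = {
          XFEATURE_YMM,           /*  2 */
          XFEATURE_APX,           /* 19, in the old MPX slot */
          XFEATURE_OPMASK,        /*  5 */
          XFEATURE_ZMM_Hi256,     /*  6 */
          XFEATURE_Hi16_ZMM,      /*  7 */
          XFEATURE_PKRU,          /*  9 */
          XFEATURE_XTILE_CFG,     /* 17 */
          XFEATURE_XTILE_DATA,    /* 18 */
  };

  /*
   * Walk components in buffer-offset order rather than numeric order;
   * the bounds check runs before the table access on every iteration.
   */
  #define for_each_xfeature_in_uabi_order(i, nr)                        \
          for ((i) = 0;                                                  \
               (i) < ARRAY_SIZE(xfeature_uabi_order) &&                  \
               ((nr) = xfeature_uabi_order[(i)], 1);                     \
               (i)++)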

== Series Summary ==

This patch set addresses the above issues before enabling APX support.
The chosen approach (Option 2) adjusts the kernel to handle out-of-order
offsets. Here’s a breakdown of the patches:

* PART1: XSTATE Code Adjustment

  - PATCH1: Clean up xstate enabling messages.
  - PATCH2: Introduce the feature order table.
  - PATCH3: Remove the offset sanity check, as the assumption it checks
            no longer holds.
  - PATCH4: Adjust XSAVE size calculation.
  - PATCH5: Modify the xstate copy function.

* PART2: APX Enabling

  - PATCH6:  Remove MPX support.
  - PATCH7:  Enumerate the APX CPUID feature bit.
  - PATCH8:  Update xstate definitions to include APX.
  - PATCH9:  Ensure MPX and APX are mutually exclusive.
  - PATCH10: Register APX in the supported xstate list.
  - PATCH11: Add a self-test case for APX.

== Testing ==

The first part is agnostic to the new feature, and the changes should
preserve the same functionality for the currently supported features. The
xstate tests -- covering context switching and ABI compatibility for
signal and ptrace -- were executed to ensure there is no regression.

PATCH11 applies the same test set to APX, building on the xstate
selftest rework [3]. Since no hardware implementation is available at
this time, an internal Intel emulator was primarily used to verify the
test cases.

---

The patches are based on the x86/fpu branch [4], where the selftest
rework has landed (Thanks, Ingo!). The series can be found here:
    git://github.com/intel/apx.git apx_rfc-v1

Thanks,
Chang

[1] https://www.intel.com/content/www/us/en/developer/articles/technical/advanced-performance-extensions-apx.html
[2] Commit 45fc24e89b7c ("x86/mpx: remove MPX from arch/x86")
[3] https://lore.kernel.org/lkml/20250226010731.2456-1-chang.seok.bae@intel.com/
[4] https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/log/?h=x86/fpu

Chang S. Bae (11):
  x86/fpu/xstate: Simplify print_xstate_features()
  x86/fpu/xstate: Introduce xstate order table and accessor macro
  x86/fpu/xstate: Remove xstate offset check
  x86/fpu/xstate: Adjust XSAVE buffer size calculation
  x86/fpu/xstate: Adjust xstate copying logic for user ABI
  x86/fpu/mpx: Remove MPX xstate component support
  x86/cpufeatures: Add X86_FEATURE_APX
  x86/fpu/apx: Define APX state component
  x86/fpu/apx: Disallow conflicting MPX presence
  x86/fpu/apx: Enable APX state support
  selftests/x86/apx: Add APX test

 arch/x86/include/asm/cpufeatures.h   |   1 +
 arch/x86/include/asm/fpu/types.h     |   9 ++
 arch/x86/include/asm/fpu/xstate.h    |   5 +-
 arch/x86/kernel/cpu/cpuid-deps.c     |   1 +
 arch/x86/kernel/cpu/scattered.c      |   1 +
 arch/x86/kernel/fpu/xstate.c         | 131 ++++++++++++++++-----------
 tools/testing/selftests/x86/Makefile |   3 +-
 tools/testing/selftests/x86/apx.c    |  10 ++
 tools/testing/selftests/x86/xstate.c |   3 +-
 tools/testing/selftests/x86/xstate.h |   1 +
 10 files changed, 108 insertions(+), 57 deletions(-)
 create mode 100644 tools/testing/selftests/x86/apx.c


base-commit: bd64e9d6aafd12e5437685d2a06360f86418d277
-- 
2.45.2

