Message-ID: <a41d7012-2eea-436e-9f7e-6a7702f7e2c2@intel.com>
Date: Thu, 2 May 2024 13:59:30 -0700
From: Dave Hansen <dave.hansen@...el.com>
To: "Chang S. Bae" <chang.seok.bae@...el.com>, linux-kernel@...r.kernel.org
Cc: x86@...nel.org, platform-driver-x86@...r.kernel.org, tglx@...utronix.de,
 mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com,
 hdegoede@...hat.com, ilpo.jarvinen@...ux.intel.com, tony.luck@...el.com,
 ashok.raj@...el.com, jithu.joseph@...el.com
Subject: Re: [PATCH 0/2] x86/fpu: Extend kernel_fpu_begin_mask() for the
 In-Field Scan driver

On 4/30/24 14:25, Chang S. Bae wrote:
> The recent update [1] in the SDM highlights the requirement of
> initializing the AMX state for executing the scan test:
>     "... maintaining AMX state in a non-initialized state ... will
>      prevent the execution of In-Field Scan tests."
> which is one of CPU state conditions required for the test's execution.

This ended up just being phrased weirdly.  It's a lot more compact to
just say:

	AMX must be in its init state for In-Field Scan tests to run.

> In situations where AMX workloads are running, the switched-away active
> user AMX state remains due to the optimization to reduce the state
> switching cost. A user state reload is fully completed right before
> returning to userspace. Consequently, if the switched-in kernel task is
> executing the scan test, this non-initialized AMX state causes the test
> to be unable to start.

FPU state in general (and AMX state in particular) is large and
expensive to context switch, so the kernel tries to leave FPU state alone
even while running kernel tasks.  But this behavior obviously
conflicts with the (new) IFS requirement that AMX be in its init state.

Right?
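
To make the conflict concrete, here is a toy userspace model of it.  This
is only a sketch: struct toy_cpu and every toy_*() function are made-up
names for illustration, not kernel APIs, and the model assumes the only
user task on the CPU is one that has dirtied AMX state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the kernel. */
struct toy_cpu {
	bool amx_regs_init;       /* AMX registers currently hold init state */
	bool user_reload_pending; /* user state must be reloaded before userspace runs */
};

/* A user task executes AMX instructions, so the registers leave init state. */
void toy_run_amx_user_task(struct toy_cpu *cpu)
{
	assert(cpu != NULL);
	cpu->amx_regs_init = false;
	cpu->user_reload_pending = false;
}

/* Switch to a kernel task: the register contents are left as they are. */
void toy_switch_to_kernel_task(struct toy_cpu *cpu)
{
	assert(cpu != NULL);
	cpu->user_reload_pending = true;
}

/* The scan test can only start when AMX is in its init state. */
bool toy_ifs_can_start(const struct toy_cpu *cpu)
{
	assert(cpu != NULL);
	return cpu->amx_regs_init;
}

/* Hypothetical explicit step the kernel task takes before running the test. */
void toy_init_amx_for_kernel_use(struct toy_cpu *cpu)
{
	assert(cpu != NULL);
	cpu->amx_regs_init = true;
}

/* Return to userspace: the deferred reload puts the user's AMX state back. */
void toy_return_to_user(struct toy_cpu *cpu)
{
	assert(cpu != NULL);
	if (cpu->user_reload_pending) {
		cpu->amx_regs_init = false; /* assumed: user state was non-init */
		cpu->user_reload_pending = false;
	}
}
```

In this model, toy_ifs_can_start() fails right after
toy_switch_to_kernel_task() and only succeeds once the kernel task has
explicitly put AMX back into its init state.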

..
> [1] Intel Software Development Manual as of March 2024, Section 18.2
>     RECOMMENDATIONS FOR SYSTEM SOFTWARE of Vol. 1.
>     https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html

Rather than, "we're just following the spec", I think there can be a
better explanation here.

These in-field scan tests (IFS) poke the hardware in unique ways. It
ends up that IFS and AMX could attempt to use the same hardware
resources at the same time and step on each other.  While it would be
possible to add additional resources to the CPU to allow simultaneous
AMX and IFS, the hardware to do this would be relatively expensive.  It
seems pretty reasonable for software to help out here.

The other argument that could be made is that an admin could isolate the
CPUs on which they wanted to run an IFS test.  They could use cpusets,
or even the task binding API to try and evict AMX workloads from these
CPUs.  But, the promise of IFS is that it can be run without disturbing
workloads _too_ much.  Basically anything an admin would do is probably
too onerous and high-impact.
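
For a sense of what the admin route involves, here is a rough sketch of
rebinding a single task away from one CPU with sched_setaffinity(2).  The
helper names build_mask_excluding() and evict_task_from_cpu() are
hypothetical, and the sketch assumes CPUs are numbered 0..ncpus-1 with no
holes.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <sys/types.h>

/* Fill 'set' with CPUs 0..ncpus-1 except 'excluded_cpu'; returns the count. */
int build_mask_excluding(cpu_set_t *set, int ncpus, int excluded_cpu)
{
	assert(set != NULL);
	assert(ncpus > 0 && ncpus <= CPU_SETSIZE);

	CPU_ZERO(set);
	for (int cpu = 0; cpu < ncpus; cpu++) {
		if (cpu != excluded_cpu)
			CPU_SET(cpu, set);
	}
	return CPU_COUNT(set);
}

/* Rebind one task (pid 0 means the caller) away from 'excluded_cpu'. */
int evict_task_from_cpu(pid_t pid, int ncpus, int excluded_cpu)
{
	cpu_set_t set;

	if (build_mask_excluding(&set, ncpus, excluded_cpu) == 0)
		return -1;	/* nothing left to run on */
	return sched_setaffinity(pid, sizeof(set), &set);
}
```

An admin would have to do something like this for every task that might
touch AMX, and then undo it after the test.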

So this mechanism provides two things: one, it makes the hardware
simpler, and two, it takes the admin out of the picture.  Things just work.


