[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <3181031.RqgVF4sTRC@tjmaciei-mobl5>
Date: Wed, 18 Aug 2021 10:20:58 -0700
From: Thiago Macieira <thiago.macieira@...el.com>
To: "Chang S. Bae" <chang.seok.bae@...el.com>,
Borislav Petkov <bp@...en8.de>
CC: <luto@...nel.org>, <tglx@...utronix.de>, <mingo@...nel.org>,
<x86@...nel.org>, <len.brown@...el.com>, <dave.hansen@...el.com>,
<jing2.liu@...el.com>, <ravi.v.shankar@...el.com>,
<linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v9 12/26] x86/fpu/xstate: Use feature disable (XFD) to protect dynamic user state
On Wednesday, 18 August 2021 09:24:51 PDT Borislav Petkov wrote:
> > +#define X86_FEATURE_XFD (10*32+ 4) /* eXtended
> > Feature Disabling */
>
> Add "" at the marker above - it doesn't look like we wanna show "xfd" in
> /proc/cpuinfo.
Why not?
It could help diagnosing why this code has a failure if XFD is somehow
missing. That can happen with hypervisors or future CPUs.
> > + /* Raise a signal when it failed to
> > handle. */ + if (err)
> > + force_sig_fault(SIGILL,
> > ILL_ILLOPC,
> > +
> > error_get_trap_addr(regs));>
> Where is it documented that that configuration of SIG* types means,
> failure to allocate the dynamic buffer?
This wasn't part of the memory failure, but now that you've pointed out, yes,
we are getting a SIGILL in case the kernel failed to allocate memory too.
This is the same code path we get if the task executes an AMX instruction
without first requesting support for it via the system call. At my request,
Chang changed it from SIGSEGV to SIGILL, because that's the behaviour one
would see if the kernel did not support AMX at all, hadn't enabled it in XCR0
or the CPU didn't support the instructions.
I don't know how to best handle killing the application if the kernel is OOM
(see below, though). Maybe it should SIGKILL instead. The problem with sending
a SIGSEGV is debuggability: if I get a core dump of this crash, which is
likely going to happen in a load instruction, I'll spend a lot time trying to
understand why the pointer in that instruction wasn't correct. Very few people
will ever consider it may have another reason.
> To the general picture: why is this thing even allocating a buffer in #NM?
>
> Why isn't the buffer pre-allocated for the process after latter having
> done prctl() so that when an #NM happens, no allocation happens at all?
That's a good question, but I thought it had been discussed and agreed that we
didn't want to extend the buffers at the moment the application requested the
bits, because it may never need them. This was probably a defence against
applications requesting all bits without knowing whether they'll need them at
all.
The way the API to userspace is implemented, the only way to find out if the
kernel supports a given state is to enable it. It's not currently possible to
ask "do you support AMX tile data?" and then go about the application's merry
way until it determines it really wants to do matrix multiplications. In the
case of applications with plugins, they need to have that answer before the
load the plugin, which usually happens at application start.
I was going to suggest a new API to return the supported bits, but hadn't yet
because it wasn't required for this patchset to work. So long as that API
landed at or before the time a new bit was added, userspace would be able to
cope. But if the kernel is going to allocate the bits at the moment of the
system call *and* we wish for userspace not to request more than it really
needs, then we'll need this extra API right now.
--
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel DPG Cloud Engineering
Powered by blists - more mailing lists