linux-kernel - Re: [PATCH v7 12/26] x86/fpu/xstate: Use feature disable (XFD) to protect dynamic user state

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <1817232.MPthNTNLIG@tjmaciei-mobl5>
Date:   Tue, 13 Jul 2021 12:13:16 -0700
From:   Thiago Macieira <thiago.macieira@...el.com>
To:     "Chang S. Bae" <chang.seok.bae@...el.com>
CC:     <bp@...e.de>, <luto@...nel.org>, <tglx@...utronix.de>,
        <mingo@...nel.org>, <x86@...nel.org>, <len.brown@...el.com>,
        <dave.hansen@...el.com>, <jing2.liu@...el.com>,
        <ravi.v.shankar@...el.com>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v7 12/26] x86/fpu/xstate: Use feature disable (XFD) to protect dynamic user state

On Saturday, 10 July 2021 06:02:59 PDT Chang S. Bae wrote:
> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index a58800973aed..f45b2cefd6cf 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -1112,6 +1112,44 @@ DEFINE_IDTENTRY(exc_device_not_available)
[cut]
> +                               /* Raise a signal when it failed to handle.
> */ +                               if (err)
> +                                       force_sig(SIGSEGV);
> +                       }
> +                       return;

Hello Chang

Can I make a suggestion that you send a different signal than SIGSEGV for the 
failure of unauthorised instructions? I would recommend SIGILL. Additionally, 
please consider a new ILL_* constant for the si_code field.

I have multiple reasons for that:
1) the XFD failure is not a memory issue, so SIGSEGV is not really 
appropriate, despite coming from an #NM interrupt

2) SIGILL is sent for the AMX instructions in other circumstances, due to CPU 
#UD, notably:
- running on a CPU without AMX support
- running under an OS that did not enable the AMX state in XCR0 (like Linux 
  before this patch series)

When a developer is debugging code and sees a SIGILL on a valid instruction 
stream in disassembly, they know they've got to code they should never have 
got to, bypassing CPU checks. Forgetting to ask for permission is now a 
variant of  that case.

3) the very first AMX instruction to cause the #NM is likely going to be an 
LDTILECFG or TILELOADD, which are memory-related instructions, so may #GP for 
using bad pointers (and LDTILECFG can #GP for bad tile configurations). 
Knowing that the issue was the instruction itself instead of the pointer or 
data being loaded is going to come in handy.

4) SIGSEGV will also be sent for another reason by the kernel. Your cover 
message had:

> 4. Applications touching AMX without permission results in process exit.
> 
>     Armed XFD results in #NM, results in SIGSEGV, typically resulting in
>     process exit.

> 6. NM handler allocation failure results in process exit.
> 
>     If the #NM handler can not allocate the 8KB buffer, the task will
>     receive a SIGSEGV at the instruction that took the #NM fault, typically
>     resulting in process exit.

Knowing that it was caused by reaching code that shouldn't have been reached, 
instead of an OOM issue, is handy.

Do note that this SIGSEGV for allocation is unlikely to happen. If the kernel 
is under memory pressure, the OOM killer will probably kick in and may kill 
(SIGKILL) this process instead. But at least #6 is a legitimate memory issue.

On the same topic, is there a way to save this state in a core dump? The FS 
and GS bases would also be very handy.

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel DPG Cloud Engineering