lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 13 Jul 2021 12:13:16 -0700
From:   Thiago Macieira <thiago.macieira@...el.com>
To:     "Chang S. Bae" <chang.seok.bae@...el.com>
CC:     <bp@...e.de>, <luto@...nel.org>, <tglx@...utronix.de>,
        <mingo@...nel.org>, <x86@...nel.org>, <len.brown@...el.com>,
        <dave.hansen@...el.com>, <jing2.liu@...el.com>,
        <ravi.v.shankar@...el.com>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v7 12/26] x86/fpu/xstate: Use feature disable (XFD) to protect dynamic user state

On Saturday, 10 July 2021 06:02:59 PDT Chang S. Bae wrote:
> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index a58800973aed..f45b2cefd6cf 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -1112,6 +1112,44 @@ DEFINE_IDTENTRY(exc_device_not_available)
[cut]
> +                               /* Raise a signal when it failed to handle.
> */ +                               if (err)
> +                                       force_sig(SIGSEGV);
> +                       }
> +                       return;

Hello Chang

Can I make a suggestion that you send a different signal than SIGSEGV for the 
failure of unauthorised instructions? I would recommend SIGILL. Additionally, 
please consider a new ILL_* constant for the si_code field.

I have multiple reasons for that:
1) the XFD failure is not a memory issue, so SIGSEGV is not really 
appropriate, despite coming from an #NM interrupt

2) SIGILL is sent for the AMX instructions in other circumstances, due to CPU 
#UD, notably:
- running on a CPU without AMX support
- running under an OS that did not enable the AMX state in XCR0 (like Linux 
  before this patch series)

When a developer is debugging code and sees a SIGILL on a valid instruction 
stream in disassembly, they know they've got to code they should never have 
got to, bypassing CPU checks. Forgetting to ask for permission is now a 
variant of  that case.

3) the very first AMX instruction to cause the #NM is likely going to be an 
LDTILECFG or TILELOADD, which are memory-related instructions, so may #GP for 
using bad pointers (and LDTILECFG can #GP for bad tile configurations). 
Knowing that the issue was the instruction itself instead of the pointer or 
data being loaded is going to come in handy.

4) SIGSEGV will also be sent for another reason by the kernel. Your cover 
message had:

> 4. Applications touching AMX without permission results in process exit.
> 
>     Armed XFD results in #NM, results in SIGSEGV, typically resulting in
>     process exit.

> 6. NM handler allocation failure results in process exit.
> 
>     If the #NM handler can not allocate the 8KB buffer, the task will
>     receive a SIGSEGV at the instruction that took the #NM fault, typically
>     resulting in process exit.

Knowing that it was caused by reaching code that shouldn't have been reached, 
instead of an OOM issue, is handy.

Do note that this SIGSEGV for allocation is unlikely to happen. If the kernel 
is under memory pressure, the OOM killer will probably kick in and may kill 
(SIGKILL) this process instead. But at least #6 is a legitimate memory issue.


On the same topic, is there a way to save this state in a core dump? The FS 
and GS bases would also be very handy.

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel DPG Cloud Engineering



Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ