Message-ID: <20220330123205.GL8939@worktop.programming.kicks-ass.net>
Date:   Wed, 30 Mar 2022 14:32:05 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Tony Luck <tony.luck@...el.com>
Cc:     Borislav Petkov <bp@...en8.de>,
        Josh Poimboeuf <jpoimboe@...hat.com>, x86@...nel.org,
        linux-kernel@...r.kernel.org, Zhiquan Li <zhiquan1.li@...el.com>,
        Youquan Song <youquan.song@...el.com>
Subject: Re: [PATCH] x86/uaccess: restore get_user exception type to
 EX_TYPE_UACCESS

On Mon, Mar 28, 2022 at 01:17:48PM -0700, Tony Luck wrote:
> From: Zhiquan Li <zhiquan1.li@...el.com>
> 
> The 5.17.0 kernel crashes when we inject an MCE by running
> "einj_mem_uc copyin" from ras-tools on a kernel built with
> CONFIG_CC_HAS_ASM_GOTO_OUTPUT != y:
> mce: [Hardware Error]: Machine check events logged
> mce: [Hardware Error]: CPU 120: Machine Check Exception: f Bank 1: bd80000000100134
> mce: [Hardware Error]: RIP 10: {fault_in_readable+0x9f/0xd0}
> mce: [Hardware Error]: TSC 63d3fa6181b69 ADDR f921f31400 MISC 86 PPIN 11a090eb80bf0c9c
> mce: [Hardware Error]: PROCESSOR 0:606a6 TIME 1647365323 SOCKET 1 APIC 8d microcode d0002e0
> mce: [Hardware Error]: Run the above through 'mcelog --ascii'
> mce: [Hardware Error]: Machine check: Data load in unrecoverable area of kernel
> Kernel panic - not syncing: Fatal local machine check
> 
> In commit 99641e094d6c ("x86/uaccess: Remove .fixup usage"), the
> exception type of get_user was changed from EX_TYPE_UACCESS to
> EX_TYPE_EFAULT_REG. In case of MCE/SRAR while the kernel copies data
> from user space, the MCE handler maps the exception type
> EX_TYPE_UACCESS to MCE_IN_KERNEL_RECOV. With the new type
> EX_TYPE_EFAULT_REG, it loses the opportunity to rescue the system.

This would've been ever so much more useful if it would've explained
where this magic happens.... also *urgh*.

So basically the MCE handler is doing an extable lookup on the sly to
figure out if the instruction did a user-access? Why isn't there a
comment along with the exception crap that explains this?

Is this really the only UACCESS I lost in all that rework?

Also, the MCE handler could decode the instruction and look at register
content to determine if a userspace address was involved.
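
For illustration only, the extable lookup discussed above can be modelled
in userspace roughly as below. This is a sketch, not the kernel code: every
toy_* name and all numeric values are made up for the example, and the real
handler has more cases than the single one shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy fixup types; the values are illustrative, not the kernel's. */
enum toy_ex_type {
	TOY_EX_NONE,		/* no extable entry for this address */
	TOY_EX_DEFAULT,
	TOY_EX_UACCESS,		/* instruction touches user memory */
	TOY_EX_EFAULT_REG,	/* "set reg to -EFAULT" style fixup */
};

enum toy_mce_context { TOY_IN_KERNEL, TOY_IN_KERNEL_RECOV };

struct toy_extable_entry {
	uintptr_t insn;		/* address of the instruction that may fault */
	enum toy_ex_type type;	/* how the fault is supposed to be fixed up */
};

/* Linear search for the entry that covers the faulting instruction. */
enum toy_ex_type toy_lookup_type(const struct toy_extable_entry *tbl,
				 size_t n, uintptr_t ip)
{
	assert(tbl != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (tbl[i].insn == ip)
			return tbl[i].type;
	return TOY_EX_NONE;
}

/* Classify a machine check by the fixup type of the faulting instruction. */
enum toy_mce_context toy_error_context(const struct toy_extable_entry *tbl,
				       size_t n, uintptr_t ip)
{
	switch (toy_lookup_type(tbl, n, ip)) {
	case TOY_EX_UACCESS:
		return TOY_IN_KERNEL_RECOV;	/* user access: recoverable */
	default:
		return TOY_IN_KERNEL;		/* anything else: fatal */
	}
}
```

In this model, retagging a user load from TOY_EX_UACCESS to
TOY_EX_EFAULT_REG silently moves it from the recoverable case to the
fatal one, which is the shape of the regression in the report.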

> This patch works ... but to test it I had to fake out init/Kconfig so
> that it wouldn't set CONFIG_CC_HAS_ASM_GOTO_OUTPUT=y. So it seems that
> this is only needed when building with some old compiler version.

Did you do your testing on RHEL or something daft like that?

> With Linus' announcement about C99/C11 as new basis, is this fix
> needed? I.e. is it still valid to build the upstream kernel with a
> compiler that doesn't grok CONFIG_CC_HAS_ASM_GOTO_OUTPUT?

Sadly, yes, ASM_GOTO_OUTPUT is gcc-11, while we still support gcc-5.1 or
something ancient like that.

>  arch/x86/include/asm/extable_fixup_types.h |  1 +
>  arch/x86/include/asm/uaccess.h             | 15 +++++++++------
>  arch/x86/mm/extable.c                      |  8 ++++++++
>  3 files changed, 18 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/include/asm/extable_fixup_types.h b/arch/x86/include/asm/extable_fixup_types.h
> index 503622627400..329eeebba2f6 100644
> --- a/arch/x86/include/asm/extable_fixup_types.h
> +++ b/arch/x86/include/asm/extable_fixup_types.h
> @@ -30,6 +30,7 @@
>  #define EX_FLAG_CLEAR_AX		EX_DATA_FLAG(1)
>  #define EX_FLAG_CLEAR_DX		EX_DATA_FLAG(2)
>  #define EX_FLAG_CLEAR_AX_DX		EX_DATA_FLAG(3)
> +#define EX_FLAG_SET_REG		EX_DATA_FLAG(4)

That's the last available flag.. :/

Something like the below can also work, I suppose. But please, add
coherent comments to the extable code with useful references to the MCE
code that does this abuse.


diff --git a/arch/x86/include/asm/extable_fixup_types.h b/arch/x86/include/asm/extable_fixup_types.h
index 503622627400..759283acb246 100644
--- a/arch/x86/include/asm/extable_fixup_types.h
+++ b/arch/x86/include/asm/extable_fixup_types.h
@@ -64,4 +64,7 @@
 #define	EX_TYPE_UCOPY_LEN4		(EX_TYPE_UCOPY_LEN | EX_DATA_IMM(4))
 #define	EX_TYPE_UCOPY_LEN8		(EX_TYPE_UCOPY_LEN | EX_DATA_IMM(8))
 
+#define	EX_TYPE_UA_IMM_REG		20 /* reg := (long)imm */
+#define	EX_TYPE_UFAULT_REG		(EX_TYPE_UA_IMM_REG | EX_DATA_IMM(-EFAULT))
+
 #endif
diff --git a/arch/x86/mm/extable.c b/arch/x86/mm/extable.c
index dba2197c05c3..b9bc0e7cb73e 100644
--- a/arch/x86/mm/extable.c
+++ b/arch/x86/mm/extable.c
@@ -210,6 +210,7 @@ int fixup_exception(struct pt_regs *regs, int trapnr, unsigned long error_code,
 		regs->sp += sizeof(long);
 		fallthrough;
 	case EX_TYPE_IMM_REG:
+	case EX_TYPE_UA_IMM_REG:
 		return ex_handler_imm_reg(e, regs, reg, imm);
 	case EX_TYPE_FAULT_SGX:
 		return ex_handler_sgx(e, regs, trapnr);
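
To spell out why a separate type number is enough here: the type and the
immediate share one data word, so two types can carry the same -EFAULT
immediate and share a fixup handler while still being told apart by
anything that only looks at the type. A minimal userspace sketch, assuming
the type sits in the low 8 bits and a signed 16-bit immediate in the top
16 bits; the toy_* names and TOY_EFAULT are placeholders for this example:

```c
#include <assert.h>
#include <stdint.h>

#define TOY_TYPE_MASK	0x000000FFu	/* assumed: type in bits 0..7 */
#define TOY_IMM_SHIFT	16		/* assumed: immediate in bits 16..31 */
#define TOY_EFAULT	14		/* EFAULT on Linux */

/* Pack a fixup type and a signed 16-bit immediate into one data word. */
uint32_t toy_pack(uint32_t type, int16_t imm)
{
	assert(type <= TOY_TYPE_MASK);
	return type | ((uint32_t)(uint16_t)imm << TOY_IMM_SHIFT);
}

/* Extract the type; this is all a type-based classifier needs. */
uint32_t toy_type(uint32_t data)
{
	return data & TOY_TYPE_MASK;
}

/* Extract the immediate, sign-extending from 16 bits by hand. */
int toy_imm(uint32_t data)
{
	uint32_t u = data >> TOY_IMM_SHIFT;

	return u >= 0x8000u ? (int)u - 0x10000 : (int)u;
}
```

With type numbers 17 and 20 standing in for the two IMM_REG variants,
toy_pack(17, -TOY_EFAULT) and toy_pack(20, -TOY_EFAULT) decode to the same
immediate but different types.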
