Message-ID: <d9bde3a6-1e19-1340-1fda-bc6de2eb4f7c@kernel.org>
Date: Tue, 25 Feb 2020 21:29:00 -0800
From: Andy Lutomirski <luto@...nel.org>
To: Frederic Weisbecker <frederic@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>
Cc: LKML <linux-kernel@...r.kernel.org>, x86@...nel.org,
Steven Rostedt <rostedt@...dmis.org>,
Brian Gerst <brgerst@...il.com>,
Juergen Gross <jgross@...e.com>,
Paolo Bonzini <pbonzini@...hat.com>,
Arnd Bergmann <arnd@...db.de>
Subject: Re: [patch 02/10] x86/mce: Disable tracing and kprobes on do_machine_check()
On 2/25/20 5:13 PM, Frederic Weisbecker wrote:
> On Tue, Feb 25, 2020 at 10:36:38PM +0100, Thomas Gleixner wrote:
>> From: Andy Lutomirski <luto@...nel.org>
>>
>> do_machine_check() can be raised in almost any context including the most
>> fragile ones. Prevent kprobes and tracing.
>>
>> Signed-off-by: Andy Lutomirski <luto@...nel.org>
>> Signed-off-by: Thomas Gleixner <tglx@...utronix.de>
>> ---
>> arch/x86/include/asm/traps.h | 3 ---
>> arch/x86/kernel/cpu/mce/core.c | 12 ++++++++++--
>> 2 files changed, 10 insertions(+), 5 deletions(-)
>>
>> --- a/arch/x86/include/asm/traps.h
>> +++ b/arch/x86/include/asm/traps.h
>> @@ -88,9 +88,6 @@ dotraplinkage void do_page_fault(struct
>> dotraplinkage void do_spurious_interrupt_bug(struct pt_regs *regs, long error_code);
>> dotraplinkage void do_coprocessor_error(struct pt_regs *regs, long error_code);
>> dotraplinkage void do_alignment_check(struct pt_regs *regs, long error_code);
>> -#ifdef CONFIG_X86_MCE
>> -dotraplinkage void do_machine_check(struct pt_regs *regs, long error_code);
>> -#endif
>> dotraplinkage void do_simd_coprocessor_error(struct pt_regs *regs, long error_code);
>> #ifdef CONFIG_X86_32
>> dotraplinkage void do_iret_error(struct pt_regs *regs, long error_code);
>> --- a/arch/x86/kernel/cpu/mce/core.c
>> +++ b/arch/x86/kernel/cpu/mce/core.c
>> @@ -1213,8 +1213,14 @@ static void __mc_scan_banks(struct mce *
>> * On Intel systems this is entered on all CPUs in parallel through
>> * MCE broadcast. However some CPUs might be broken beyond repair,
>> * so be always careful when synchronizing with others.
>> + *
>> + * Tracing and kprobes are disabled: if we interrupted a kernel context
>> + * with IF=1, we need to minimize stack usage. There are also recursion
>> + * issues: if the machine check was due to a failure of the memory
>> + * backing the user stack, tracing that reads the user stack will cause
>> + * potentially infinite recursion.
>> */
>> -void do_machine_check(struct pt_regs *regs, long error_code)
>> +void notrace do_machine_check(struct pt_regs *regs, long error_code)
>> {
>> DECLARE_BITMAP(valid_banks, MAX_NR_BANKS);
>> DECLARE_BITMAP(toclear, MAX_NR_BANKS);
>> @@ -1360,6 +1366,7 @@ void do_machine_check(struct pt_regs *re
>> ist_exit(regs);
>> }
>> EXPORT_SYMBOL_GPL(do_machine_check);
>> +NOKPROBE_SYMBOL(do_machine_check);
>
> That won't protect all the function called by do_machine_check(), right?
> There are lots of them.
>
It at least means we survive long enough to run actual C code in
do_machine_check(), which gives us a chance to mitigate this issue
further. PeterZ has patches for that, and maybe this series fixes it
later on. (I'm reading in order!)