[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANpmjNObEzShHvw19EAntPvCYJbqezKBq+pB=mkd7j3sXDEE7A@mail.gmail.com>
Date: Thu, 11 Jul 2024 11:06:12 +0200
From: Marco Elver <elver@...gle.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Gatlin Newhouse <gatlin.newhouse@...il.com>, Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
"H. Peter Anvin" <hpa@...or.com>, Kees Cook <keescook@...omium.org>,
Andrey Konovalov <andreyknvl@...il.com>, Andrey Ryabinin <ryabinin.a.a@...il.com>,
Nathan Chancellor <nathan@...nel.org>, Nick Desaulniers <ndesaulniers@...gle.com>,
Bill Wendling <morbo@...gle.com>, Justin Stitt <justinstitt@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>, Baoquan He <bhe@...hat.com>,
Rick Edgecombe <rick.p.edgecombe@...el.com>, Pengfei Xu <pengfei.xu@...el.com>,
Josh Poimboeuf <jpoimboe@...nel.org>, Changbin Du <changbin.du@...wei.com>, Xin Li <xin3.li@...el.com>,
Jason Gunthorpe <jgg@...pe.ca>, Arnd Bergmann <arnd@...db.de>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>, linux-kernel@...r.kernel.org,
kasan-dev@...glegroups.com, linux-hardening@...r.kernel.org,
llvm@...ts.linux.dev, t.p.northover@...il.com,
Fangrui Song <maskray@...gle.com>
Subject: Re: [PATCH v4] x86/traps: Enable UBSAN traps on x86
On Thu, 11 Jul 2024 at 10:10, Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Wed, Jul 10, 2024 at 08:32:38PM +0000, Gatlin Newhouse wrote:
> > Currently ARM architectures extract which specific sanitizer
> > has caused a trap via encoded data in the trap instruction.
> > Clang on x86 currently encodes the same data in ud1 instructions
> > but the x86 handle_bug() and is_valid_bugaddr() functions
> > currently only look at ud2s.
> >
> > Bring x86 to parity with arm64, similar to commit 25b84002afb9
> > ("arm64: Support Clang UBSAN trap codes for better reporting").
> > Enable the reporting of UBSAN sanitizer detail on x86 architectures
> > compiled with clang when CONFIG_UBSAN_TRAP=y.
>
> Can we please get some actual words on what code clang will generate for
> this? This doesn't even refer to the clang commit.
>
> How am I supposed to know if the below patch matches what clang will
> generate etc..
I got curious what the history of this is - I think it was introduced
in https://github.com/llvm/llvm-project/commit/c5978f42ec8e9, which
was reviewed here: https://reviews.llvm.org/D89959
But besides that, there's very little documentation. Either Gatlin or
one of the other LLVM folks might have more background, but we might
be out of luck if that 1 commit is all there is.
[+Cc Tim, author of the LLVM commit]
> > diff --git a/arch/x86/include/asm/bug.h b/arch/x86/include/asm/bug.h
> > index a3ec87d198ac..ccd573d58edb 100644
> > --- a/arch/x86/include/asm/bug.h
> > +++ b/arch/x86/include/asm/bug.h
> > @@ -13,6 +13,17 @@
> > #define INSN_UD2 0x0b0f
> > #define LEN_UD2 2
> >
> > +/*
> > + * In clang we have UD1s reporting UBSAN failures on X86, 64 and 32bit.
> > + */
> > +#define INSN_ASOP 0x67
>
> I asked, but did not receive answer, *WHY* does clang add this silly
> prefix? AFAICT this is entirely spurious and things would be simpler if
> we don't have to deal with it.
>
> > +#define OPCODE_PREFIX 0x0f
>
> This is *NOT* a prefix, it is an escape, please see the SDM Vol 2
> Chapter 'Instruction Format'. That ASOP thing above is a prefix.
>
> > +#define OPCODE_UD1 0xb9
> > +#define OPCODE_UD2 0x0b
>
> These are second byte opcodes. The actual (single byte opcodes) of those
> value exist and are something entirely different (0xB0+r is MOV, and
> 0x0B is OR).
>
> > +#define BUG_NONE 0xffff
> > +#define BUG_UD1 0xfffe
> > +#define BUG_UD2 0xfffd
>
> These are return codes and not related to the defines above and as such
> should be separated from them with some whitespace.
>
> > +
> > #ifdef CONFIG_GENERIC_BUG
> >
> > #ifdef CONFIG_X86_32
> > diff --git a/arch/x86/include/asm/ubsan.h b/arch/x86/include/asm/ubsan.h
> > new file mode 100644
> > index 000000000000..ac2080984e83
> > --- /dev/null
> > +++ b/arch/x86/include/asm/ubsan.h
> > @@ -0,0 +1,23 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +#ifndef _ASM_X86_UBSAN_H
> > +#define _ASM_X86_UBSAN_H
> > +
> > +/*
> > + * Clang Undefined Behavior Sanitizer trap mode support.
> > + */
> > +#include <linux/bug.h>
> > +#include <linux/ubsan.h>
> > +#include <asm/ptrace.h>
> > +
> > +/*
> > + * UBSAN uses the EAX register to encode its type in the ModRM byte.
>
> This is a claim, but I have nothing to verify this against. I mean, I
> could go trawl through the clang sources, but this really should be part
> of the changelog to explain the clang code generation.
>
> > + */
> > +#define UBSAN_REG 0x40
>
> This is a ModRM byte, not a REG. The REG encoded therein is 0.
>
> > +
> > +#ifdef CONFIG_UBSAN_TRAP
> > +void handle_ubsan_failure(struct pt_regs *regs, u16 insn);
> > +#else
> > +static inline void handle_ubsan_failure(struct pt_regs *regs, u16 insn) { return; }
> > +#endif /* CONFIG_UBSAN_TRAP */
> > +
> > +#endif /* _ASM_X86_UBSAN_H */
> > diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
> > index 74077694da7d..fe1d9db27500 100644
> > --- a/arch/x86/kernel/Makefile
> > +++ b/arch/x86/kernel/Makefile
> > @@ -145,6 +145,7 @@ obj-$(CONFIG_UNWINDER_GUESS) += unwind_guess.o
> > obj-$(CONFIG_AMD_MEM_ENCRYPT) += sev.o
> >
> > obj-$(CONFIG_CFI_CLANG) += cfi.o
> > +obj-$(CONFIG_UBSAN_TRAP) += ubsan.o
> >
> > obj-$(CONFIG_CALL_THUNKS) += callthunks.o
> >
> > diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> > index 4fa0b17e5043..b6664016622a 100644
> > --- a/arch/x86/kernel/traps.c
> > +++ b/arch/x86/kernel/traps.c
> > @@ -67,6 +67,7 @@
> > #include <asm/vdso.h>
> > #include <asm/tdx.h>
> > #include <asm/cfi.h>
> > +#include <asm/ubsan.h>
> >
> > #ifdef CONFIG_X86_64
> > #include <asm/x86_init.h>
> > @@ -91,6 +92,45 @@ __always_inline int is_valid_bugaddr(unsigned long addr)
> > return *(unsigned short *)addr == INSN_UD2;
> > }
> >
> > +/*
> > + * Check for UD1 or UD2, accounting for Address Size Override Prefixes.
> > + * If it's a UD1, get the ModRM byte to pass along to UBSan.
> > + */
> > +__always_inline int decode_bug(unsigned long addr, u32 *imm)
> > +{
> > + u8 v;
> > +
> > + if (addr < TASK_SIZE_MAX)
> > + return BUG_NONE;
> > +
> > + v = *(u8 *)(addr++);
> > + if (v == INSN_ASOP)
> > + v = *(u8 *)(addr++);
> > + if (v != OPCODE_PREFIX)
> > + return BUG_NONE;
> > +
> > + v = *(u8 *)(addr++);
> > + if (v == OPCODE_UD2)
> > + return BUG_UD2;
> > + if (v != OPCODE_UD1)
> > + return BUG_NONE;
> > +
> > + v = *(u8 *)(addr++);
> > + if (X86_MODRM_RM(v) == 4)
> > + addr++;
> > +
> > + *imm = 0;
> > + if (X86_MODRM_MOD(v) == 1)
> > + *imm = *(u8 *)addr;
> > + else if (X86_MODRM_MOD(v) == 2)
> > + *imm = *(u32 *)addr;
> > + else
> > + WARN_ONCE(1, "Unexpected MODRM_MOD: %u\n", X86_MODRM_MOD(v));
> > +
> > + return BUG_UD1;
> > +}
> > +
> > +
> > static nokprobe_inline int
> > do_trap_no_signal(struct task_struct *tsk, int trapnr, const char *str,
> > struct pt_regs *regs, long error_code)
> > @@ -216,6 +256,8 @@ static inline void handle_invalid_op(struct pt_regs *regs)
> > static noinstr bool handle_bug(struct pt_regs *regs)
> > {
> > bool handled = false;
> > + int ud_type;
> > + u32 imm;
> >
> > /*
> > * Normally @regs are unpoisoned by irqentry_enter(), but handle_bug()
> > @@ -223,7 +265,8 @@ static noinstr bool handle_bug(struct pt_regs *regs)
> > * irqentry_enter().
> > */
> > kmsan_unpoison_entry_regs(regs);
> > - if (!is_valid_bugaddr(regs->ip))
> > + ud_type = decode_bug(regs->ip, &imm);
> > + if (ud_type == BUG_NONE)
> > return handled;
> >
> > /*
> > @@ -236,10 +279,14 @@ static noinstr bool handle_bug(struct pt_regs *regs)
> > */
> > if (regs->flags & X86_EFLAGS_IF)
> > raw_local_irq_enable();
> > - if (report_bug(regs->ip, regs) == BUG_TRAP_TYPE_WARN ||
> > - handle_cfi_failure(regs) == BUG_TRAP_TYPE_WARN) {
> > - regs->ip += LEN_UD2;
> > - handled = true;
> > + if (ud_type == BUG_UD2) {
> > + if (report_bug(regs->ip, regs) == BUG_TRAP_TYPE_WARN ||
> > + handle_cfi_failure(regs) == BUG_TRAP_TYPE_WARN) {
> > + regs->ip += LEN_UD2;
> > + handled = true;
> > + }
> > + } else {
> > + handle_ubsan_failure(regs, imm);
> > }
> > if (regs->flags & X86_EFLAGS_IF)
> > raw_local_irq_disable();
> > diff --git a/arch/x86/kernel/ubsan.c b/arch/x86/kernel/ubsan.c
> > new file mode 100644
> > index 000000000000..c90e337a1b6a
> > --- /dev/null
> > +++ b/arch/x86/kernel/ubsan.c
> > @@ -0,0 +1,21 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Clang Undefined Behavior Sanitizer trap mode support.
> > + */
> > +#include <linux/bug.h>
> > +#include <linux/string.h>
> > +#include <linux/printk.h>
> > +#include <linux/ubsan.h>
> > +#include <asm/ptrace.h>
> > +#include <asm/ubsan.h>
> > +
> > +/*
> > + * Checks for the information embedded in the UD1 trap instruction
> > + * for the UB Sanitizer in order to pass along debugging output.
> > + */
> > +void handle_ubsan_failure(struct pt_regs *regs, u16 type)
> > +{
> > + if ((type & 0xFF) == UBSAN_REG)
> > + type >>= 8;
>
> This makes no sense, we've consumed the ModRM byte ealier, this should
> really only ever get the immediate.
>
> > + pr_crit("%s at %pS\n", report_ubsan_failure(regs, type), (void *)regs->ip);
> > +}
> > --
> > 2.25.1
> >
Powered by blists - more mailing lists