linux-kernel - Re: [PATCH v4] x86/traps: Enable UBSAN traps on x86

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <202407110924.81A08DD4D@keescook>
Date: Thu, 11 Jul 2024 09:35:24 -0700
From: Kees Cook <kees@...nel.org>
To: Marco Elver <elver@...gle.com>
Cc: Peter Zijlstra <peterz@...radead.org>,
	Gatlin Newhouse <gatlin.newhouse@...il.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
	Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
	"H. Peter Anvin" <hpa@...or.com>,
	Andrey Konovalov <andreyknvl@...il.com>,
	Andrey Ryabinin <ryabinin.a.a@...il.com>,
	Nathan Chancellor <nathan@...nel.org>,
	Nick Desaulniers <ndesaulniers@...gle.com>,
	Bill Wendling <morbo@...gle.com>,
	Justin Stitt <justinstitt@...gle.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Baoquan He <bhe@...hat.com>,
	Rick Edgecombe <rick.p.edgecombe@...el.com>,
	Pengfei Xu <pengfei.xu@...el.com>,
	Josh Poimboeuf <jpoimboe@...nel.org>,
	Changbin Du <changbin.du@...wei.com>, Xin Li <xin3.li@...el.com>,
	Jason Gunthorpe <jgg@...pe.ca>, Arnd Bergmann <arnd@...db.de>,
	"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
	linux-kernel@...r.kernel.org, kasan-dev@...glegroups.com,
	linux-hardening@...r.kernel.org, llvm@...ts.linux.dev,
	t.p.northover@...il.com, Fangrui Song <maskray@...gle.com>
Subject: Re: [PATCH v4] x86/traps: Enable UBSAN traps on x86

On Thu, Jul 11, 2024 at 11:06:12AM +0200, Marco Elver wrote:
> On Thu, 11 Jul 2024 at 10:10, Peter Zijlstra <peterz@...radead.org> wrote:
> >
> > On Wed, Jul 10, 2024 at 08:32:38PM +0000, Gatlin Newhouse wrote:
> > > Currently ARM architectures extract which specific sanitizer
> > > has caused a trap via encoded data in the trap instruction.
> > > Clang on x86 currently encodes the same data in ud1 instructions
> > > but the x86 handle_bug() and is_valid_bugaddr() functions
> > > currently only look at ud2s.
> > >
> > > Bring x86 to parity with arm64, similar to commit 25b84002afb9
> > > ("arm64: Support Clang UBSAN trap codes for better reporting").
> > > Enable the reporting of UBSAN sanitizer detail on x86 architectures
> > > compiled with clang when CONFIG_UBSAN_TRAP=y.
> >
> > Can we please get some actual words on what code clang will generate for
> > this? This doesn't even refer to the clang commit.
> >
> > How am I supposed to know if the below patch matches what clang will
> > generate etc..
> 
> I got curious what the history of this is - I think it was introduced
> in https://github.com/llvm/llvm-project/commit/c5978f42ec8e9, which
> was reviewed here: https://reviews.llvm.org/D89959

Sorry, I should have suggested this commit be mentioned in the commit
log. The details are in llvm/lib/Target/X86/X86MCInstLower.cpp:
https://github.com/llvm/llvm-project/commit/c5978f42ec8e9#diff-bb68d7cd885f41cfc35843998b0f9f534adb60b415f647109e597ce448e92d9f

  case X86::UBSAN_UD1:
    EmitAndCountInstruction(MCInstBuilder(X86::UD1Lm)
                                .addReg(X86::EAX)
                                .addReg(X86::EAX)
                                .addImm(1)
                                .addReg(X86::NoRegister)
                                .addImm(MI->getOperand(0).getImm())
                                .addReg(X86::NoRegister));

Which is using the UD1Lm template from
https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/X86/X86InstrSystem.td#L27

  def UD1Lm   : I<0xB9, MRMSrcMem, (outs), (ins GR32:$src1, i32mem:$src2),
                  "ud1{l}\t{$src2, $src1|$src1, $src2}", []>, TB, OpSize32;

It uses OpSize32, distinct from UD1Wm (16) and UD1Qm (64).

> But besides that, there's very little documentation. Either Gatlin or
> one of the other LLVM folks might have more background, but we might
> be out of luck if that 1 commit is all there is.
> 
> [+Cc Tim, author of the LLVM commit]
> 
> > > diff --git a/arch/x86/include/asm/bug.h b/arch/x86/include/asm/bug.h
> > > index a3ec87d198ac..ccd573d58edb 100644
> > > --- a/arch/x86/include/asm/bug.h
> > > +++ b/arch/x86/include/asm/bug.h
> > > @@ -13,6 +13,17 @@
> > >  #define INSN_UD2     0x0b0f
> > >  #define LEN_UD2              2
> > >
> > > +/*
> > > + * In clang we have UD1s reporting UBSAN failures on X86, 64 and 32bit.
> > > + */
> > > +#define INSN_ASOP    0x67
> >
> > I asked, but did not receive answer, *WHY* does clang add this silly
> > prefix? AFAICT this is entirely spurious and things would be simpler if
> > we don't have to deal with it.

Even if we change LLVM, I'd still like to support the older versions, so
we'll need to handle this regardless.

> >
> > > +#define OPCODE_PREFIX        0x0f
> >
> > This is *NOT* a prefix, it is an escape, please see the SDM Vol 2
> > Chapter 'Instruction Format'. That ASOP thing above is a prefix.
> >
> > > +#define OPCODE_UD1   0xb9
> > > +#define OPCODE_UD2   0x0b
> >
> > These are second byte opcodes. The actual (single byte opcodes) of those
> > value exist and are something entirely different (0xB0+r is MOV, and
> > 0x0B is OR).

What would be your preferred names for all of these defines?

-- 
Kees Cook