Message-ID: <Ztm56ZxiqlTL6ntA@J2N7QTR9R3.cambridge.arm.com>
Date: Thu, 5 Sep 2024 15:03:34 +0100
From: Mark Rutland <mark.rutland@....com>
To: Marc Zyngier <maz@...nel.org>
Cc: Will Deacon <will@...nel.org>,
	syzbot <syzbot+908886656a02769af987@...kaller.appspotmail.com>,
	catalin.marinas@....com, linux-arm-kernel@...ts.infradead.org,
	linux-kernel@...r.kernel.org, syzkaller-bugs@...glegroups.com,
	ardb@...nel.org, Nathan Chancellor <nathan@...nel.org>,
	Nick Desaulniers <ndesaulniers@...gle.com>,
	Bill Wendling <morbo@...gle.com>,
	Justin Stitt <justinstitt@...gle.com>
Subject: Re: [syzbot] [arm?] upstream test error: KASAN: invalid-access Write
 in setup_arch

[adding Ard and LLVM folk; there's a question right at the end after
some context]

On Sat, Aug 31, 2024 at 06:52:52PM +0100, Marc Zyngier wrote:
> On Fri, 30 Aug 2024 10:52:54 +0100,
> Will Deacon <will@...nel.org> wrote:
> > 
> > On Fri, Aug 30, 2024 at 01:35:24AM -0700, syzbot wrote:
> > > Hello,
> > > 
> > > syzbot found the following issue on:
> > > 
> > > HEAD commit:    33faa93bc856 Merge branch kvmarm-master/next into kvmarm-m..
> > > git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git fuzzme
> > 
> > +Marc, as this is his branch.
> >
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=1398420b980000
> > > kernel config:  https://syzkaller.appspot.com/x/.config?x=2b7b31c9aa1397ca
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=908886656a02769af987
> > > compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> > > userspace arch: arm64
> 
> As it turns out, this isn't specific to this branch. I can reproduce
> it with this config on a vanilla 6.10 as a KVM guest. Even worse,
> compiling with clang results in an unbootable kernel (without any
> output at all).
> 
> Mind you, the binary is absolutely massive (130MB with gcc, 156MB with
> clang), and I wouldn't be surprised if we were hitting some kind of
> odd limit.

Putting the KASAN issue aside (which I'll handle in a separate thread),
I think there is a real issue here with LLVM.

What's going on here is that .idmap.text ends up more than 128M away
from .head.text, so the 'b primary_entry' at the start of the Image
isn't in range:

| [mark@...rids:~/src/linux]% usekorg 14.1.0 aarch64-linux-objdump -t vmlinux | grep -w _text        
| ffff800080000000 g       .head.text     0000000000000000 _text
| [mark@...rids:~/src/linux]% usekorg 14.1.0 aarch64-linux-objdump -t vmlinux | grep -w primary_entry
| ffff8000889df0e0 g       .rodata.text   000000000000006c primary_entry

... as those are ~138MiB apart, which is beyond the +/-128MiB range of B.
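
For anyone wanting to check symbol addresses against the branch range,
here's a minimal sketch in C. The helper name b_reaches() is made up for
illustration; the only facts it relies on are that B encodes a signed
26-bit word offset and that the header branch sits at _text + 4.

```c
#include <stdbool.h>
#include <stdint.h>

/* B/BL encode a signed 26-bit word offset: a byte range of [-128MiB, +128MiB). */
#define B_RANGE (INT64_C(1) << 27)

/* Hypothetical helper: can a B at 'from' reach 'to' without a veneer/thunk? */
static bool b_reaches(uint64_t from, uint64_t to)
{
	int64_t off = (int64_t)(to - from);

	/* The target must be 4-byte aligned and within the signed range. */
	return (off & 3) == 0 && off >= -B_RANGE && off < B_RANGE;
}
```

With from = 0xffff800080000004, this should reject the primary_entry
address from the objdump output above, so the linker has to insert a
veneer/thunk.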

When building with GCC those end up ~101MiB apart:

| [mark@...rids:~/src/linux]% usekorg 14.1.0 aarch64-linux-objdump -t vmlinux | grep -w _text        
| ffff800080000000 g       .head.text     0000000000000000 _text
| [mark@...rids:~/src/linux]% usekorg 14.1.0 aarch64-linux-objdump -t vmlinux | grep -w primary_entry
| ffff8000865ae0e0 g       .rodata.text   000000000000006c primary_entry

When that happens, LLD makes the header branch to a veneer/thunk:

| ffff800080000000 <_text>:
| ffff800080000000:       fa405a4d        ccmp    x18, #0x0, #0xd, pl     // pl = nfrst
| ffff800080000004:       14003fff        b       ffff800080010000 <__AArch64AbsLongThunk_primary_entry>
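
The branch encoding in that dump can be decoded by hand to confirm where
it goes. A rough sketch, assuming a plain A64 unconditional B (the helper
name b_target() is hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: compute the target of an A64 unconditional B at 'pc'. */
static uint64_t b_target(uint64_t pc, uint32_t insn)
{
	/* B is 0b000101 in bits [31:26], followed by imm26. */
	assert((insn >> 26) == 0x05);

	/* Sign-extend imm26 and scale by the 4-byte instruction size. */
	int64_t imm26 = insn & 0x03ffffff;
	if (imm26 & (1 << 25))
		imm26 -= (1 << 26);

	return pc + (uint64_t)(imm26 * 4);
}
```

Feeding it pc = 0xffff800080000004 and insn = 0x14003fff should give the
thunk address shown by objdump, i.e. the header only branches 64KiB
forward and the long jump is left to the thunk.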

... and unfortunately, that veneer/thunk uses a literal with the
statically-linked TTBR1 address of primary_entry:

| ffff800080010000 <__AArch64AbsLongThunk_primary_entry>:
| ffff800080010000:       58000050        ldr     x16, ffff800080010008 <__AArch64AbsLongThunk_primary_entry+0x8>
| ffff800080010004:       d61f0200        br      x16
| ffff800080010008:       889df0e0        .word   0x889df0e0
| ffff80008001000c:       ffff8000        .word   0xffff8000

... so as soon as the CPU tries to branch there it'll take a synchronous
exception since either:

(a) The MMU is off, and that address is larger than the physical address size.

(b) The MMU is on, but there's no TTBR1 mapping.
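
To make case (a) concrete, here's a small sketch that joins the two
.word values from the thunk into the 64-bit literal and checks it
against a physical address width. The helper names and the pa_bits
parameter are assumptions for illustration; the actual PA size depends
on the CPU.

```c
#include <stdbool.h>
#include <stdint.h>

/* Join the two little-endian .word values that follow the thunk's BR. */
static uint64_t thunk_literal(uint32_t lo, uint32_t hi)
{
	return ((uint64_t)hi << 32) | lo;
}

/* Would 'addr' fit in a physical address space of 'pa_bits' bits? */
static bool fits_in_pa(uint64_t addr, unsigned int pa_bits)
{
	return pa_bits >= 64 || (addr >> pa_bits) == 0;
}
```

The literal is the link-time virtual address of primary_entry with the
upper bits set, so it shouldn't fit for either a 48-bit or a 52-bit
physical address size.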

We can bodge around this instance by manually open-coding a veneer with
ADRP+ADD+BR after the header, and having the header branch to that, but
AFAICT we have no guarantee that other early asm or PI C code won't hit
the same problem.

It'd be good if we could convince LLD to use ADRP+ADD, since we already
rely on the entire kernel image falling within 2GiB for data
relocations. I'm not sure if it doesn't support using ADRP+ADD in
veneers or if we're doing something that prevents it from using ADRP+ADD
in the veneer.
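
As a sketch of why ADRP+ADD would be sufficient here: ADRP encodes a
signed 21-bit count of 4KiB pages, so it can reach +/-4GiB from the
current page, which covers any two addresses in an image that fits
within 2GiB. The helper name adrp_reaches() below is made up for
illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* ADRP encodes a signed 21-bit count of 4KiB pages: [-4GiB, +4GiB). */
#define ADRP_RANGE (INT64_C(1) << 32)

/* Hypothetical helper: can an ADRP at 'from' form the page address of 'to'? */
static bool adrp_reaches(uint64_t from, uint64_t to)
{
	/* ADRP works on 4KiB page addresses, so drop the low 12 bits first. */
	uint64_t from_page = from & ~UINT64_C(0xfff);
	uint64_t to_page = to & ~UINT64_C(0xfff);
	int64_t delta = (int64_t)(to_page - from_page);

	return delta >= -ADRP_RANGE && delta < ADRP_RANGE;
}
```

Since the result is PC-relative, it is also valid wherever the image
happens to be running from, which is the property the absolute literal
lacks.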

By comparison, if I force the branch range to be longer, GCC 14.1.0 and
GNU LD 2.4.20 use ADRP+ADD for the veneer, and the resulting kernel
boots successfully.

I tested that by hacking some .rodata between .head.text and .idmap.text
with:

| char hack_force_veneer[SZ_128M] __ro_after_init;

... which forces a ~230MiB branch range using the config above:

| [mark@...rids:~/src/linux]% usekorg 14.1.0 aarch64-linux-objdump -t vmlinux | grep -w _text        
| ffff800080000000 g       .head.text     0000000000000000 _text
| [mark@...rids:~/src/linux]% usekorg 14.1.0 aarch64-linux-objdump -t vmlinux | grep -w primary_entry
| ffff80008e5be0e0 g       .rodata.text   000000000000006c primary_entry

... with the generated code being:

| ffff800080000000 <__efistub__text>:
| ffff800080000000:       fa405a4d        ccmp    x18, #0x0, #0xd, pl     // pl = nfrst
| ffff800080000004:       14004001        b       ffff800080010008 <__primary_entry_veneer>
...
| ffff800080010008 <__primary_entry_veneer>:
| ffff800080010008:       d0072d70        adrp    x16, ffff80008e5be000 <__idmap_text_start>
| ffff80008001000c:       91038210        add     x16, x16, #0xe0
| ffff800080010010:       d61f0200        br      x16
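
The GNU LD veneer above can be checked the same way by decoding the
ADRP+ADD pair. This is only a sketch: adrp_add_target() is a
hypothetical helper, and it assumes (without checking) that the ADD
uses the register written by the ADRP.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: resolve the address formed by an ADRP + ADD (immediate) pair. */
static uint64_t adrp_add_target(uint64_t adrp_pc, uint32_t adrp, uint32_t add)
{
	/* ADRP: bit 31 set, bits [28:24] == 0b10000. */
	assert((adrp & 0x9f000000) == 0x90000000);
	/* 64-bit ADD (immediate): bits [31:23] == 0b100100010. */
	assert((add & 0xff800000) == 0x91000000);

	/* 21-bit signed page offset: immhi in [23:5], immlo in [30:29]. */
	int64_t imm21 = (((adrp >> 5) & 0x7ffff) << 2) | ((adrp >> 29) & 0x3);
	if (imm21 & (1 << 20))
		imm21 -= (1 << 21);

	/* ADRP adds the page offset to the 4KiB-aligned PC. */
	uint64_t page = (adrp_pc & ~UINT64_C(0xfff)) + (uint64_t)(imm21 * 4096);

	/* imm12 in [21:10], shifted left by 12 when bit 22 is set. */
	uint64_t imm12 = (add >> 10) & 0xfff;
	if (add & (1u << 22))
		imm12 <<= 12;

	return page + imm12;
}
```

With adrp_pc = 0xffff800080010008, adrp = 0xd0072d70 and add =
0x91038210, this should resolve to the primary_entry address from the
objdump output, computed relative to the PC rather than loaded from an
absolute literal.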

LLVM folk, is there any existing option to ask LLD to use ADRP+ADD for
the veneer/thunk? ... and if not, would it be possible to add an option
for that?

I realise it shouldn't matter for most users, but it'd be nice to avoid
the boobytrap for anyone building test kernels.

Mark.
