linux-ext4 - Re: [syzbot] [ext4?] general protection fault in hrtimer

Message-ID: <CANp29Y5BnnYBauXyHmUKrgrn5LZpz8nDuZFTwLLB7WHq4DS6Wg@mail.gmail.com>
Date: Thu, 9 Nov 2023 21:00:18 -0800
From: Aleksandr Nogikh <nogikh@...gle.com>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: syzbot <syzbot+b408cd9b40ec25380ee1@...kaller.appspotmail.com>, 
	adilger.kernel@...ger.ca, linux-ext4@...r.kernel.org, 
	linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org, 
	syzkaller-bugs@...glegroups.com, tytso@....edu
Subject: Re: [syzbot] [ext4?] general protection fault in hrtimer_nanosleep

On Thu, Nov 2, 2023 at 8:57 AM Thomas Gleixner <tglx@...utronix.de> wrote:
>
> On Thu, Nov 02 2023 at 13:08, Aleksandr Nogikh wrote:
> > On Wed, Nov 1, 2023 at 1:58 PM Thomas Gleixner <tglx@...utronix.de> wrote:
> >> Unfortunately repro.syz does not hold up to its name and refuses to
> >> reproduce.
> >
> > For me, on a locally built kernel (gcc 13.2.0) it didn't work either.
> >
> > But, interestingly, it does reproduce using the syzbot-built kernel
> > shared via the "Downloadable assets" [1] in the original report. The
> > repro crashed the kernel in ~1 minute.
> >
> > [1] https://github.com/google/syzkaller/blob/master/docs/syzbot_assets.md
> >
> > [  125.919060][    C0] BUG: KASAN: stack-out-of-bounds in rb_next+0x10a/0x130
> > [  125.921169][    C0] Read of size 8 at addr ffffc900048e7c60 by task kworker/0:1/9
> > [  125.923235][    C0]
> > [  125.923243][    C0] CPU: 0 PID: 9 Comm: kworker/0:1 Not tainted 6.6.0-rc7-syzkaller-00142-g888cf78c29e2 #0
> > [  125.924546][    C0] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> > [  125.926915][    C0] Workqueue: events nsim_dev_trap_report_work
> > [  125.929333][    C0]
> > [  125.929341][    C0] Call Trace:
> > [  125.929350][    C0]  <IRQ>
> > [  125.929356][    C0]  dump_stack_lvl+0xd9/0x1b0
> > [  125.931302][    C0]  print_report+0xc4/0x620
> > [  125.932115][    C0]  ? __virt_addr_valid+0x5e/0x2d0
> > [  125.933194][    C0]  kasan_report+0xda/0x110
> > [  125.934814][    C0]  ? rb_next+0x10a/0x130
> > [  125.936521][    C0]  ? rb_next+0x10a/0x130
> > [  125.936544][    C0]  rb_next+0x10a/0x130
> > [  125.936565][    C0]  timerqueue_del+0xd4/0x140
> > [  125.936590][    C0]  __remove_hrtimer+0x99/0x290
> > [  125.936613][    C0]  __hrtimer_run_queues+0x55b/0xc10
> > [  125.936638][    C0]  ? enqueue_hrtimer+0x310/0x310
> > [  125.936659][    C0]  ? ktime_get_update_offsets_now+0x3bc/0x610
> > [  125.936688][    C0]  hrtimer_interrupt+0x31b/0x800
> > [  125.936715][    C0]  __sysvec_apic_timer_interrupt+0x105/0x3f0
> > [  125.936737][    C0]  sysvec_apic_timer_interrupt+0x8e/0xc0
> > [  125.936755][    C0]  </IRQ>
> > [  125.936759][    C0]  <TASK>
>
> Which is a completely different failure mode.
>
> It explodes in the hrtimer interrupt when dequeuing an hrtimer for
> expiry. That means the corresponding embedded rb_node is corrupted,
> which points to random data corruption.
>
> As you can reproduce (it still fails here with the provided assets),
> does the failure change when you run it several times?

Hmm, it's weird. Maybe I was very lucky that time.

The reproducer does work on the attached disk image, but definitely
not very often. I've just run it 10 times or so and twice got
interleaved BUG/KFENCE bug reports like this:
https://pastebin.com/W0TkRsnw

These seem to be related to ext4 rather than hrtimers though.

-- 
Aleksandr

>
> Thanks,
>
>         tglx