linux-kernel - Re: [syzbot] BUG: unable to handle kernel access to user memory in schedule

Message-ID: <CACT4Y+ay21Cw8TtUdyDAzXAJaqpDPyCKNW6XF1GKsHoNeL=qKw@mail.gmail.com>
Date:   Fri, 12 Mar 2021 18:38:57 +0100
From:   Dmitry Vyukov <dvyukov@...gle.com>
To:     Ben Dooks <ben.dooks@...ethink.co.uk>
Cc:     syzbot <syzbot+e74b94fe601ab9552d69@...kaller.appspotmail.com>,
        Paul Walmsley <paul.walmsley@...ive.com>,
        Palmer Dabbelt <palmer@...belt.com>,
        Albert Ou <aou@...s.berkeley.edu>,
        linux-riscv <linux-riscv@...ts.infradead.org>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        Benjamin Segall <bsegall@...gle.com>, dietmar.eggemann@....com,
        Juri Lelli <juri.lelli@...hat.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Mel Gorman <mgorman@...e.de>, Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        syzkaller-bugs <syzkaller-bugs@...glegroups.com>,
        Vincent Guittot <vincent.guittot@...aro.org>
Subject: Re: [syzbot] BUG: unable to handle kernel access to user memory in schedule_tail

On Fri, Mar 12, 2021 at 6:34 PM Dmitry Vyukov <dvyukov@...gle.com> wrote:
>
> On Fri, Mar 12, 2021 at 5:36 PM Ben Dooks <ben.dooks@...ethink.co.uk> wrote:
> >
> > On 12/03/2021 16:34, Ben Dooks wrote:
> > > On 12/03/2021 16:30, Ben Dooks wrote:
> > >> On 12/03/2021 15:12, Dmitry Vyukov wrote:
> > >>> On Fri, Mar 12, 2021 at 2:50 PM Ben Dooks <ben.dooks@...ethink.co.uk>
> > >>> wrote:
> > >>>>
> > >>>> On 10/03/2021 17:16, Dmitry Vyukov wrote:
> > >>>>> On Wed, Mar 10, 2021 at 5:46 PM syzbot
> > >>>>> <syzbot+e74b94fe601ab9552d69@...kaller.appspotmail.com> wrote:
> > >>>>>>
> > >>>>>> Hello,
> > >>>>>>
> > >>>>>> syzbot found the following issue on:
> > >>>>>>
> > >>>>>> HEAD commit:    0d7588ab riscv: process: Fix no prototype for
> > >>>>>> arch_dup_tas..
> > >>>>>> git tree:
> > >>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git fixes
> > >>>>>> console output:
> > >>>>>> https://syzkaller.appspot.com/x/log.txt?x=1212c6e6d00000
> > >>>>>> kernel config:
> > >>>>>> https://syzkaller.appspot.com/x/.config?x=e3c595255fb2d136
> > >>>>>> dashboard link:
> > >>>>>> https://syzkaller.appspot.com/bug?extid=e74b94fe601ab9552d69
> > >>>>>> userspace arch: riscv64
> > >>>>>>
> > >>>>>> Unfortunately, I don't have any reproducer for this issue yet.
> > >>>>>>
> > >>>>>> IMPORTANT: if you fix the issue, please add the following tag to
> > >>>>>> the commit:
> > >>>>>> Reported-by: syzbot+e74b94fe601ab9552d69@...kaller.appspotmail.com
> > >>>>>
> > >>>>> +riscv maintainers
> > >>>>>
> > >>>>> This is riscv64-specific.
> > >>>>> I've seen similar crashes in put_user in other places. It looks like
> > >>>>> put_user crashes if the user address is not mapped/protected (?).
> > >>>>
> > >>>> I've been having a look, and this seems to be down to access of the
> > >>>> tsk->set_child_tid variable. I assume the fuzzing here is to pass a
> > >>>> bad address to clone?
> > >>>>
> > >>>> From looking at the code, the put_user() code should have set the
> > >>>> relevant SR_SUM bit (the value for this, which is 1<<18, is in the
> > >>>> s2 register in the crash report) and from looking at the compiler
> > >>>> output from my gcc-10, the code looks to be doing the relevant csrs
> > >>>> and then csrc around the put_user.
> > >>>>
> > >>>> So currently I do not understand how the above could have happened
> > >>>> other than something re-tried the code sequence and ended up retrying
> > >>>> the faulting instruction without the SR_SUM bit set.
> > >>>
> > >>> I would maybe blame qemu for randomly resetting SR_SUM, but it's
> > >>> strange that 99% of these crashes are in schedule_tail. If it would be
> > >>> qemu, then they would be more evenly distributed...
> > >>>
> > >>> Another observation: looking at a dozen crash logs, in none of
> > >>> these cases was the fuzzer actually trying to fuzz clone with some
> > >>> insane arguments. So it looks like completely normal clones (e.g. coming
> > >>> from pthread_create) result in this crash.
> > >>>
> > >>> I also wonder why there is ret_from_exception, is it normal? I see
> > >>> handle_exception disables SR_SUM:
> > >>> https://elixir.bootlin.com/linux/v5.12-rc2/source/arch/riscv/kernel/entry.S#L73
> > >>>
> > >>
> > >> So I think if SR_SUM is set, then it faults the access to user memory
> > >> which the _user() routines clear to allow them access.
> > >>
> > >> I'm thinking there is at least one issue here:
> > >>
> > >> - the test in fault is the wrong way around for die kernel
> > >> - the handler only catches this if the page has yet to be mapped.
> > >>
> > >> So I think the test should be:
> > >>
> > >>          if (!user_mode(regs) && addr < TASK_SIZE &&
> > >>                          unlikely(regs->status & SR_SUM))
> > >>
> > >> This then should continue on and allow the rest of the handler to
> > >> complete mapping the page if it is not there.
> > >>
> > >> I have been trying to create a very simple clone test, but so far it
> > >> has yet to actually trigger anything.
> > >
> > > I should have added there doesn't seem to be a good way to use mmap()
> > > to allocate memory but not insert a vm-mapping post the mmap().
> > >
> > How difficult is it to try building a branch with the above test
> > modified?
>
> I don't have access to hardware, I don't have other qemu versions ready to use.
> But I can teach you how to run syzkaller locally :)
> I am not sure anybody has run it on real riscv hardware at all. When
> Tobias ported syzkaller, Tobias also used qemu I think.
>
> I am now building with an inverted check to test locally.
>
> I don't fully understand this code, but does handle_exception
> reset SR_SUM around do_page_fault? If so, then looking at SR_SUM in
> do_page_fault won't work with either a positive or a negative check.


The inverted check crashes during boot:

--- a/arch/riscv/mm/fault.c
+++ b/arch/riscv/mm/fault.c
@@ -249,7 +249,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs)
                flags |= FAULT_FLAG_USER;

        if (!user_mode(regs) && addr < TASK_SIZE &&
-                       unlikely(!(regs->status & SR_SUM)))
+                       unlikely(regs->status & SR_SUM))
                die_kernel_fault("access to user memory without uaccess routines",
                                addr, regs);

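
As a rough illustration only (this is not kernel code and not part of the
tested patch), the two polarities of the check can be modeled as plain C
predicates. The names original_check_dies, inverted_check_dies,
model_user_mode and MODEL_TASK_SIZE are hypothetical, and the 1<<38 limit
is an assumption for an Sv39 configuration; the real TASK_SIZE depends on
the kernel config. The bit positions are the sstatus SPP and SUM bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* sstatus bits used by the check. */
#define SR_SPP (UINT64_C(1) << 8)   /* previous privilege: 1 = supervisor */
#define SR_SUM (UINT64_C(1) << 18)  /* supervisor may access user memory */

/* ASSUMPTION: Sv39 user address limit, stand-in for the kernel's TASK_SIZE. */
#define MODEL_TASK_SIZE (UINT64_C(1) << 38)

/* Model of user_mode(regs): the trap came from user mode if SPP is clear. */
static bool model_user_mode(uint64_t status)
{
    return (status & SR_SPP) == 0;
}

/* Check as it stands: die when the kernel touches a user address
 * while SR_SUM is clear, i.e. outside the uaccess routines. */
static bool original_check_dies(uint64_t status, uint64_t addr)
{
    return !model_user_mode(status) && addr < MODEL_TASK_SIZE &&
           (status & SR_SUM) == 0;
}

/* Inverted check from the diff above: die when SR_SUM is set. */
static bool inverted_check_dies(uint64_t status, uint64_t addr)
{
    return !model_user_mode(status) && addr < MODEL_TASK_SIZE &&
           (status & SR_SUM) != 0;
}
```

Feeding in the values from the oops below (status 0x40120, badaddr
0xe8e39), inverted_check_dies() returns true and original_check_dies()
returns false, which matches a legitimate __clear_user() fault being
treated as fatal.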

[   77.349329][    T1] Run /sbin/init as init process
[   77.868371][    T1] Unable to handle kernel access to user memory without uaccess routines at virtual address 00000000000e8e39
[   77.870355][    T1] Oops [#1]
[   77.870766][    T1] Modules linked in:
[   77.871326][    T1] CPU: 0 PID: 1 Comm: init Not tainted 5.12.0-rc2-00010-g0d7588ab9ef9-dirty #42
[   77.872057][    T1] Hardware name: riscv-virtio,qemu (DT)
[   77.872620][    T1] epc : __clear_user+0x36/0x4e
[   77.873285][    T1]  ra : padzero+0x9c/0xb0
[   77.873849][    T1] epc : ffffffe000bb7136 ra : ffffffe0004f42a0 sp : ffffffe006f8fbc0
[   77.874438][    T1]  gp : ffffffe005d25718 tp : ffffffe006f98000 t0 : 00000000000e8e40
[   77.875031][    T1]  t1 : 00000000000e9000 t2 : 000000000001c49c s0 : ffffffe006f8fbf0
[   77.875618][    T1]  s1 : 00000000000001c7 a0 : 00000000000e8e39 a1 : 00000000000001c7
[   77.876204][    T1]  a2 : 0000000000000002 a3 : 00000000000e9000 a4 : ffffffe006f99000
[   77.876787][    T1]  a5 : 0000000000000000 a6 : 0000000000f00000 a7 : ffffffe00031c088
[   77.877367][    T1]  s2 : 00000000000e8e39 s3 : 0000000000001000 s4 : 0000003ffffffe39
[   77.877952][    T1]  s5 : 00000000000e8e39 s6 : 00000000000e9570 s7 : 00000000000e8e39
[   77.878535][    T1]  s8 : 0000000000000001 s9 : 00000000000e8e39 s10: ffffffe00c65f608
[   77.879126][    T1]  s11: ffffffe00816e8d8 t3 : ea3af0fa372b8300 t4 : 0000000000000003
[   77.879711][    T1]  t5 : ffffffc401dc45d8 t6 : 0000000000040000
[   77.880209][    T1] status: 0000000000040120 badaddr: 00000000000e8e39 cause: 000000000000000f
[   77.880846][    T1] Call Trace:
[   77.881213][    T1] [<ffffffe000bb7136>] __clear_user+0x36/0x4e
[   77.881912][    T1] [<ffffffe0004f523e>] load_elf_binary+0xf8a/0x2400
[   77.882562][    T1] [<ffffffe0003e1802>] bprm_execve+0x5b0/0x1080
[   77.883145][    T1] [<ffffffe0003e38bc>] kernel_execve+0x204/0x288
[   77.883727][    T1] [<ffffffe003b70e94>] run_init_process+0x1fe/0x212
[   77.884337][    T1] [<ffffffe003b70ec6>] try_to_run_init_process+0x1e/0x66
[   77.884956][    T1] [<ffffffe003bc0864>] kernel_init+0x14a/0x200
[   77.885541][    T1] [<ffffffe000005570>] ret_from_exception+0x0/0x14
[   77.886955][    T1] ---[ end trace 1e934d07b8a4bed8 ]---
[   77.887705][    T1] Kernel panic - not syncing: Fatal exception
[   77.888333][    T1] SMP: stopping secondary CPUs
[   77.889357][    T1] Rebooting in 86400 seconds..
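
For reading the status and cause fields in the oops above, here is a small
decoding sketch. It is an illustration under stated assumptions, not kernel
code: describe_status and is_store_page_fault are hypothetical helper names,
and only the four sstatus flags listed are decoded. With status 0x40120 it
reports SUM, SPP and SPIE, and cause 0xf is the store/AMO page fault code,
which fits a store from __clear_user() with SR_SUM enabled.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* sstatus flag bits (RISC-V privileged architecture). */
#define SR_SIE  (UINT64_C(1) << 1)   /* supervisor interrupts enabled */
#define SR_SPIE (UINT64_C(1) << 5)   /* interrupt enable before the trap */
#define SR_SPP  (UINT64_C(1) << 8)   /* trap came from supervisor mode */
#define SR_SUM  (UINT64_C(1) << 18)  /* supervisor may access user memory */

/* scause exception code for a store/AMO page fault. */
#define CAUSE_STORE_PAGE_FAULT UINT64_C(15)

/* Hypothetical helper: list the set flags of a status value in buf. */
static const char *describe_status(uint64_t status, char *buf, size_t len)
{
    snprintf(buf, len, "%s%s%s%s",
             (status & SR_SUM)  ? "SUM "  : "",
             (status & SR_SPP)  ? "SPP "  : "",
             (status & SR_SPIE) ? "SPIE " : "",
             (status & SR_SIE)  ? "SIE "  : "");
    return buf;
}

/* Hypothetical helper: true if cause is a store/AMO page fault. */
static bool is_store_page_fault(uint64_t cause)
{
    return cause == CAUSE_STORE_PAGE_FAULT;
}
```

Calling describe_status(0x40120, buf, sizeof buf) fills buf with the flag
names, and is_store_page_fault(0xf) is true for the cause value in the log.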