linux-kernel - RE: BUG: unable to handle kernel NULL pointer dereference in rcu_core

Message-ID: <IA1PR11MB6171E006D288555B6223FBF789AC9@IA1PR11MB6171.namprd11.prod.outlook.com>
Date:   Tue, 28 Feb 2023 08:55:26 +0000
From:   "Zhuo, Qiuxu" <qiuxu.zhuo@...el.com>
To:     Joel Fernandes <joel@...lfernandes.org>,
        Zhouyi Zhou <zhouzhouyi@...il.com>
CC:     Sanan Hasanov <sanan.hasanov@...ghts.ucf.edu>,
        "paulmck@...nel.org" <paulmck@...nel.org>,
        "frederic@...nel.org" <frederic@...nel.org>,
        "quic_neeraju@...cinc.com" <quic_neeraju@...cinc.com>,
        "josh@...htriplett.org" <josh@...htriplett.org>,
        "rostedt@...dmis.org" <rostedt@...dmis.org>,
        "mathieu.desnoyers@...icios.com" <mathieu.desnoyers@...icios.com>,
        "jiangshanlai@...il.com" <jiangshanlai@...il.com>,
        "rcu@...r.kernel.org" <rcu@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "syzkaller@...glegroups.com" <syzkaller@...glegroups.com>,
        "contact@...zz.com" <contact@...zz.com>
Subject: RE: BUG: unable to handle kernel NULL pointer dereference in rcu_core

> From: Joel Fernandes <joel@...lfernandes.org>
> Sent: Monday, February 27, 2023 9:15 PM
> To: Zhouyi Zhou <zhouzhouyi@...il.com>
> Cc: Sanan Hasanov <sanan.hasanov@...ghts.ucf.edu>; paulmck@...nel.org;
> frederic@...nel.org; quic_neeraju@...cinc.com; josh@...htriplett.org;
> rostedt@...dmis.org; mathieu.desnoyers@...icios.com;
> jiangshanlai@...il.com; rcu@...r.kernel.org; linux-kernel@...r.kernel.org;
> syzkaller@...glegroups.com; contact@...zz.com
> Subject: Re: BUG: unable to handle kernel NULL pointer dereference in
> rcu_core
> 
> ...
> >> BUG: kernel NULL pointer dereference, address: 0000000000000000
> >> #PF: supervisor instruction fetch in kernel mode
> >> #PF: error_code(0x0010) - not-present page PGD 53756067 P4D 53756067
> >> PUD 0
> >> Oops: 0010 [#1] PREEMPT SMP KASAN
> >> CPU: 7 PID: 0 Comm: swapper/7 Not tainted 6.2.0-next-20230221 #1
> >> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1
> >> 04/01/2014
> >> RIP: 0010:0x0
> >> Code: Unable to access opcode bytes at 0xffffffffffffffd6.
> >> RSP: 0018:ffffc900003f8e48 EFLAGS: 00010246
> >> RAX: 0000000000000000 RBX: ffff888100833900 RCX: 00000000b9582f6c
> >> RDX: 1ffff11020106853 RSI: ffffffff816b2769 RDI: ffff888043f64708
> >> RBP: 000000000000000c R08: 0000000000000000 R09: ffffffff900b895f
> >> R10: fffffbfff201712b R11: 000000000008e001 R12: dffffc0000000000
> >> R13: ffffc900003f8ec8 R14: ffff888043f64708 R15: 000000000000000b
> >> FS:  0000000000000000(0000) GS:ffff888119f80000(0000)
> >> knlGS:0000000000000000
> >> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> CR2: ffffffffffffffd6 CR3: 0000000054e64000 CR4: 0000000000350ee0
> >> Call Trace:
> >> <IRQ>
> >> rcu_core+0x85d/0x1960
> >> __do_softirq+0x2e5/0xae2
> >> __irq_exit_rcu+0x11d/0x190
> >> irq_exit_rcu+0x9/0x20
> >> sysvec_apic_timer_interrupt+0x97/0xc0
> >> </IRQ>
> >> <TASK>
> >> asm_sysvec_apic_timer_interrupt+0x1a/0x20
> >> RIP: 0010:default_idle+0xf/0x20
> >> Code: 89 07 49 c7 c0 08 00 00 00 4d 29 c8 4c 01 c7 4c 29 c2 e9 76 ff
> >> ff ff cc cc cc cc f3 0f 1e fa eb 07 0f 00 2d e3 8a 34 00 fb f4 <fa>
> >> c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 f3 0f 1e fa 65
> >> RSP: 0018:ffffc9000017fe00 EFLAGS: 00000202
> >> RAX: 0000000000dfbea1 RBX: dffffc0000000000 RCX: ffffffff89b1da9c
> >> RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
> >> RBP: 0000000000000007 R08: 0000000000000001 R09: ffff888119fb6c23
> >> R10: ffffed10233f6d84 R11: dffffc0000000000 R12: 0000000000000003
> >> R13: ffff888100833900 R14: ffffffff8e112850 R15: 0000000000000000
> >> default_idle_call+0x67/0xa0
> >> do_idle+0x361/0x440
> >> cpu_startup_entry+0x18/0x20
> >> start_secondary+0x256/0x300
> >> secondary_startup_64_no_verify+0xce/0xdb
> >> </TASK>
> >> Modules linked in:
> >> CR2: 0000000000000000
> >> ---[ end trace 0000000000000000 ]---
> >> RIP: 0010:0x0
> >> Code: Unable to access opcode bytes at 0xffffffffffffffd6.
> 
> I have seen this exact signature when the processor tries to execute a
> function that has a NULL address. That causes IP to goto 0 and the exception.
> Sounds like something corrupted rcu_head (Just a guess).

I ran a quick test that directly invokes "call_rcu(head, NULL)". The kernel
panicked with almost the same call trace as above and with the same RIP:

       RIP: 0010:0x0
       Code: Unable to access opcode bytes at 0xffffffffffffffd6.
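
To illustrate why a NULL callback ends up with RIP at 0, here is a small userspace model, not kernel code. It assumes only that callback invocation amounts to an indirect call through the function pointer stored in the head. The names cb_head, count_cb, hits and invoke_ready_callbacks are hypothetical and chosen for this sketch. Unlike the kernel in the report, which evidently performed the call and faulted, the model counts and skips NULL entries.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's struct rcu_head (callback_head). */
struct cb_head {
    struct cb_head *next;
    void (*func)(struct cb_head *head);
};

/* Hypothetical sample callback: just counts how often it ran. */
int hits;

void count_cb(struct cb_head *head)
{
    (void)head;
    hits++;
}

/*
 * Hypothetical helper: walk a list of ready callbacks and invoke each one.
 * Entries whose func is NULL are skipped and counted; calling them would be
 * an indirect jump to address 0, which is the RIP seen in the oops.
 * Returns the number of NULL entries; *invoked receives the number of calls.
 */
size_t invoke_ready_callbacks(struct cb_head *list, size_t *invoked)
{
    size_t bad = 0;

    *invoked = 0;
    while (list) {
        struct cb_head *next = list->next; /* read before the call; a callback may free its node */

        if (list->func == NULL) {
            bad++;
        } else {
            list->func(list);
            (*invoked)++;
        }
        list = next;
    }
    return bad;
}
```

Queuing one head with count_cb and one with a NULL func, then passing the list to invoke_ready_callbacks(), reports one invocation and one bad entry.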

Invoking "call_rcu(head, NULL + 1)" gives:

       RIP: 0010:0x1
       Code: Unable to access opcode bytes at 0xffffffffffffffd7.

Invoking "call_rcu(head, NULL + 2)" gives:

       RIP: 0010:0x2
       Code: Unable to access opcode bytes at 0xffffffffffffffd8.
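
The "opcode bytes" addresses in these three logs follow the bogus RIP in a fixed way: each one is RIP minus 42 with unsigned wrap-around. A minimal sketch of that arithmetic is below. It assumes the oops code dump starts 42 bytes before RIP, which to my knowledge is PROLOGUE_SIZE in arch/x86/kernel/dumpstack.c and which matches the numbers above. The function name opcode_dump_start is hypothetical.

```c
#include <stdint.h>

/*
 * Assumption: the oops "Code:" dump starts this many bytes before RIP
 * (believed to be PROLOGUE_SIZE in arch/x86/kernel/dumpstack.c).
 */
#define PROLOGUE_SIZE 42

/* First address the code dump would try to read for a given RIP. */
uint64_t opcode_dump_start(uint64_t rip)
{
    /* Unsigned subtraction wraps modulo 2^64, so RIP 0 gives 0xffffffffffffffd6. */
    return rip - PROLOGUE_SIZE;
}
```

With RIP values 0, 1 and 2 this yields 0xffffffffffffffd6, 0xffffffffffffffd7 and 0xffffffffffffffd8, the three addresses reported above.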

These logs suggest that your guess (a corrupted rcu_head) is reasonable. 😊

-Qiuxu