lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZCEy5jA2nT/vaO56@xpf.sh.intel.com>
Date:   Mon, 27 Mar 2023 14:08:38 +0800
From:   Pengfei Xu <pengfei.xu@...el.com>
To:     <axboe@...nel.dk>
CC:     <stable@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
        <heng.su@...el.com>, <lkp@...el.com>
Subject: [Syzkaller & bisect] There is "io_poll_remove_entries" NULL pointer
 BUG in v6.3-rc4 kernel

Hi Jens Axboe and kernel experts,

Platform: x86 platforms
There is "io_poll_remove_entries" NULL pointer BUG in v6.3-rc4 kernel.

All detailed log: https://github.com/xupengfe/syzkaller_logs/tree/main/230327_041425_io_poll_remove_entries
Syzkaller reproduced code: https://github.com/xupengfe/syzkaller_logs/blob/main/230327_041425_io_poll_remove_entries/repro.c
Syzkaller analysis report0: https://github.com/xupengfe/syzkaller_logs/blob/main/230327_041425_io_poll_remove_entries/report0
Syzkaller analysis status: https://github.com/xupengfe/syzkaller_logs/blob/main/230327_041425_io_poll_remove_entries/repro.stats
v6.3-rc4 issue dmesg: https://github.com/xupengfe/syzkaller_logs/blob/main/230327_041425_io_poll_remove_entries/v63rc4_reproduce_dmesg.log
Kconfig: https://github.com/xupengfe/syzkaller_logs/blob/main/230327_041425_io_poll_remove_entries/kconfig_origin
Bisect info: https://github.com/xupengfe/syzkaller_logs/blob/main/230327_041425_io_poll_remove_entries/bisect_info.log

It could be reproduced in v6.3-rc3 and v6.3-rc4 kernel, and bisected between
v6.3-rc3 and v5.11 kernel, bad commit was:
"
c16bda37594f83147b167d381d54c010024efecf
io_uring/poll: allow some retries for poll triggering spuriously
"
After reverted above commit on top of v6.3-rc3 kernel, this issue was gone.

"
[   67.196537] BUG: kernel NULL pointer dereference, address: 0000000000000008
[   67.197176] #PF: supervisor read access in kernel mode
[   67.197638] #PF: error_code(0x0000) - not-present page
[   67.198100] PGD 1204e067 P4D 1204e067 PUD 12211067 PMD 0 
[   67.198588] Oops: 0000 [#1] PREEMPT SMP NOPTI
[   67.198987] CPU: 1 PID: 393 Comm: repro Not tainted 6.3.0-rc4-kvmmainline #2
[   67.199615] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
[   67.200324] RIP: 0010:io_poll_remove_entries.part.16+0x14a/0x310
[   67.200871] Code: 06 00 00 00 89 de e8 75 2f 59 ff 80 fb 06 0f 84 c5 01 00 00 e8 07 2e 59 ff 49 8b 84 24 a8 00 00 00 48 8b 58 40 e8 f6 2d 59 ff <4c> 8b 63 08 4d 85 e4 74 38 e8 e8 2d 59 ff 4c 89 e7 e8 a0 1f 27 01
[   67.202490] RSP: 0018:ffffc900010b3b78 EFLAGS: 00010246
[   67.202963] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff81d4356b
[   67.203597] RDX: 0000000000000000 RSI: ffff888004ecc680 RDI: 0000000000000002
[   67.204230] RBP: ffffc900010b3b90 R08: 000000000000001c R09: 0000000000000001
[   67.204867] R10: ffff888004ecd418 R11: 0000000000000000 R12: ffff8880044d6e00
[   67.205502] R13: 0000000000000000 R14: 0000000000000059 R15: 0000000001000000
[   67.206137] FS:  00007fd010a36740(0000) GS:ffff88807ed00000(0000) knlGS:0000000000000000
[   67.206850] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   67.207383] CR2: 0000000000000008 CR3: 0000000005efe004 CR4: 0000000000770ee0
[   67.208037] PKRU: 55555554
[   67.208298] Call Trace:
[   67.208535]  <TASK>
[   67.208746]  __io_arm_poll_handler+0x30c/0x520
[   67.209170]  io_arm_poll_handler+0x2cf/0x530
[   67.209577]  ? __pfx_io_async_queue_proc+0x10/0x10
[   67.210030]  io_queue_async+0x88/0x2f0
[   67.210394]  io_req_task_submit+0x122/0x140
[   67.210794]  io_poll_task_func+0x4c6/0x700
[   67.211182]  tctx_task_work+0x1d3/0x650
[   67.211554]  task_work_run+0xb6/0x120
[   67.211910]  get_signal+0xd6b/0x14a0
[   67.212259]  ? __tty_hangup.part.29+0x32a/0x3a0
[   67.212709]  arch_do_signal_or_restart+0x33/0x280
[   67.213159]  ? exit_to_user_mode_prepare+0x100/0x210
[   67.213631]  ? __this_cpu_preempt_check+0x20/0x30
[   67.214083]  exit_to_user_mode_prepare+0x13b/0x210
[   67.214540]  syscall_exit_to_user_mode+0x2d/0x60
[   67.214981]  do_syscall_64+0x4a/0x90
[   67.215335]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
[   67.215818] RIP: 0033:0x7fd010b5b59d
[   67.216163] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d d3 08 0d 00 f7 d8 64 89 01 48
[   67.217852] RSP: 002b:00007ffc0d005148 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
[   67.218553] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007fd010b5b59d
[   67.219214] RDX: 0000000000000000 RSI: 0000000000005437 RDI: 0000000000000003
[   67.219884] RBP: 00007ffc0d005160 R08: 0000000000000000 R09: 00007ffc0d005240
[   67.220551] R10: 0000000000000000 R11: 0000000000000206 R12: 00000000004010b0
[   67.221219] R13: 00007ffc0d005240 R14: 0000000000000000 R15: 0000000000000000
[   67.221891]  </TASK>
[   67.222112] Modules linked in:
[   67.222414] CR2: 0000000000000008
[   67.222738] ---[ end trace 0000000000000000 ]---
[   67.261809] RIP: 0010:io_poll_remove_entries.part.16+0x14a/0x310
[   67.262406] Code: 06 00 00 00 89 de e8 75 2f 59 ff 80 fb 06 0f 84 c5 01 00 00 e8 07 2e 59 ff 49 8b 84 24 a8 00 00 00 48 8b 58 40 e8 f6 2d 59 ff <4c> 8b 63 08 4d 85 e4 74 38 e8 e8 2d 59 ff 4c 89 e7 e8 a0 1f 27 01
[   67.264185] RSP: 0018:ffffc900010b3b78 EFLAGS: 00010246
[   67.264704] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff81d4356b
[   67.265399] RDX: 0000000000000000 RSI: ffff888004ecc680 RDI: 0000000000000002
[   67.266094] RBP: ffffc900010b3b90 R08: 000000000000001c R09: 0000000000000001
[   67.266790] R10: ffff888004ecd418 R11: 0000000000000000 R12: ffff8880044d6e00
[   67.267484] R13: 0000000000000000 R14: 0000000000000059 R15: 0000000001000000
[   67.268179] FS:  00007fd010a36740(0000) GS:ffff88807ed00000(0000) knlGS:0000000000000000
[   67.268964] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   67.269530] CR2: 0000000000000008 CR3: 0000000005efe004 CR4: 0000000000770ee0
[   67.270228] PKRU: 55555554
[   67.270506] note: repro[393] exited with irqs disabled
[   67.271058] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580
[   67.271874] in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 393, name: repro
[   67.272623] preempt_count: 0, expected: 0
[   67.273025] RCU nest depth: 1, expected: 0
[   67.273431] INFO: lockdep is turned off.
[   67.273858] CPU: 1 PID: 393 Comm: repro Tainted: G      D            6.3.0-rc4-kvmmainline #2
[   67.274688] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
[   67.275468] Call Trace:
[   67.275721]  <TASK>
[   67.275950]  dump_stack_lvl+0xe0/0x110
[   67.276344]  dump_stack+0x19/0x20
[   67.276702]  __might_resched+0x1c2/0x2e0
[   67.277111]  __might_sleep+0x46/0x70
[   67.277486]  __mutex_lock+0x54/0xcb0
[   67.277862]  ? debug_smp_processor_id+0x20/0x30
[   67.278328]  ? rcu_is_watching+0x15/0x50
[   67.278737]  ? io_uring_del_tctx_node+0x84/0x110
[   67.279210]  ? lock_release+0x1f4/0x290
[   67.279611]  mutex_lock_nested+0x1f/0x30
[   67.280016]  ? mutex_lock_nested+0x1f/0x30
[   67.280435]  io_uring_del_tctx_node+0x84/0x110
[   67.280906]  io_uring_clean_tctx+0x74/0xf0
[   67.281328]  io_uring_cancel_generic+0x452/0x4e0
[   67.281814]  ? __pfx_autoremove_wake_function+0x10/0x10
[   67.282356]  __io_uring_cancel+0x1f/0x30
[   67.282774]  do_exit+0x227/0x12b0
[   67.283129]  ? write_comp_data+0x2f/0x90
[   67.283549]  make_task_dead+0x100/0x290
[   67.283951]  rewind_stack_and_make_dead+0x17/0x20
[   67.284446] RIP: 0033:0x7fd010b5b59d
[   67.284823] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d d3 08 0d 00 f7 d8 64 89 01 48
[   67.286656] RSP: 002b:00007ffc0d005148 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
[   67.287419] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007fd010b5b59d
[   67.288140] RDX: 0000000000000000 RSI: 0000000000005437 RDI: 0000000000000003
[   67.288862] RBP: 00007ffc0d005160 R08: 0000000000000000 R09: 00007ffc0d005240
[   67.289582] R10: 0000000000000000 R11: 0000000000000206 R12: 00000000004010b0
[   67.290298] R13: 00007ffc0d005240 R14: 0000000000000000 R15: 0000000000000000
[   67.291020]  </TASK>
[   67.364809] ------------[ cut here ]------------
[   67.365494] Voluntary context switch within RCU read-side critical section!
[   67.365535] WARNING: CPU: 1 PID: 393 at kernel/rcu/tree_plugin.h:318 rcu_note_context_switch+0x4f0/0x5b0
[   67.367522] Modules linked in:
[   67.367898] CPU: 1 PID: 393 Comm: repro Tainted: G      D W          6.3.0-rc4-kvmmainline #2
[   67.368885] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
[   67.369741] RIP: 0010:rcu_note_context_switch+0x4f0/0x5b0
[   67.370360] Code: 9c d7 01 85 c0 75 02 0f 0b 49 8b 4c 24 58 48 8b 53 20 e9 6f fd ff ff 48 c7 c7 a0 f8 99 83 c6 05 a4 6c dd 02 01 e8 a0 bf ef ff <0f> 0b e9 a4 fb ff ff a9 ff ff ff 7f 0f 84 0c fc ff ff 65 48 8b 3c
[   67.372260] RSP: 0018:ffffc900010b3bf8 EFLAGS: 00010082
[   67.372800] RAX: 0000000000000000 RBX: ffff88807ed36180 RCX: ffffffff8111743b
[   67.373524] RDX: 0000000000000000 RSI: ffff888004ecc680 RDI: 0000000000000002
[   67.374249] RBP: ffffc900010b3c20 R08: 0000000000000000 R09: 0000000000000000
[   67.374972] R10: 000000002d2d2d2d R11: 000000002d2d2d2d R12: ffff88807ed352c0
[   67.375698] R13: ffff888004ecc680 R14: 0000000000000000 R15: ffff888004ecc680
[   67.376425] FS:  00007fd010a36740(0000) GS:ffff88807ed00000(0000) knlGS:0000000000000000
[   67.377244] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   67.377836] CR2: 0000000000000008 CR3: 0000000005efe004 CR4: 0000000000770ee0
[   67.378565] PKRU: 55555554
[   67.378854] Call Trace:
[   67.379117]  <TASK>
[   67.379350]  __schedule+0xe1/0xc30
[   67.379725]  schedule+0x5b/0xe0
[   67.380072]  schedule_preempt_disabled+0x1c/0x30
[   67.380563]  __mutex_lock+0xad4/0xcb0
[   67.380959]  ? debug_smp_processor_id+0x20/0x30
[   67.381444]  ? rcu_is_watching+0x15/0x50
[   67.381869]  ? io_uring_del_tctx_node+0x84/0x110
[   67.382362]  ? lock_release+0x1f4/0x290
[   67.382780]  mutex_lock_nested+0x1f/0x30
[   67.383201]  ? mutex_lock_nested+0x1f/0x30
[   67.383638]  io_uring_del_tctx_node+0x84/0x110
[   67.384114]  io_uring_clean_tctx+0x74/0xf0
[   67.384551]  io_uring_cancel_generic+0x452/0x4e0
[   67.385057]  ? __pfx_autoremove_wake_function+0x10/0x10
[   67.385612]  __io_uring_cancel+0x1f/0x30
[   67.386039]  do_exit+0x227/0x12b0
[   67.386402]  ? write_comp_data+0x2f/0x90
[   67.386830]  make_task_dead+0x100/0x290
[   67.387243]  rewind_stack_and_make_dead+0x17/0x20
[   67.387750] RIP: 0033:0x7fd010b5b59d
[   67.388137] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d d3 08 0d 00 f7 d8 64 89 01 48
[   67.390025] RSP: 002b:00007ffc0d005148 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
[   67.390805] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007fd010b5b59d
[   67.391541] RDX: 0000000000000000 RSI: 0000000000005437 RDI: 0000000000000003
[   67.392279] RBP: 00007ffc0d005160 R08: 0000000000000000 R09: 00007ffc0d005240
[   67.393021] R10: 0000000000000000 R11: 0000000000000206 R12: 00000000004010b0
[   67.393758] R13: 00007ffc0d005240 R14: 0000000000000000 R15: 0000000000000000
[   67.394499]  </TASK>
[   67.394741] irq event stamp: 4450
[   67.395097] hardirqs last  enabled at (4449): [<ffffffff82fb58db>] _raw_spin_unlock_irq+0x2b/0x60
[   67.396014] hardirqs last disabled at (4450): [<ffffffff82f9476e>] exc_page_fault+0x4e/0x3b0
[   67.396897] softirqs last  enabled at (1580): [<ffffffff8107aa91>] fpu_flush_thread+0xd1/0x130
[   67.397787] softirqs last disabled at (1578): [<ffffffff8107a9c9>] fpu_flush_thread+0x9/0x130
[   67.398671] ---[ end trace 0000000000000000 ]---
"

I hope it's helpful.

---

If you don't need the following environment to reproduce the problem or if you
already have one, please ignore the following information.

How to reproduce:
git clone https://gitlab.com/xupengfe/repro_vm_env.git
cd repro_vm_env
tar -xvf repro_vm_env.tar.gz
cd repro_vm_env; ./start3.sh  // it needs qemu-system-x86_64 and I used v7.1.0
   // start3.sh will load bzImage_2241ab53cbb5cdb08a6b2d4688feb13971058f65 v6.2-rc5 kernel
   // You could change the bzImage_xxx as you want
You could use below command to log in, there is no password for root.
ssh -p 10023 root@...alhost

After login vm(virtual machine) successfully, you could transfer reproduced
binary to the vm by below way, and reproduce the problem in vm:
gcc -pthread -o repro repro.c
scp -P 10023 repro root@...alhost:/root/

Get the bzImage for target kernel:
Please use target kconfig and copy it to kernel_src/.config
make olddefconfig
make -jx bzImage           //x should equal or less than cpu num your pc has

Fill the bzImage file into above start3.sh to load the target kernel in vm.

Tips:
If you already have qemu-system-x86_64, please ignore below info.
If you want to install qemu v7.1.0 version:
git clone https://github.com/qemu/qemu.git
cd qemu
git checkout -f v7.1.0
mkdir build
cd build
yum install -y ninja-build.x86_64
../configure --target-list=x86_64-softmmu --enable-kvm --enable-vnc --enable-gtk --enable-sdl
make
make install

Thanks!
BR.

Powered by blists - more mailing lists