lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZA1FIM2hG/ePMHYR@xpf.sh.intel.com>
Date:   Sun, 12 Mar 2023 11:21:04 +0800
From:   Pengfei Xu <pengfei.xu@...el.com>
To:     <dchinner@...hat.com>
CC:     <djwong@...nel.org>, <linux-kernel@...r.kernel.org>,
        <heng.su@...el.com>, <lkp@...el.com>
Subject: [Syzkaller & bisect] There is BUG: soft lockup in __cleanup_mnt in
 v6.3-rc1 kernel

Hi Dave Chinner and kernel expert,

Greeting!

Platform: x86 platforms

There is BUG: soft lockup in __cleanup_mnt in v6.3-rc1 kernel.
All detailed info: https://github.com/xupengfe/syzkaller_logs/tree/main/230311_022102___cleanup_mnt
Reproduced code: https://github.com/xupengfe/syzkaller_logs/blob/main/230311_022102___cleanup_mnt/repro.c
v6.3-rc1 issue dmesg: https://github.com/xupengfe/syzkaller_logs/blob/main/230311_022102___cleanup_mnt/fe15c26ee26efa11741a7b632e9f23b01aca4cc6_dmesg.log
Kconfig: https://github.com/xupengfe/syzkaller_logs/blob/main/230311_022102___cleanup_mnt/kconfig_origin
Bisect info: https://github.com/xupengfe/syzkaller_logs/blob/main/230311_022102___cleanup_mnt/bisect_info.log

"
[   28.803497] XFS (loop0): Corruption of in-memory data (0x8) detected at _xfs_buf_ioapply+0x67d/0x690 (fs/xfs/xfs_buf.c:1553).  Shutting down filesystem.
[   28.804766] XFS (loop0): Please unmount the filesystem and rectify the problem(s)
[   56.372468] watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [repro:523]
[   56.372847] Modules linked in:
[   56.372985] irq event stamp: 22822
[   56.373141] hardirqs last  enabled at (22821): [<ffffffff8300086e>] irqentry_exit+0x3e/0xa0
[   56.373512] hardirqs last disabled at (22822): [<ffffffff82ffe853>] sysvec_apic_timer_interrupt+0x13/0xe0
[   56.373923] softirqs last  enabled at (9032): [<ffffffff83023a3c>] __do_softirq+0x31c/0x49c
[   56.374276] softirqs last disabled at (8975): [<ffffffff8112b5a4>] irq_exit_rcu+0xc4/0x100
[   56.374623] CPU: 1 PID: 523 Comm: repro Not tainted 6.3.0-rc1-fe15c26ee26e+ #1
[   56.374924] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[   56.375386] RIP: 0010:rcu_read_lock_sched_held+0x4b/0x80
[   56.375621] Code: 8d 7d f7 e8 c7 fe ff ff 84 c0 74 19 0f b6 45 f7 48 8b 4d f8 65 48 33 0c 25 28 00 00 00 75 38 c9 c3 cc cc cc cc be ff ff ff ff <48> c7 c7 20 3e f6 83 e8 89 ec de 01 89 c2 b8 01 00 00 00 85 d2 75
[   56.376376] RSP: 0018:ffffc9000102fbe0 EFLAGS: 00000246
[   56.376597] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
[   56.376891] RDX: 0000000000000000 RSI: 00000000ffffffff RDI: 0000000000000000
[   56.377192] RBP: ffffc9000102fbf0 R08: 0000000000000001 R09: 0000000000000000
[   56.377486] R10: ffff88800d01d418 R11: 0000000000000000 R12: ffff88800b0d1158
[   56.377780] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[   56.378082] FS:  00007fc887d10740(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000
[   56.378414] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   56.378655] CR2: 00007faaae519018 CR3: 000000000d3e2004 CR4: 0000000000770ee0
[   56.378950] PKRU: 55555554
[   56.379069] Call Trace:
[   56.379178]  <TASK>
[   56.379276]  lock_acquire+0x2ad/0x320
[   56.379444]  ? lock_release+0x141/0x2e0
[   56.379615]  ? write_comp_data+0x2f/0x90
[   56.379799]  _raw_spin_lock+0x36/0x50
[   56.379962]  ? xfs_ail_push_all_sync+0x73/0x110
[   56.380169]  xfs_ail_push_all_sync+0x73/0x110
[   56.380360]  ? __pfx_autoremove_wake_function+0x10/0x10
[   56.380591]  xfs_reclaim_inodes+0xb8/0x140
[   56.380779]  xfs_unmount_flush_inodes+0x63/0x80
[   56.380987]  xfs_unmountfs+0x69/0x1f0
[   56.381152]  xfs_fs_put_super+0x5a/0x120
[   56.381331]  ? __pfx_xfs_fs_put_super+0x10/0x10
[   56.381534]  generic_shutdown_super+0xac/0x240
[   56.381740]  kill_block_super+0x43/0x80
[   56.381917]  deactivate_locked_super+0x52/0xb0
[   56.382112]  deactivate_super+0xb3/0xd0
[   56.382282]  cleanup_mnt+0x15e/0x1e0
[   56.382451]  __cleanup_mnt+0x1f/0x30
[   56.382613]  task_work_run+0xb6/0x120
[   56.382783]  exit_to_user_mode_prepare+0x200/0x210
[   56.383000]  syscall_exit_to_user_mode+0x2d/0x60
[   56.383205]  do_syscall_64+0x4a/0x90
[   56.383368]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
[   56.383593] RIP: 0033:0x7fc887e3c7db
[   56.383751] Code: 96 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 90 f3 0f 1e fa 31 f6 e9 05 00 00 00 0f 1f 44 00 00 f3 0f 1e fa b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 95 96 0c 00 f7 d8 64 89 01 48
[   56.384500] RSP: 002b:00007ffc1a5bd1e8 EFLAGS: 00000206 ORIG_RAX: 00000000000000a6
[   56.384815] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007fc887e3c7db
[   56.385114] RDX: 0000000000000000 RSI: 000000000000000a RDI: 00007ffc1a5bd290
[   56.385407] RBP: 00007ffc1a5be2d0 R08: 0000000001d0a333 R09: 0000000000000009
[   56.385700] R10: 0000000000404071 R11: 0000000000000206 R12: 00000000004012c0
[   56.385994] R13: 00007ffc1a5be410 R14: 0000000000000000 R15: 0000000000000000
[   56.386299]  </TASK>
[   56.386400] Kernel panic - not syncing: softlockup: hung tasks
[   56.386643] CPU: 1 PID: 523 Comm: repro Tainted: G             L     6.3.0-rc1-fe15c26ee26e+ #1
[   56.387000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[   56.387461] Call Trace:
[   56.387569]  <IRQ>
[   56.387662]  dump_stack_lvl+0xa9/0x110
[   56.387834]  dump_stack+0x19/0x20
[   56.387980]  panic+0x4fd/0x520
[   56.388119]  ? watchdog_timer_fn+0x31e/0x390
[   56.388315]  watchdog_timer_fn+0x337/0x390
[   56.388497]  ? __pfx_watchdog_timer_fn+0x10/0x10
[   56.388710]  __hrtimer_run_queues+0x2ed/0x820
[   56.388924]  hrtimer_interrupt+0x134/0x2d0
[   56.389111]  __sysvec_apic_timer_interrupt+0x9c/0x290
[   56.389331]  sysvec_apic_timer_interrupt+0xab/0xe0
[   56.389535]  </IRQ>
[   56.389628]  <TASK>
[   56.389722]  asm_sysvec_apic_timer_interrupt+0x1f/0x30
[   56.389940] RIP: 0010:rcu_read_lock_sched_held+0x4b/0x80
[   56.390166] Code: 8d 7d f7 e8 c7 fe ff ff 84 c0 74 19 0f b6 45 f7 48 8b 4d f8 65 48 33 0c 25 28 00 00 00 75 38 c9 c3 cc cc cc cc be ff ff ff ff <48> c7 c7 20 3e f6 83 e8 89 ec de 01 89 c2 b8 01 00 00 00 85 d2 75
[   56.390909] RSP: 0018:ffffc9000102fbe0 EFLAGS: 00000246
[   56.391129] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
[   56.391422] RDX: 0000000000000000 RSI: 00000000ffffffff RDI: 0000000000000000
[   56.391715] RBP: ffffc9000102fbf0 R08: 0000000000000001 R09: 0000000000000000
[   56.392009] R10: ffff88800d01d418 R11: 0000000000000000 R12: ffff88800b0d1158
[   56.392303] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[   56.392607]  lock_acquire+0x2ad/0x320
[   56.392773]  ? lock_release+0x141/0x2e0
[   56.392941]  ? write_comp_data+0x2f/0x90
[   56.393122]  _raw_spin_lock+0x36/0x50
[   56.393285]  ? xfs_ail_push_all_sync+0x73/0x110
[   56.393483]  xfs_ail_push_all_sync+0x73/0x110
[   56.393672]  ? __pfx_autoremove_wake_function+0x10/0x10
[   56.393900]  xfs_reclaim_inodes+0xb8/0x140
[   56.394087]  xfs_unmount_flush_inodes+0x63/0x80
[   56.394285]  xfs_unmountfs+0x69/0x1f0
[   56.394450]  xfs_fs_put_super+0x5a/0x120
[   56.394627]  ? __pfx_xfs_fs_put_super+0x10/0x10
[   56.394829]  generic_shutdown_super+0xac/0x240
[   56.395030]  kill_block_super+0x43/0x80
[   56.395207]  deactivate_locked_super+0x52/0xb0
[   56.395403]  deactivate_super+0xb3/0xd0
[   56.395574]  cleanup_mnt+0x15e/0x1e0
[   56.395739]  __cleanup_mnt+0x1f/0x30
[   56.395902]  task_work_run+0xb6/0x120
[   56.396071]  exit_to_user_mode_prepare+0x200/0x210
[   56.396282]  syscall_exit_to_user_mode+0x2d/0x60
[   56.396487]  do_syscall_64+0x4a/0x90
[   56.396648]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
[   56.396865] RIP: 0033:0x7fc887e3c7db
[   56.397020] Code: 96 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 90 f3 0f 1e fa 31 f6 e9 05 00 00 00 0f 1f 44 00 00 f3 0f 1e fa b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 95 96 0c 00 f7 d8 64 89 01 48
[   56.397761] RSP: 002b:00007ffc1a5bd1e8 EFLAGS: 00000206 ORIG_RAX: 00000000000000a6
[   56.398072] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007fc887e3c7db
[   56.398365] RDX: 0000000000000000 RSI: 000000000000000a RDI: 00007ffc1a5bd290
[   56.398658] RBP: 00007ffc1a5be2d0 R08: 0000000001d0a333 R09: 0000000000000009
[   56.398951] R10: 0000000000404071 R11: 0000000000000206 R12: 00000000004012c0
[   56.399244] R13: 00007ffc1a5be410 R14: 0000000000000000 R15: 0000000000000000
[   56.399546]  </TASK>
[   56.399792] Kernel Offset: disabled
[   56.399972] ---[ end Kernel panic - not syncing: softlockup: hung tasks ]---
"

I used the key word "__cleanup_mnt" to bisect, I knew there was other BUG
info in dmesg with "bisect good" commit log.

Anyway, just provided the bad commit and hope it's helpful:
"
bd4f5d09cc93c8ca51e4efea86ac90a4bb553d6e
xfs: refactor the filestreams allocator pick functions
"
It's just a suspected commit, reverted above commit on top of v6.3-rc1 and
no "__cleanup_mnt" related BUG but generated the other BUG info in dmesg.

There was no this issue report yet in syzbot:
https://syzkaller.appspot.com/upstream

I tried that, installed the xfs fixed patch for below issue on top of v6.3-rc1:
https://lore.kernel.org/lkml/20230226223403.GU360264@dread.disaster.area/
"__cleanup_mnt" BUG issue still existed.
Related dmesg: https://github.com/xupengfe/syzkaller_logs/blob/main/230311_022102___cleanup_mnt/v6.3-rc1_xfs_patch_another_cleanup_mnt_issue.log
So it should be different than previous xfs related issue.

---

If you don't need the following environment to reproduce the problem or if you
already have one, please ignore the following information.

How to reproduce:
git clone https://gitlab.com/xupengfe/repro_vm_env.git
cd repro_vm_env
tar -xvf repro_vm_env.tar.gz
cd repro_vm_env; ./start3.sh  // it needs qemu-system-x86_64 and I used v7.1.0
   // start3.sh will load bzImage_2241ab53cbb5cdb08a6b2d4688feb13971058f65 v6.2-rc5 kernel
   // You could change the bzImage_xxx as you want
You could use below command to log in, there is no password for root.
ssh -p 10023 root@...alhost

After login vm(virtual machine) successfully, you could transfer reproduced
binary to the vm by below way, and reproduce the problem in vm:
gcc -pthread -o repro repro.c
scp -P 10023 repro root@...alhost:/root/

Get the bzImage for target kernel:
Please use target kconfig and copy it to kernel_src/.config
make olddefconfig
make -jx bzImage           //x should equal or less than cpu num your pc has

Fill the bzImage file into above start3.sh to load the target kernel in vm.


Tips:
If you already have qemu-system-x86_64, please ignore below info.
If you want to install qemu v7.1.0 version:
git clone https://github.com/qemu/qemu.git
cd qemu
git checkout -f v7.1.0
mkdir build
cd build
yum install -y ninja-build.x86_64
../configure --target-list=x86_64-softmmu --enable-kvm --enable-vnc --enable-gtk --enable-sdl
make
make install

Thanks!
BR.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ