linux-kernel - RE: [Syzkaller & bisect] There is "task hung in synchronize

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <PH0PR11MB588010215E1ACBDD1F9EF6B3DA989@PH0PR11MB5880.namprd11.prod.outlook.com>
Date:   Thu, 13 Apr 2023 11:51:28 +0000
From:   "Zhang, Qiang1" <qiang1.zhang@...el.com>
To:     Frederic Weisbecker <frederic@...nel.org>,
        "Xu, Pengfei" <pengfei.xu@...el.com>, Jens Axboe <axboe@...nel.dk>
CC:     "lihuafei1@...wei.com" <lihuafei1@...wei.com>,
        "rostedt@...dmis.org" <rostedt@...dmis.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        lkp <lkp@...el.com>,
        "quic_neeraju@...cinc.com" <quic_neeraju@...cinc.com>,
        "paulmck@...nel.org" <paulmck@...nel.org>,
        "Su, Heng" <heng.su@...el.com>
Subject: RE: [Syzkaller & bisect] There is "task hung in synchronize_rcu
 bisect" in v6.3-rc2 kernel

> Hi Huafei and kernel experts,
> 
> It's a soft remind.
> This issue could be reproduced in v6.3-rc6 kernel.
> It could be reproduced on Alder lake, Raptor lake and so on x86 platforms.
> After reverted the commit "0e792b89e6800c:ftrace: Fix use-after-free for
> dynamic ftrace_ops" on top of v6.3-rc6 kernel, this issue was gone.
> 
> New syzkaller reproduced code, repro.report, bisect_info.log and detailed logs
> are in link:
> https://github.com/xupengfe/syzkaller_logs/tree/main/230412_031722_synchronize_rcu
>
>I just tested against v6.3-rc6 and again all I get is this io_ring related
>issue:
>

I also did not reproduce this synchronize_rcu_tasks() hung problem, but from this link:
https://github.com/xupengfe/syzkaller_logs/blob/main/230412_031722_synchronize_rcu/02bf43c7b7f7a19aa59a75f5244f0a3408bace1a_dmesg.log

I find rcu-task stall

INFO: rcu_tasks detected stalls on tasks:
[ 1243.720952] 000000008aaceb95: .. nvcsw: 1/1 holdout: 1 idle_cpu: -1/0
[ 1243.721915] task:repro           state:R  running task     stack:0     pid:592   ppid:585    flags:0x00004004
[ 1243.723486] Call Trace:
[ 1243.723996]  <TASK>
[ 1243.725204]  __schedule+0x385/0xb00
[ 1243.726212]  ? __cond_resched+0x25/0x30
[ 1243.727108]  preempt_schedule_common+0x11/0x90
[ 1243.727964]  __cond_resched+0x25/0x30
[ 1243.729076]  mousedev_write+0xd2/0x3d0
[ 1243.730168]  vfs_write+0x1b6/0x780
[ 1243.730873]  ? mousedev_fasync+0x40/0x40
[ 1243.731864]  ? vfs_write+0x9/0x780
[ 1243.732874]  ? write_comp_data+0x2f/0x90
[ 1243.733926]  ksys_write+0x153/0x170
[ 1243.734929]  __x64_sys_write+0x27/0x30
[ 1243.735718]  do_syscall_64+0x3b/0x90
[ 1243.736760]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 1243.737586] RIP: 0033:0x7f374119759d

Thanks
Zqiang


>
>[  448.290752] INFO: task kworker/u4:0:9 blocked for more than 294 seconds.
>[  448.293868]       Not tainted 6.3.0-rc6-kvm #1
>[  448.296019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>[  448.299592] task:kworker/u4:0    state:D stack:0     pid:9     ppid:2      flags:0x00004000
>[  448.303397] Workqueue: events_unbound io_ring_exit_work
>[  448.305884] Call Trace:
>[  448.307147]  <TASK>
>[  448.308166]  __schedule+0x422/0xc90
>[  448.309824]  ? wait_for_completion+0x77/0x170
>[  448.311870]  schedule+0x63/0xd0
>[  448.313346]  schedule_timeout+0x2fe/0x4c0
>[  448.315255]  ? __this_cpu_preempt_check+0x1c/0x30
>[  448.317360]  ? _raw_spin_unlock_irq+0x27/0x60
>[  448.319400]  ? lockdep_hardirqs_on+0x88/0x120
>[  448.321395]  ? wait_for_completion+0x77/0x170
>[  448.323462]  wait_for_completion+0x9e/0x170
>[  448.325356]  io_ring_exit_work+0x2b0/0x810
>[  448.327300]  ? __pfx_io_tctx_exit_cb+0x10/0x10
>[  448.329345]  ? _raw_spin_unlock_irq+0x27/0x60
>[  448.331397]  process_one_work+0x34e/0x720
>[  448.333212]  ? __pfx_io_ring_exit_work+0x10/0x10
>[  448.335377]  ? process_one_work+0x34e/0x720
>[  448.337295]  worker_thread+0x4e/0x530
>[  448.339079]  ? __pfx_worker_thread+0x10/0x10
>[  448.341008]  kthread+0x128/0x160
>[  448.342513]  ? __pfx_kthread+0x10/0x10
>[  448.344305]  ret_from_fork+0x2c/0x50
>[  448.346016]  </TASK>
>[  448.347176] 
>[  448.347176] Showing all locks held in the system:
>[  448.349887] 2 locks held by kworker/u4:0/9:
>[  448.351829]  #0: ffff88807eb6dd38 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x2b4/0x720
>[  448.356362]  #1: ffffc9000005fe68 ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: process_one_work+0x2b4/0x720
>[  448.361052] 1 lock held by rcu_tasks_kthre/11:
>[  448.363142]  #0: ffffffff83963450 (rcu_tasks.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x31/0x440
>[  448.367396] 1 lock held by rcu_tasks_rude_/12:
>[  448.369387]  #0: ffffffff839631d0 (rcu_tasks_rude.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x31/0x440
>[  448.373808] 1 lock held by rcu_tasks_trace/13:
>[  448.375852]  #0: ffffffff83962f10 (rcu_tasks_trace.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x31/0x440
>[  448.380434] 1 lock held by khungtaskd/30:
>[  448.382219]  #0: ffffffff83963e60 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x17/0x1d0
>[  448.386291] 
>[  448.387159] =============================================
>[  448.387159]