Message-ID: <a8a70099-e0e2-4e35-b1b0-d0117bcbfc52@candelatech.com>
Date: Mon, 6 May 2024 21:20:48 -0700
From: Ben Greear <greearb@...delatech.com>
To: Heiner Kallweit <heiner.kallweit@....de>,
 LKML <linux-kernel@...r.kernel.org>, linux-leds@...r.kernel.org,
 Lee Jones <lee@...nel.org>
Cc: Johannes Berg <johannes@...solutions.net>
Subject: Re: 6.9.0-rc2+ kernel hangs on boot (bisected, maybe LED related)

On 5/6/24 13:00, Heiner Kallweit wrote:
> On 03.04.2024 21:35, Ben Greear wrote:
>> On 4/2/24 10:38, Ben Greear wrote:
>>> On 4/2/24 09:37, Ben Greear wrote:
>>>> Hello,
>>>>
>>>> Sometime between rc1 and today's rc2, my system quit booting.
>>>> I'm not seeing any splats, it just stops.  Evidently before
>>>> sysrq is enabled.
>>>>
> 
> For my understanding:
> You say 6.9-rc1 was ok, but 6.9-rc2 is not?
> 
> If I look at the diff then I see no LED subsystem changes,
> but iwlwifi changes. It's not clear to me why your bisect
> points to something outside the diff.

My early assessment of exactly where the error came in was wrong.
I later ran a full bisect to find the commit that introduced it.
The problem only seems to happen when a system has many radios
(iwlwifi, in my case), which added to my initial confusion about
the bug.
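
A note for readers of the bisect log quoted below: when git bisect
reports a merge as the first bad commit, every parent of that merge
was found good, so the regression most likely comes from the two
histories interacting rather than from a single patch. The Python
sketch below is an illustration only, not part of the original
report. It summarizes a pasted "git bisect log" and flags that case.
The names summarize_bisect and SAMPLE are made up for this example,
and the regex assumes the "# good: [sha] subject" line format seen
in the quoted log.

```python
import re

# Matches annotation lines such as "# good: [<40-hex sha>] subject".
# search() is used so that a leading ">> " quote prefix is tolerated.
MARK_RE = re.compile(r"# (good|bad|first bad commit): \[([0-9a-f]{40})\] (.*)")

# Three lines copied from the quoted bisect log, as a small sample input.
SAMPLE = """\
>> # good: [e8f897f4afef0031fe618a8e94127a0934896aba] Linux 6.8
>> # bad: [4cece764965020c22cff7665b18a012006359095] Linux 6.9-rc1
>> # first bad commit: [f5c31bcf604db54470868f3118a60dc4a9ba8813] Merge tag 'leds-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds
"""


def summarize_bisect(log):
    """Collect good/bad marks and the first bad commit from a bisect log."""
    summary = {"good": [], "bad": [], "first_bad": None, "first_bad_is_merge": False}
    for line in log.splitlines():
        m = MARK_RE.search(line)
        if not m:
            continue
        kind, sha, subject = m.groups()
        if kind == "first bad commit":
            summary["first_bad"] = (sha, subject)
            # Heuristic: default merge subjects start with "Merge ".
            summary["first_bad_is_merge"] = subject.startswith("Merge ")
        else:
            summary[kind].append((sha, subject))
    return summary


if __name__ == "__main__":
    result = summarize_bisect(SAMPLE)
    print(result["first_bad"], result["first_bad_is_merge"])
```

The "Merge " prefix test is only a heuristic on the subject line;
counting the commit's parents in the repository itself is the
reliable check.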

It is almost certainly LED related: my initial hack to make the problem
go away was simply to comment out the LED registration logic in iwlwifi.

Johannes's solution also makes the problem go away.
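
For anyone reading the lockdep dump quoted below: most of its entries
are workqueue lockdep annotations ((wq_completion)..., (work_completion)...)
recorded in process_one_work, which makes the few real locks hard to
spot. The Python sketch below is an illustration only, not part of the
original report. It drops those annotations and keeps the remaining
holders. The names interesting_locks and SAMPLE are made up for this
example, and the regexes assume the line format seen in this dump.

```python
import re

# Holder line, e.g. "2 locks held by modprobe/1019:".
HOLDER_RE = re.compile(r"(\d+) locks? held by (\S+)/(\d+):")
# Lock line, e.g. "#1: ffff... (&led_cdev->led_access){+.+.}-{4:4}, at: func+0x1/0x2".
LOCK_RE = re.compile(r"#(\d+): [0-9a-f]+ \((.+)\)\{[^}]*\}-\{[^}]*\}, at: (\S+)")

# Two holders copied from the quoted dump, as a small sample input.
SAMPLE = """\
2 locks held by kworker/0:1/9:
   #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
   #1: ffffc900000afe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
2 locks held by modprobe/1019:
   #0: ffffffffa0ca7b68 (iwlwifi_opmode_table_mtx){+.+.}-{4:4}, at: iwl_opmode_register+0x27/0xd0 [iwlwifi]
   #1: ffff888139f88270 (&led_cdev->led_access){+.+.}-{4:4}, at: led_classdev_register_ext+0x195/0x450
"""


def is_workqueue_annotation(name):
    """True for the lockdep names the workqueue code uses in this dump."""
    return "_completion)" in name or name.endswith(".work")


def interesting_locks(dump):
    """Map (task, pid) to [(lock name, acquire site)], skipping workqueue entries."""
    holders = {}
    current = None
    for line in dump.splitlines():
        m = HOLDER_RE.search(line)
        if m:
            current = (m.group(2), int(m.group(3)))
            holders[current] = []
            continue
        m = LOCK_RE.search(line)
        if m and current is not None and not is_workqueue_annotation(m.group(2)):
            holders[current].append((m.group(2), m.group(3)))
    # Keep only holders that still have at least one lock listed.
    return {task: locks for task, locks in holders.items() if locks}


if __name__ == "__main__":
    for (task, pid), locks in interesting_locks(SAMPLE).items():
        for name, site in locks:
            print(f"{task}/{pid}: {name} at {site}")
```

On the two-holder sample above only modprobe/1019 remains, holding
iwlwifi_opmode_table_mtx and &led_cdev->led_access.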

Thanks,
Ben

> 
> 
>>>> [  OK  ] Started Flush Journal to Persistent Storage.
>>>> [  OK  ] Started udev Coldplug all Devices.
>>>>            Starting udev Wait for Complete Device Initialization...
>>>> [  OK  ] Listening on Load/Save RF …itch Status /dev/rfkill Watch.
>>>> [  OK  ] Created slice system-lvm2\x2dpvscan.slice.
>>>>            Starting LVM2 PV scan on device 8:19...
>>>>            Starting LVM2 PV scan on device 8:3...
>>>> [  OK  ] Started Device-mapper event daemon.
>>>> iwlwifi 0000:04:00.0: WRT: Invalid buffer destination: 0
>>>> sysrq: This sysrq operation is disabled.
>>>>
>>>> I can start a bisect, but in case anyone knows the answer already, please let me know.
>>>>
>>>> Thanks,
>>>> Ben
>>>>
>>>
>>> So, deadlock I guess....
>>>
>>>    INFO: task kworker/5:13:648 blocked for more than 180 seconds.
>>>         Not tainted 6.9.0-rc2+ #23
>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> task:kworker/5:13    state:D stack:0     pid:648   tgid:648   ppid:2      flags:0x00004000
>>> Workqueue: events deferred_probe_timeout_work_func
>>> Call Trace:
>>>    <TASK>
>>>    __schedule+0x43d/0xe20
>>>    schedule+0x31/0x130
>>>    schedule_timeout+0x1b9/0x1d0
>>>    ? mark_held_locks+0x49/0x70
>>>    ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>>    __wait_for_common+0xb9/0x1d0
>>>    ? usleep_range_state+0xb0/0xb0
>>>    ? __flush_work+0x1ff/0x460
>>>    __flush_work+0x287/0x460
>>>    ? flush_workqueue_prep_pwqs+0x120/0x120
>>>    deferred_probe_timeout_work_func+0x2b/0xa0
>>>    process_one_work+0x212/0x710
>>>    ? lock_is_held_type+0xa5/0x110
>>>    worker_thread+0x188/0x340
>>>    ? rescuer_thread+0x380/0x380
>>>    kthread+0xd7/0x110
>>>    ? kthread_complete_and_exit+0x20/0x20
>>>    ret_from_fork+0x28/0x40
>>>    ? kthread_complete_and_exit+0x20/0x20
>>>    ret_from_fork_asm+0x11/0x20
>>>    </TASK>
>>> INFO: task udevadm:763 blocked for more than 180 seconds.
>>>         Not tainted 6.9.0-rc2+ #23
>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> task:udevadm         state:D stack:0     pid:763   tgid:763   ppid:1      flags:0x00000000
>>> Call Trace:
>>>    <TASK>
>>>    __schedule+0x43d/0xe20
>>>    schedule+0x31/0x130
>>>    schedule_timeout+0x1b9/0x1d0
>>>    ? __wait_for_common+0xb0/0x1d0
>>>    ? lock_release+0xc6/0x290
>>>    ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>>    __wait_for_common+0xb9/0x1d0
>>>    ? usleep_range_state+0xb0/0xb0
>>>    ? __flush_work+0x1ff/0x460
>>>    __flush_work+0x287/0x460
>>>    ? flush_workqueue_prep_pwqs+0x120/0x120
>>>    fsnotify_destroy_group+0x66/0xf0
>>>    inotify_release+0x12/0x40
>>>    __fput+0xa6/0x2d0
>>>    __x64_sys_close+0x33/0x70
>>>    do_syscall_64+0x6c/0x170
>>>    entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>> RIP: 0033:0x7f744d5bc878
>>> RSP: 002b:00007ffcef12f8d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
>>> RAX: ffffffffffffffda RBX: 00007f744cd048c0 RCX: 00007f744d5bc878
>>> RDX: ffffffffffffff80 RSI: 0000000000000000 RDI: 0000000000000003
>>> RBP: 0000000000000003 R08: 000055f9ce349fb0 R09: 0000000000000000
>>> R10: 00007ffcef12f8f0 R11: 0000000000000246 R12: 0000000000000002
>>> R13: 0000000007270e00 R14: 000055f99670c9b8 R15: 0000000000000002
>>>    </TASK>
>>> INFO: task modprobe:968 blocked for more than 180 seconds.
>>>         Not tainted 6.9.0-rc2+ #23
>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> task:modprobe        state:D stack:0     pid:968   tgid:968   ppid:65     flags:0x00000000
>>> Call Trace:
>>>    <TASK>
>>>    __schedule+0x43d/0xe20
>>>    schedule+0x31/0x130
>>>    schedule_timeout+0x1b9/0x1d0
>>>    ? __wait_for_common+0xb0/0x1d0
>>>    ? lock_release+0xc6/0x290
>>>    ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>>    __wait_for_common+0xb9/0x1d0
>>>    ? usleep_range_state+0xb0/0xb0
>>>    idempotent_init_module+0x1ae/0x290
>>>    __x64_sys_finit_module+0x55/0xb0
>>>    do_syscall_64+0x6c/0x170
>>>    entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>> RIP: 0033:0x7fde25530ddd
>>> RSP: 002b:00007fffac078518 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>> RAX: ffffffffffffffda RBX: 0000558758e28ef0 RCX: 00007fde25530ddd
>>> RDX: 0000000000000000 RSI: 000055873cebf358 RDI: 0000000000000001
>>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>>> R10: 0000000000000001 R11: 0000000000000246 R12: 000055873cebf358
>>> R13: 0000000000000000 R14: 0000558758e29020 R15: 0000558758e28ef0
>>>    </TASK>
>>> INFO: task modprobe:969 blocked for more than 180 seconds.
>>>         Not tainted 6.9.0-rc2+ #23
>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> task:modprobe        state:D stack:0     pid:969   tgid:969   ppid:93     flags:0x00000000
>>> Call Trace:
>>>    <TASK>
>>>    __schedule+0x43d/0xe20
>>>    schedule+0x31/0x130
>>>    schedule_timeout+0x1b9/0x1d0
>>>    ? __wait_for_common+0xb0/0x1d0
>>>    ? lock_release+0xc6/0x290
>>>    ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>>    __wait_for_common+0xb9/0x1d0
>>>    ? usleep_range_state+0xb0/0xb0
>>>    idempotent_init_module+0x1ae/0x290
>>>    __x64_sys_finit_module+0x55/0xb0
>>>    do_syscall_64+0x6c/0x170
>>>    entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>> RIP: 0033:0x7f338d516ddd
>>> RSP: 002b:00007ffd155cd1e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>> RAX: ffffffffffffffda RBX: 000056092cb0def0 RCX: 00007f338d516ddd
>>> RDX: 0000000000000000 RSI: 00005608ecb4a358 RDI: 0000000000000001
>>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>>> R10: 0000000000000001 R11: 0000000000000246 R12: 00005608ecb4a358
>>> R13: 0000000000000000 R14: 000056092cb0e020 R15: 000056092cb0def0
>>>    </TASK>
>>> INFO: task modprobe:1044 blocked for more than 180 seconds.
>>>         Not tainted 6.9.0-rc2+ #23
>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> task:modprobe        state:D stack:0     pid:1044  tgid:1044  ppid:10     flags:0x00000000
>>> Call Trace:
>>>    <TASK>
>>>    __schedule+0x43d/0xe20
>>>    schedule+0x31/0x130
>>>    schedule_timeout+0x1b9/0x1d0
>>>    ? __wait_for_common+0xb0/0x1d0
>>>    ? lock_release+0xc6/0x290
>>>    ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>>    __wait_for_common+0xb9/0x1d0
>>>    ? usleep_range_state+0xb0/0xb0
>>>    idempotent_init_module+0x1ae/0x290
>>>    __x64_sys_finit_module+0x55/0xb0
>>>    do_syscall_64+0x6c/0x170
>>>    entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>> RIP: 0033:0x7f7637b30ddd
>>> RSP: 002b:00007ffe6251da78 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>> RAX: ffffffffffffffda RBX: 000055b889cb3ef0 RCX: 00007f7637b30ddd
>>> RDX: 0000000000000000 RSI: 000055b854eea358 RDI: 0000000000000001
>>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>>> R10: 0000000000000001 R11: 0000000000000246 R12: 000055b854eea358
>>> R13: 0000000000000000 R14: 000055b889cb4020 R15: 000055b889cb3ef0
>>>    </TASK>
>>> INFO: task modprobe:1047 blocked for more than 180 seconds.
>>>         Not tainted 6.9.0-rc2+ #23
>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> task:modprobe        state:D stack:0     pid:1047  tgid:1047  ppid:113    flags:0x00000000
>>> Call Trace:
>>>    <TASK>
>>>    __schedule+0x43d/0xe20
>>>    schedule+0x31/0x130
>>>    schedule_timeout+0x1b9/0x1d0
>>>    ? __wait_for_common+0xb0/0x1d0
>>>    ? lock_release+0xc6/0x290
>>>    ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>>    __wait_for_common+0xb9/0x1d0
>>>    ? usleep_range_state+0xb0/0xb0
>>>    idempotent_init_module+0x1ae/0x290
>>>    __x64_sys_finit_module+0x55/0xb0
>>>    do_syscall_64+0x6c/0x170
>>>    entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>> RIP: 0033:0x7f3907130ddd
>>> RSP: 002b:00007ffc36e4eb08 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>> RAX: ffffffffffffffda RBX: 000056100a856ef0 RCX: 00007f3907130ddd
>>> RDX: 0000000000000000 RSI: 0000560fff0ec358 RDI: 0000000000000001
>>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>>> R10: 0000000000000001 R11: 0000000000000246 R12: 0000560fff0ec358
>>> R13: 0000000000000000 R14: 000056100a857020 R15: 000056100a856ef0
>>>    </TASK>
>>> INFO: task modprobe:1056 blocked for more than 180 seconds.
>>>         Not tainted 6.9.0-rc2+ #23
>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> task:modprobe        state:D stack:0     pid:1056  tgid:1056  ppid:1045   flags:0x00000000
>>> Call Trace:
>>>    <TASK>
>>>    __schedule+0x43d/0xe20
>>>    schedule+0x31/0x130
>>>    schedule_timeout+0x1b9/0x1d0
>>>    ? __wait_for_common+0xb0/0x1d0
>>>    ? lock_release+0xc6/0x290
>>>    ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>>    __wait_for_common+0xb9/0x1d0
>>>    ? usleep_range_state+0xb0/0xb0
>>>    idempotent_init_module+0x1ae/0x290
>>>    __x64_sys_finit_module+0x55/0xb0
>>>    do_syscall_64+0x6c/0x170
>>>    entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>> RIP: 0033:0x7fcb1e730ddd
>>> RSP: 002b:00007ffc692d0ad8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>> RAX: ffffffffffffffda RBX: 000055f8d8828ef0 RCX: 00007fcb1e730ddd
>>> RDX: 0000000000000000 RSI: 000055f8bff36358 RDI: 0000000000000001
>>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>>> R10: 0000000000000001 R11: 0000000000000246 R12: 000055f8bff36358
>>> R13: 0000000000000000 R14: 000055f8d8829020 R15: 000055f8d8828ef0
>>>    </TASK>
>>> INFO: task modprobe:1058 blocked for more than 180 seconds.
>>>         Not tainted 6.9.0-rc2+ #23
>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> task:modprobe        state:D stack:0     pid:1058  tgid:1058  ppid:1051   flags:0x00000000
>>> Call Trace:
>>>    <TASK>
>>>    __schedule+0x43d/0xe20
>>>    schedule+0x31/0x130
>>>    schedule_timeout+0x1b9/0x1d0
>>>    ? __wait_for_common+0xb0/0x1d0
>>>    ? lock_release+0xc6/0x290
>>>    ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>>    __wait_for_common+0xb9/0x1d0
>>>    ? usleep_range_state+0xb0/0xb0
>>>    idempotent_init_module+0x1ae/0x290
>>>    __x64_sys_finit_module+0x55/0xb0
>>>    do_syscall_64+0x6c/0x170
>>>    entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>> RIP: 0033:0x7f0a17b30ddd
>>> RSP: 002b:00007fff56d619e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>> RAX: ffffffffffffffda RBX: 000055abd6741ef0 RCX: 00007f0a17b30ddd
>>> RDX: 0000000000000000 RSI: 000055abc6586358 RDI: 0000000000000001
>>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>>> R10: 0000000000000001 R11: 0000000000000246 R12: 000055abc6586358
>>> R13: 0000000000000000 R14: 000055abd6742020 R15: 000055abd6741ef0
>>>    </TASK>
>>> INFO: task modprobe:1060 blocked for more than 181 seconds.
>>>         Not tainted 6.9.0-rc2+ #23
>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> task:modprobe        state:D stack:0     pid:1060  tgid:1060  ppid:1057   flags:0x00000000
>>> Call Trace:
>>>    <TASK>
>>>    __schedule+0x43d/0xe20
>>>    schedule+0x31/0x130
>>>    schedule_timeout+0x1b9/0x1d0
>>>    ? __wait_for_common+0xb0/0x1d0
>>>    ? lock_release+0xc6/0x290
>>>    ? lockdep_hardirqs_on_prepare+0xd6/0x170
>>>    __wait_for_common+0xb9/0x1d0
>>>    ? usleep_range_state+0xb0/0xb0
>>>    idempotent_init_module+0x1ae/0x290
>>>    __x64_sys_finit_module+0x55/0xb0
>>>    do_syscall_64+0x6c/0x170
>>>    entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>> RIP: 0033:0x7f12c0130ddd
>>> RSP: 002b:00007ffccdef0488 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>> RAX: ffffffffffffffda RBX: 000056249db40ef0 RCX: 00007f12c0130ddd
>>> RDX: 0000000000000000 RSI: 0000562471e4d358 RDI: 0000000000000001
>>> RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
>>> R10: 0000000000000001 R11: 0000000000000246 R12: 0000562471e4d358
>>> R13: 0000000000000000 R14: 000056249db41020 R15: 000056249db40ef0
>>>    </TASK>
>>>
>>> Showing all locks held in the system:
>>> 2 locks held by systemd/1:
>>>    #0: ffff88812a7a10a0 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x1f/0x50
>>>    #1: ffff88812a7a1130 (&tty->atomic_write_lock){+.+.}-{4:4}, at: file_tty_write.constprop.0+0xab/0x330
>>> 2 locks held by kworker/0:1/9:
>>>    #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>    #1: ffffc900000afe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/u32:0/10:
>>>    #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>    #1: ffffc900000b7e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/3:0/37:
>>>    #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>    #1: ffffc900001cbe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/7:0/61:
>>>    #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>    #1: ffffc9000029be50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/u32:1/65:
>>>    #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>    #1: ffffc900002bfe50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 1 lock held by khungtaskd/66:
>>>    #0: ffffffff8296e760 (rcu_read_lock){....}-{1:3}, at: debug_show_all_locks+0x32/0x1c0
>>> 2 locks held by kworker/1:1/79:
>>>    #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>    #1: ffffc9000032fe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/u32:2/93:
>>>    #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>    #1: ffffc900003d3e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/6:1/94:
>>>    #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>    #1: ffffc900003dbe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/3:1/96:
>>>    #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>    #1: ffffc900003ebe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/1:2/102:
>>>    #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>    #1: ffffc90000eabe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/u32:3/107:
>>>    #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>    #1: ffffc90000ed3e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/u32:4/113:
>>>    #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>    #1: ffffc90000f03e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/6:2/189:
>>>    #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>    #1: ffffc90000e0fe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/6:5/196:
>>>    #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>    #1: ffffc90000f13e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/6:6/197:
>>>    #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>    #1: ffffc90000f23e50 ((work_completion)(&(&hda->probe_work)->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/6:8/199:
>>>    #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>    #1: ffffc90000f53e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/7:2/296:
>>>    #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>    #1: ffffc9000105be50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/7:3/297:
>>>    #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>    #1: ffffc90001043e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/7:4/298:
>>>    #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>    #1: ffffc90001063e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/7:5/320:
>>>    #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>    #1: ffffc90001003e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/2:2/371:
>>>    #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>    #1: ffffc9000104be50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/5:13/648:
>>>    #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>    #1: ffffc9000198fe50 ((deferred_probe_timeout_work).work){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/5:14/649:
>>>    #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>    #1: ffffc90001997e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/5:15/650:
>>>    #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>    #1: ffffc9000199fe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/5:16/651:
>>>    #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>    #1: ffffc900019a7e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/4:3/722:
>>>    #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>    #1: ffffc90001a27e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/1:4/768:
>>>    #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>    #1: ffffc900010d7e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/1:5/769:
>>>    #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>    #1: ffffc900010dfe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/0:2/849:
>>>    #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>    #1: ffffc90001353e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by lvm/860:
>>>    #0: ffff8881323c19a8 (&md->type_lock){+.+.}-{4:4}, at: table_load+0xc9/0x400
>>>    #1: ffff88813200c3b8 (&mddev->reconfig_mutex){+.+.}-{4:4}, at: raid_ctr+0x13b3/0x2860 [dm_raid]
>>> 2 locks held by modprobe/1019:
>>>    #0: ffffffffa0ca7b68 (iwlwifi_opmode_table_mtx){+.+.}-{4:4}, at: iwl_opmode_register+0x27/0xd0 [iwlwifi]
>>>    #1: ffff888139f88270 (&led_cdev->led_access){+.+.}-{4:4}, at: led_classdev_register_ext+0x195/0x450
>>> 2 locks held by kworker/u32:5/1045:
>>>    #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>    #1: ffffc90004367e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/u32:6/1051:
>>>    #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>    #1: ffffc90004703e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/u32:7/1057:
>>>    #0: ffff888120070948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>    #1: ffffc90004a97e50 ((work_completion)(&sub_info->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/3:3/1111:
>>>    #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>    #1: ffffc90005bafe50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>> 2 locks held by kworker/3:4/1132:
>>>    #0: ffff88812006c548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x41e/0x710
>>>    #1: ffffc90005e13e50 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at: process_one_work+0x1d1/0x710
>>>
>>> =============================================
>>>
>>>
>>
>> I ran a bisect on this.  The tagged bad commit is a LED related merge, but commit
>> shows no code changes when I look at it in git.  I double checked that the
>> merge is bad by manually going to it again at the end of the bisect and
>> indeed it fails.
>>
>>  From looking at lockdep, this below may be interesting.  I do have 24 intel be200 radios
>> in this system, so maybe lots of iwlwifi radios help trigger the problem?
>>
>>> 2 locks held by modprobe/1019:
>>>     #0: ffffffffa0ca7b68 (iwlwifi_opmode_table_mtx){+.+.}-{4:4}, at: iwl_opmode_register+0x27/0xd0 [iwlwifi]
>>>     #1: ffff888139f88270 (&led_cdev->led_access){+.+.}-{4:4}, at: led_classdev_register_ext+0x195/0x450
>>
>> Please let me know if you have any suggestions for how to debug this further.
>>
>> [greearb@...-dt5 linux-2.6]$ git bisect log
>> git bisect start
>> # status: waiting for both good and bad commits
>> # good: [e8f897f4afef0031fe618a8e94127a0934896aba] Linux 6.8
>> git bisect good e8f897f4afef0031fe618a8e94127a0934896aba
>> # status: waiting for bad commit, 1 good commit known
>> # bad: [4cece764965020c22cff7665b18a012006359095] Linux 6.9-rc1
>> git bisect bad 4cece764965020c22cff7665b18a012006359095
>> # good: [e5e038b7ae9da96b93974bf072ca1876899a01a3] Merge tag 'fs_for_v6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
>> git bisect good e5e038b7ae9da96b93974bf072ca1876899a01a3
>> # bad: [32a50540c3d26341698505998dfca5b0e8fb4fd4] Merge tag 'bcachefs-2024-03-13' of https://evilpiepirate.org/git/bcachefs
>> git bisect bad 32a50540c3d26341698505998dfca5b0e8fb4fd4
>> # good: [a3df5d5422b4edfcfe658d5057e7e059571e32ce] Merge tag 'pinctrl-v6.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
>> git bisect good a3df5d5422b4edfcfe658d5057e7e059571e32ce
>> # bad: [c0a614e82ece41d15b7a66f43ee79f4dbdbc925a] Merge tag 'lsm-pr-20240314' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm
>> git bisect bad c0a614e82ece41d15b7a66f43ee79f4dbdbc925a
>> # bad: [705c1da8fa4816fb0159b5602fef1df5946a3ee2] Merge tag 'pci-v6.9-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
>> git bisect bad 705c1da8fa4816fb0159b5602fef1df5946a3ee2
>> # bad: [f5c31bcf604db54470868f3118a60dc4a9ba8813] Merge tag 'leds-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds
>> git bisect bad f5c31bcf604db54470868f3118a60dc4a9ba8813
>> # good: [8403ce70be339d462892a2b935ae30ee52416f92] Merge tag 'mfd-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
>> git bisect good 8403ce70be339d462892a2b935ae30ee52416f92
>> # good: [2cd0d1db31e78a63553876f8e6a4c9dcc1f9c061] leds: expresswire: Don't depend on NEW_LEDS
>> git bisect good 2cd0d1db31e78a63553876f8e6a4c9dcc1f9c061
>> # good: [23749cf3dfff5dcd706183ade1d27198a37b3881] backlight: gpio: Simplify with dev_err_probe()
>> git bisect good 23749cf3dfff5dcd706183ade1d27198a37b3881
>> # good: [2c7c70f54f791ece44541a9254c1a73790fd4595] dt-bindings: leds: Add NCP5623 multi-LED Controller
>> git bisect good 2c7c70f54f791ece44541a9254c1a73790fd4595
>> # good: [c9128ed7b9edeb2b6f1faec06d96b2fd5bc72cb8] backlight: lm3630a_bl: Simplify probe return on gpio request error
>> git bisect good c9128ed7b9edeb2b6f1faec06d96b2fd5bc72cb8
>> # good: [45066c4bbe8ca25f9f282245b84568116c783f1d] leds: ncp5623: Add MS suffix to time defines
>> git bisect good 45066c4bbe8ca25f9f282245b84568116c783f1d
>> # good: [f3d8f29d1f59230b8c2a09e6dee7db7bd295e42c] Merge tag 'backlight-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight
>> git bisect good f3d8f29d1f59230b8c2a09e6dee7db7bd295e42c
>> # first bad commit: [f5c31bcf604db54470868f3118a60dc4a9ba8813] Merge tag 'leds-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds
>> [greearb@...-dt5 linux-2.6]$
>>
>> Thanks,
>> Ben
>>
> 


-- 
Ben Greear <greearb@...delatech.com>
Candela Technologies Inc  http://www.candelatech.com
