Message-ID: <87ikw4wv13.fsf@toke.dk>
Date: Tue, 13 Aug 2024 12:56:40 +0200
From: Toke Høiland-Jørgensen <toke@...e.dk>
To: syzbot <syzbot+e9b1ff41aa6a7ebf9640@...kaller.appspotmail.com>,
kvalo@...nel.org, linux-kernel@...r.kernel.org,
linux-wireless@...r.kernel.org, netdev@...r.kernel.org,
syzkaller-bugs@...glegroups.com, Felix Fietkau <nbd@....name>
Subject: Re: [syzbot] [wireless?] INFO: task hung in ath9k_hif_usb_firmware_cb (3)

syzbot <syzbot+e9b1ff41aa6a7ebf9640@...kaller.appspotmail.com> writes:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: eb5e56d14912 Merge tag 'platform-drivers-x86-v6.11-2' of g..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=137edff9980000
> kernel config: https://syzkaller.appspot.com/x/.config?x=e8a2eef9745ade09
> dashboard link: https://syzkaller.appspot.com/bug?extid=e9b1ff41aa6a7ebf9640
> compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/a6552acb8476/disk-eb5e56d1.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/5c0963cd33df/vmlinux-eb5e56d1.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/7ba7283f6380/bzImage-eb5e56d1.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+e9b1ff41aa6a7ebf9640@...kaller.appspotmail.com
>
> INFO: task kworker/0:7:5284 blocked for more than 143 seconds.
> Not tainted 6.11.0-rc2-syzkaller-00011-geb5e56d14912 #0
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:kworker/0:7 state:D stack:13232 pid:5284 tgid:5284 ppid:2 flags:0x00004000
> Workqueue: events request_firmware_work_func
> Call Trace:
> <TASK>
> context_switch kernel/sched/core.c:5188 [inline]
> __schedule+0x1800/0x4a60 kernel/sched/core.c:6529
> __schedule_loop kernel/sched/core.c:6606 [inline]
> schedule+0x14b/0x320 kernel/sched/core.c:6621
> schedule_preempt_disabled+0x13/0x30 kernel/sched/core.c:6678
> __mutex_lock_common kernel/locking/mutex.c:684 [inline]
> __mutex_lock+0x6a4/0xd70 kernel/locking/mutex.c:752
> device_lock include/linux/device.h:1009 [inline]
> ath9k_hif_usb_firmware_fail drivers/net/wireless/ath/ath9k/hif_usb.c:1163 [inline]
> ath9k_hif_usb_firmware_cb+0x34a/0x4b0 drivers/net/wireless/ath/ath9k/hif_usb.c:1296

Ugh. Okay, so ath9k_hif_usb_firmware_cb() can recursively issue another
firmware request, and if that fails (because it runs out of firmware
names to try), it will call device_release_driver() from within the
firmware callback, which takes a lock and seems to deadlock.

It does seem odd to try to do an asynchronous driver release from within
a callback like this, so I'm not surprised that it deadlocks, really.
The question is whether this has ever worked - does anyone know?
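
To make the suspected hang concrete, here is a minimal userspace sketch in C using pthreads, not kernel code. It assumes, without confirmation from the report, that some other path holds the device lock while waiting for the firmware work to finish. The names dev_lock, firmware_cb_fail() and simulate_release_from_callback() are hypothetical stand-ins, and a timed lock replaces the real blocking lock so the demo reports the hang instead of hanging itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical stand-in for the lock that device_lock() would take. */
static pthread_mutex_t dev_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Models the failure path of the firmware callback: it needs dev_lock
 * before it can release the driver. A one-second timed lock is used
 * here so that a would-be deadlock shows up as ETIMEDOUT.
 */
static void *firmware_cb_fail(void *arg)
{
	int *result = arg;
	struct timespec deadline;

	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += 1;

	*result = pthread_mutex_timedlock(&dev_lock, &deadline);
	if (*result == 0)
		pthread_mutex_unlock(&dev_lock);
	return NULL;
}

/*
 * Models a path that (by assumption) waits for the callback to finish.
 * If hold_lock is non-zero, it holds dev_lock while waiting, which is
 * the situation that can never make progress with a plain lock.
 * Returns the callback's lock result, or -1 if the thread could not
 * be created.
 */
int simulate_release_from_callback(int hold_lock)
{
	pthread_t worker;
	int result = -1;

	if (hold_lock)
		pthread_mutex_lock(&dev_lock);

	if (pthread_create(&worker, NULL, firmware_cb_fail, &result) != 0) {
		if (hold_lock)
			pthread_mutex_unlock(&dev_lock);
		return -1;
	}

	/* Wait for the callback, possibly while still holding the lock. */
	pthread_join(worker, NULL);

	if (hold_lock)
		pthread_mutex_unlock(&dev_lock);
	return result;
}
```

Calling simulate_release_from_callback(1) yields ETIMEDOUT where a plain blocking lock would wait forever, while simulate_release_from_callback(0) yields 0 because nothing else holds the lock. Whether the kernel hang follows this exact pattern is not established by the trace above.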

Also, ath9k_htc_probe_device() has wait_for_target logic that depends on
speaking to the firmware, and it seems to tear everything down if that
fails. So my immediate thought is that we could just get rid of the
device_release_driver() call in the firmware callback entirely, and
rely on that timeout to tear things down. However, I am not well-versed
enough in the USB probe and device setup logic to know whether there is
any reason that wouldn't be enough. Anyone with a better grip on these
things care to chime in? :)

-Toke