[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <6569c83a-11b0-7f13-4b4c-c0318780895c@samsung.com>
Date: Fri, 22 Jan 2021 13:11:07 +0100
From: Marek Szyprowski <m.szyprowski@...sung.com>
To: Johannes Berg <johannes@...solutions.net>,
linux-wireless@...r.kernel.org
Cc: netdev@...r.kernel.org, Oliver Neukum <oneukum@...e.com>,
Johannes Berg <johannes.berg@...el.com>
Subject: Re: [PATCH v2] cfg80211: avoid holding the RTNL when calling the
driver
Hi Johannes,
On 19.01.2021 10:21, Johannes Berg wrote:
> From: Johannes Berg <johannes.berg@...el.com>
>
> Currently, _everything_ in cfg80211 holds the RTNL, and if you
> have a slow USB device (or a few) you can get some bad lock
> contention on that.
>
> Fix that by re-adding a mutex to each wiphy/rdev as we had at
> some point, so we have locking for the wireless_dev lists and
> all the other things in there, and also so that drivers still
> don't have to worry too much about it (they still won't get
> parallel calls for a single device).
>
> Then, we can restrict the RTNL to a few cases where we add or
> remove interfaces and really need the added protection. Some
> of the global list management still also uses the RTNL, since
> we need to have it anyway for netdev management, but we only
> hold the RTNL for very short periods of time here.
>
> Signed-off-by: Johannes Berg <johannes.berg@...el.com>
This patch landed in today's (20210122) linux-next as commit
791daf8fc49a ("cfg80211: avoid holding the RTNL when calling the
driver"). Sadly, it causes deadlock with mwifiex driver. I think that
lockdep report describes it enough:
Bluetooth: vendor=0x2df, device=0x912e, class=255, fn=2
cfg80211: Loading compiled-in X.509 certificates for regulatory database
cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
Bluetooth: FW download over, size 800344 bytes
btmrvl_sdio mmc2:0001:2: sdio device tree data not available
mwifiex_sdio mmc2:0001:1: WLAN is not the winner! Skip FW dnld
mwifiex_sdio mmc2:0001:1: WLAN FW is active
mwifiex_sdio mmc2:0001:1: CMD_RESP: cmd 0x242 error, result=0x2
mwifiex_sdio mmc2:0001:1: mwifiex_process_cmdresp: cmd 0x242 failed
during initialization
============================================
WARNING: possible recursive locking detected
5.11.0-rc4-00535-g791daf8fc49a #2336 Not tainted
--------------------------------------------
kworker/2:3/108 is trying to acquire lock:
c4f62b38 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: _mwifiex_fw_dpc+0x2c0/0x49c
[mwifiex]
but task is already holding lock:
c4f62b38 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: _mwifiex_fw_dpc+0x248/0x49c
[mwifiex]
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&rdev->wiphy.mtx);
lock(&rdev->wiphy.mtx);
*** DEADLOCK ***
May be due to missing lock nesting notation
4 locks held by kworker/2:3/108:
#0: c1c066a8 ((wq_completion)events){+.+.}-{0:0}, at:
process_one_work+0x24c/0x888
#1: deccbf10 ((work_completion)(&fw_work->work)){+.+.}-{0:0}, at:
process_one_work+0x24c/0x888
#2: c13202dc (rtnl_mutex){+.+.}-{3:3}, at: _mwifiex_fw_dpc+0x23c/0x49c
[mwifiex]
#3: c4f62b38 (&rdev->wiphy.mtx){+.+.}-{3:3}, at:
_mwifiex_fw_dpc+0x248/0x49c [mwifiex]
stack backtrace:
CPU: 2 PID: 108 Comm: kworker/2:3 Not tainted
5.11.0-rc4-00535-g791daf8fc49a #2336
Hardware name: Samsung Exynos (Flattened Device Tree)
Workqueue: events request_firmware_work_func
[<c01116e8>] (unwind_backtrace) from [<c010cf58>] (show_stack+0x10/0x14)
[<c010cf58>] (show_stack) from [<c0b3ad3c>] (dump_stack+0xa4/0xc4)
[<c0b3ad3c>] (dump_stack) from [<c0195fd8>] (__lock_acquire+0xc20/0x31cc)
[<c0195fd8>] (__lock_acquire) from [<c019923c>] (lock_acquire+0x2e4/0x5dc)
[<c019923c>] (lock_acquire) from [<c0b4217c>] (__mutex_lock+0xa4/0xb60)
[<c0b4217c>] (__mutex_lock) from [<c0b42c54>] (mutex_lock_nested+0x1c/0x24)
[<c0b42c54>] (mutex_lock_nested) from [<bf1c87f8>]
(_mwifiex_fw_dpc+0x2c0/0x49c [mwifiex])
[<bf1c87f8>] (_mwifiex_fw_dpc [mwifiex]) from [<c06bfd18>]
(request_firmware_work_func+0x58/0x94)
[<c06bfd18>] (request_firmware_work_func) from [<c0149d48>]
(process_one_work+0x30c/0x888)
[<c0149d48>] (process_one_work) from [<c014a31c>] (worker_thread+0x58/0x594)
[<c014a31c>] (worker_thread) from [<c0151284>] (kthread+0x154/0x19c)
[<c0151284>] (kthread) from [<c010011c>] (ret_from_fork+0x14/0x38)
Exception stack(0xdeccbfb0 to 0xdeccbff8)
...
> ...
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
Powered by blists - more mailing lists