[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20250226082931.-XRIDa6D@linutronix.de>
Date: Wed, 26 Feb 2025 09:29:31 +0100
From: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To: Ralph Siemsen <ralph.siemsen@...aro.org>
Cc: linux-rt-devel@...ts.linux.dev, linux-usb@...r.kernel.org,
linux-kernel@...r.kernel.org, Pawel Laszczak <pawell@...ence.com>,
Frank Li <Frank.Li@....com>, Ferry Toth <ftoth@...londelft.nl>,
Thomas Gleixner <tglx@...utronix.de>,
Steven Rostedt <rostedt@...dmis.org>
Subject: Re: [RFC PATCH] usb: gadget: u_ether: prevent deadlock under RT
On 2025-02-19 13:15:52 [-0500], Ralph Siemsen wrote:
> [ 71.106003] task:irq/507-s-f4000 state:D stack:0 pid:125 tgid:125 ppid:2 flags:0x00000008
> [ 71.106018] Call trace:
> [ 71.106022] __switch_to+0xf4/0x158
> [ 71.106046] __schedule+0x2b4/0x920
> [ 71.106055] schedule_rtlock+0x24/0x50
> [ 71.106064] rtlock_slowlock_locked+0x348/0xcb8
> [ 71.106077] rt_spin_lock+0x88/0xb8
> [ 71.106086] eth_start_xmit+0x30/0x1490 [u_ether] /*****/
> [ 71.106112] ncm_tx_timeout+0x2c/0x50 [usb_f_ncm]
Explicit softirq timer as per HRTIMER_MODE_REL_SOFT
> [ 71.106131] __hrtimer_run_queues+0x180/0x378
> [ 71.106143] hrtimer_run_softirq+0x90/0x100
> [ 71.106151] handle_softirqs.isra.0+0x14c/0x360
> [ 71.106165] __local_bh_enable_ip+0x104/0x118
> [ 71.106177] __netdev_alloc_skb+0x1e0/0x210
> [ 71.106192] ncm_unwrap_ntb+0x1ec/0x528 [usb_f_ncm]
> [ 71.106206] rx_complete+0x120/0x288 [u_ether] /*****/
Network stack
> [ 71.106221] usb_gadget_giveback_request+0x34/0xf8
> [ 71.106236] cdns3_gadget_giveback+0xe4/0x2d0 [cdns3]
Returns URB to usb-core.
> [ 71.106286] cdns3_transfer_completed+0x3b0/0x630 [cdns3]
> [ 71.106320] cdns3_device_thread_irq_handler+0x8b8/0xd18 [cdns3]
Threaded interrupt handler. Not forced-threaded but voluntary threaded
due to devm_request_threaded_irq() usage.
> [ 71.106353] irq_thread_fn+0x34/0xb8
> [ 71.106364] irq_thread+0x180/0x2f0
> [ 71.106374] kthread+0x104/0x118
> [ 71.106384] ret_from_fork+0x10/0x20
>
> The deadlock occurs because eth_start_xmit() and rx_complete() both
> acquire the same spinlock in the same instance of struct eth_dev.
> The nested call occurs because rx_complete() calls __netdev_alloc_skb()
> which performs a brief local_bh_disable/enable() sequence.
…
Based on the backtrace the problem is within the cdns3. The driver
acquires at the beginning of its threaded routine
spin_lock_irqsave(&priv_dev->lock, flags);
and then before returning the URB it does
spin_unlock(&priv_dev->lock);
usb_gadget_giveback_request()
so the lock is dropped but the interrupts are still disabled. This makes
me wonder why using threaded interrupts at all since interrupts are
disabled for the whole routine but then others do the same.
If you look at dwc3_thread_interrupt() they have local_bh_disable()/
enable() before acquiring the lock and this is what I would suggest
doing here, too. The NCM is probably not the only one affected but
everything doing network that may since it may recourse into softirq.
Sebastian
Powered by blists - more mailing lists