[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <BYAPR07MB53810A64B7DAB7CD20491377DDAFA@BYAPR07MB5381.namprd07.prod.outlook.com>
Date: Thu, 9 Nov 2023 14:48:08 +0000
From: Pawel Laszczak <pawell@...ence.com>
To: Peter Chen <peter.chen@...nel.org>
CC: "gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
"linux-usb@...r.kernel.org" <linux-usb@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"stable@...r.kernel.org" <stable@...r.kernel.org>
Subject: RE: [PATCH] usb: cdnsp: Fix deadlock issue during using NCM gadget
>
>
>On 23-11-08 10:31:25, Pawel Laszczak wrote:
>> The interrupt service routine registered for the gadget is a primary
>> handler which mask the interrupt source and a threaded handler which
>> handles the source of the interrupt. Since the threaded handler is
>> voluntary threaded, the IRQ-core does not disable bottom halves before
>> invoke the handler like it does for the forced-threaded handler.
>>
>> Due to changes in networking it became visible that a network gadget's
>> completions handler may schedule a softirq which remains unprocessed.
>> The gadget's completion handler is usually invoked either in hard-IRQ
>> or soft-IRQ context. In this context it is enough to just raise the
>> softirq because the softirq itself will be handled once that context is left.
>> In the case of the voluntary threaded handler, there is nothing that
>> will process pending softirqs. Which means it remain queued until
>> another random interrupt (on this CPU) fires and handles it on its
>> exit path or another thread locks and unlocks a lock with the bh suffix.
>> Worst case is that the CPU goes idle and the NOHZ complains about
>> unhandled softirqs.
>
>Would you have a diagram to explain how things happen, and when the
>network softirq is scheduled in this case?
I don't have any diagram, but similar fix has been also applied for dwc3 driver.
https://patchwork.kernel.org/project/netdevbpf/patch/Yg/YPejVQH3KkRVd@linutronix.de/
In patch content you can find link to some discussion about this issue.
The similar fix I'm going introduce for cdns3 and cdns2 drivers in near features.
Pawel
>
>Peter
>>
>> Disable bottom halves before acquiring the lock (and disabling
>> interrupts) and enable them after dropping the lock. This ensures that
>> any pending softirqs will handled right away.
>>
>> cc: <stable@...r.kernel.org>
>> Fixes: 3d82904559f4 ("usb: cdnsp: cdns3 Add main part of Cadence
>> USBSSP DRD Driver")
>> Signed-off-by: Pawel Laszczak <pawell@...ence.com>
>> ---
>> drivers/usb/cdns3/cdnsp-ring.c | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/usb/cdns3/cdnsp-ring.c
>> b/drivers/usb/cdns3/cdnsp-ring.c index 07f6068342d4..275a6a2fa671
>> 100644
>> --- a/drivers/usb/cdns3/cdnsp-ring.c
>> +++ b/drivers/usb/cdns3/cdnsp-ring.c
>> @@ -1529,6 +1529,7 @@ irqreturn_t cdnsp_thread_irq_handler(int irq,
>void *data)
>> unsigned long flags;
>> int counter = 0;
>>
>> + local_bh_disable();
>> spin_lock_irqsave(&pdev->lock, flags);
>>
>> if (pdev->cdnsp_state & (CDNSP_STATE_HALTED |
>CDNSP_STATE_DYING)) {
>> @@ -1541,6 +1542,7 @@ irqreturn_t cdnsp_thread_irq_handler(int irq,
>void *data)
>> cdnsp_died(pdev);
>>
>> spin_unlock_irqrestore(&pdev->lock, flags);
>> + local_bh_enable();
>> return IRQ_HANDLED;
>> }
>>
>> @@ -1557,6 +1559,7 @@ irqreturn_t cdnsp_thread_irq_handler(int irq,
>void *data)
>> cdnsp_update_erst_dequeue(pdev, event_ring_deq, 1);
>>
>> spin_unlock_irqrestore(&pdev->lock, flags);
>> + local_bh_enable();
>>
>> return IRQ_HANDLED;
>> }
>> --
>> 2.37.2
>>
>
>--
>
>Thanks,
>Peter Chen
Powered by blists - more mailing lists