lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20250201000926.veukjr6wmhrgl6me@synopsys.com>
Date: Sat, 1 Feb 2025 00:09:36 +0000
From: Thinh Nguyen <Thinh.Nguyen@...opsys.com>
To: Badhri Jagan Sridharan <badhri@...gle.com>
CC: Thinh Nguyen <Thinh.Nguyen@...opsys.com>,
        "gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
        "felipe.balbi@...ux.intel.com" <felipe.balbi@...ux.intel.com>,
        "linux-usb@...r.kernel.org" <linux-usb@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "jameswei@...gle.com" <jameswei@...gle.com>,
        "stable@...nel.org" <stable@...nel.org>
Subject: Re: [PATCH v1] usb: dwc3: gadget: Prevent irq storm when TH
 re-executes

On Fri, Jan 31, 2025, Badhri Jagan Sridharan wrote:
> On Wed, Jan 29, 2025 at 5:42 PM Thinh Nguyen <Thinh.Nguyen@...opsys.com> wrote:
> >
> > On Wed, Jan 29, 2025, Badhri Jagan Sridharan wrote:
> > > On Mon, Jan 27, 2025 at 6:44 PM Thinh Nguyen <Thinh.Nguyen@...opsys.com> wrote:
> > > >
> > > > On Fri, Jan 24, 2025, Badhri Jagan Sridharan wrote:
> > > > > While commit d325a1de49d6 ("usb: dwc3: gadget: Prevent losing events
> > > > > in event cache") makes sure that top half(TH) does not end up overwriting
> > > > > the cached events before processing them when the TH gets invoked more
> > > > > than one time, returning IRQ_HANDLED results in occasional irq storm
> > > > > where the TH hogs the CPU. The irq storm can be prevented if
> > > > > IRQ_WAKE_THREAD is returned.
> > > > >
> > > > > ftrace event stub during dwc3 irq storm:
> > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000866: irq_handler_exit: irq=14 ret=handled
> > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000872: irq_handler_entry: irq=504 name=dwc3
> > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000874: irq_handler_exit: irq=504 ret=handled
> > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000881: irq_handler_entry: irq=504 name=dwc3
> > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000883: irq_handler_exit: irq=504 ret=handled
> > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000889: irq_handler_entry: irq=504 name=dwc3
> > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000892: irq_handler_exit: irq=504 ret=handled
> > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000898: irq_handler_entry: irq=504 name=dwc3
> > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000901: irq_handler_exit: irq=504 ret=handled
> > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000907: irq_handler_entry: irq=504 name=dwc3
> > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000909: irq_handler_exit: irq=504 ret=handled
> > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000915: irq_handler_entry: irq=504 name=dwc3
> > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000918: irq_handler_exit: irq=504 ret=handled
> > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000924: irq_handler_entry: irq=504 name=dwc3
> > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000927: irq_handler_exit: irq=504 ret=handled
> > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000933: irq_handler_entry: irq=504 name=dwc3
> > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000935: irq_handler_exit: irq=504 ret=handled
> > > > >     ....
> > > > >
> > > > > Cc: stable@...nel.org
> > > > > Fixes: d325a1de49d6 ("usb: dwc3: gadget: Prevent losing events in event cache")
> > > >
> > > > I don't think this should be Cc to stable, at least not the way it is
> > > > right now.
> > > >
> > > > > Signed-off-by: Badhri Jagan Sridharan <badhri@...gle.com>
> > > > > ---
> > > > >  drivers/usb/dwc3/gadget.c | 2 +-
> > > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> > > > > index d27af65eb08a..376ab75adc4e 100644
> > > > > --- a/drivers/usb/dwc3/gadget.c
> > > > > +++ b/drivers/usb/dwc3/gadget.c
> > > > > @@ -4519,7 +4519,7 @@ static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
> > > > >        * losing events.
> > > > >        */
> > > > >       if (evt->flags & DWC3_EVENT_PENDING)
> > > > > -             return IRQ_HANDLED;
> > > > > +             return IRQ_WAKE_THREAD;
> > > >
> > > > This looks like a workaround to the issue we have. Have you tried to
> > > > enable imod instead? It's the feature to help avoid these kind of issue.
> > >
> > > Hi Thinh,
> > >
> > > Thanks for the feedback ! Yes, we have been experimenting with the
> > > interrupt moderation interval as well.
> > > Have follow up questions though, please bear with me !
> > >
> > > 1. Given that when DWC3_EVENT_PENDING  is still set,
> > > dwc3_check_event_buf() is still waiting for the previous cached events
> > > to be processed by the dwc3_thread_interrupt(), what's the reasoning
> > > behind returning IRQ_HANDLED here ? Shouldn't we be returning
> > > IRQ_WAKE_THREAD anyways ?
> >
> > Currently dwc3 is implemented such that it will not process new events
> > until the BH is done with its work. The DWC3_EVENT_PENDING flag
> > indicates when the events are processed. With this expectation, we
> > should not schedule the BH as the events are still being handled.
> 
> 
> Hi Thinh,
> 
> Thanks for sharing your thoughts !
> Given that the intention of the design is to keep top half
> (dwc3_check_event_buf()) and bottom half (dwc3_process_event_buf())
> mutually exclusive, Is there a reason why the threaded interrupt
> should not be marked with IRQF_ONESHOT ? Marking it IRQF_ONESHOT makes
> the threaded irq framework to ensure mutual exclusivity for us. I
> was validating this and this seems to be pretty effective. Curious to
> know your thoughts !


We shouldn't do that. There are designs with phy driver or connector
driver or other devices that share the same interrupt line.

> 
> >
> >
> > In your case, there's a small window where the TH may be scheduled but
> > the DWC3_EVENT_PENDING flag is not cleared yet. Returning
> > IRQ_WAKE_THREAD may workaround your issue because the BH may already be
> > running when DWC3_EVENT_PENDING is set. I'm not sure what other side
> > effect this may have since we're breaking this expectation.
> >
> > We may enhance the dwc3 handling of event flow in the future to improve
> > this. But at the moment, we should not return IRQ_WAKE_THREAD here.
> >
> > >
> > > 2. When interrupt moderation is enabled, does DEVICE_IMODC start to
> > > decrement as soon as the interrupt is masked (where I expect that the
> > > interrupt line gets de-asserted by the controller) in
> > > dwc3_check_event_buf()  ?
> > >
> > > /* Mask interrupt */
> > > dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0),
> > >    DWC3_GEVNTSIZ_INTMASK | DWC3_GEVNTSIZ_SIZE(evt->length));
> > >
> >
> > The DEVICE_IMODC is loaded with DEVICE_IMODI and starts to decrement as
> > soon as the interrupt is de-asserted from the asserted state, which
> > includes when the interrupt is masked. You brought up a good question
> > here. The IMOD count may already be 0 when we exit the BH. Can you try
> > this experiment to update the count and let me know if it helps:
> 
> 
> Gave this a try, unfortunately this does not seem to help ! I see what
> you are trying to do though. You are trying to explicitly re-arm the
> timer. I was checking the register description as well and it does not seem to
> guarantee that directly writing to DEVICE_IMODC restarts the counter
> again.
> 

Hmm... Can you try this instead:

diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 0fe92c0fb520..c1b5a3742ab4 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -5737,14 +5737,20 @@ static irqreturn_t dwc3_process_event_buf(struct dwc3_event_buffer *evt)
        dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0),
                    DWC3_GEVNTSIZ_SIZE(evt->length));
 
+       /*
+        * Keep the clearing of DWC3_EVENT_PENDING after the interrupt unmask
+        * but before the clearing of DWC3_GEVNTCOUNT_EHB.
+        */
+       evt->flags &= ~DWC3_EVENT_PENDING;
+
+       /* Ensure the flag is updated before clearing DWC3_GEVNTCOUNT_EHB */
+       wmb();
+
        if (dwc->imod_interval) {
                dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), DWC3_GEVNTCOUNT_EHB);
                dwc3_writel(dwc->regs, DWC3_DEV_IMOD(0), dwc->imod_interval);
        }
 
-       /* Keep the clearing of DWC3_EVENT_PENDING at the end */
-       evt->flags &= ~DWC3_EVENT_PENDING;
-
        return ret;
 }


This should solve the issue for controllers with IMOD enabled.

Thanks,
Thinh

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ