Message-ID: <20240911002408.gr4fv5vkst7ukxd5@synopsys.com>
Date: Wed, 11 Sep 2024 00:24:10 +0000
From: Thinh Nguyen <Thinh.Nguyen@...opsys.com>
To: Selvarasu Ganesan <selvarasu.g@...sung.com>
CC: Thinh Nguyen <Thinh.Nguyen@...opsys.com>,
        "gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
        "linux-usb@...r.kernel.org" <linux-usb@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "jh0801.jung@...sung.com" <jh0801.jung@...sung.com>,
        "dh10.jung@...sung.com" <dh10.jung@...sung.com>,
        "naushad@...sung.com" <naushad@...sung.com>,
        "akash.m5@...sung.com" <akash.m5@...sung.com>,
        "rc93.raju@...sung.com" <rc93.raju@...sung.com>,
        "taehyun.cho@...sung.com" <taehyun.cho@...sung.com>,
        "hongpooh.kim@...sung.com" <hongpooh.kim@...sung.com>,
        "eomji.oh@...sung.com" <eomji.oh@...sung.com>,
        "shijie.cai@...sung.com" <shijie.cai@...sung.com>
Subject: Re: [PATCH] usb: dwc3: Potential fix of possible dwc3 interrupt storm

On Tue, Sep 10, 2024, Selvarasu Ganesan wrote:
> 
> On 9/7/2024 6:09 AM, Thinh Nguyen wrote:
> > On Sat, Sep 07, 2024, Selvarasu Ganesan wrote:
> >> Hi Thinh,
> >>
> >> I ran the code you recommended on our testing environment and was able
> >> to reproduce the issue one time.
> >>
> >> When evt->flags contains DWC3_EVENT_PENDING, I've included the following
> >> debugging information: I added this debug message at the start of
> >> dwc3_event_buffers_cleanup and dwc3_event_buffers_setup functions in
> >> during suspend and resume.
> >>
> >> The results were quite interesting . I'm curious to understand how
> >> evt->flags is set to DWC3_EVENT_PENDING, and along with DWC3_GEVNTSIZ is
> >> equal to 0x1000 during the suspend.
> > That is indeed strange.
> >
> >> Its means that the previous bottom-half handler prior to suspend might
> >> still be executing in the middle of the process.
> >>
> >> Could you please give your suggestions here? And let me know if anything
> >> want to test or additional details are required.
> >>
> >>
> >> ##DBG: dwc3_event_buffers_cleanup:
> >>    evt->length    :0x1000
> >>    evt->lpos      :0x20c
> >>    evt->count     :0x0
> >>    evt->flags     :0x1 // This is Unexpected if DWC3_GEVNTSIZ(0)(0xc408):
> >> 0x00001000. Its means that previous bottom-half handler may be still
> >> running in middle
> > Perhaps.
> >
> > But I doubt that's the case since it shouldn't take that long for the
> > bottom-half to be completed before the next resume yet the flag is still
> > set.
> >
> >>    DWC3_GEVNTSIZ(0)(0xc408)       : 0x00001000
> >>    DWC3_GEVNTCOUNT(0)(0xc40c)     : 0x00000000
> >>    DWC3_DCFG(0xc700)              : 0x00e008a8
> >>    DWC3_DCTL(0xc704)              : 0x0cf00a00
> >>    DWC3_DEVTEN(0xc708)            : 0x00000000
> >>    DWC3_DSTS(0xc70c)              : 0x00d20cd1
> >>
> > The controller status is halted. So there's no problem with
> > soft-disconnect. For the interrupt mask in GEVNTSIZ to be cleared,
> > that likely means that the bottom-half had probably completed.
> 
> Agree, But I am worrying on If the bottom-half is completed, then 
> DWC3_EVENT_PENDING must be cleared in evt->flags.
> Is there any possibility of a CPU reordering issue when updating 
> evt->flags in the bottom-half handler?.
> Should I try with wmb() when writing to evt->flags?

Assuming that the problem occurs after the bottom-half has completed,
there should be an implicit memory barrier. The memory operations should
complete before the release in spin_unlock() completes, so I don't think
wmb() will help.
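
To illustrate the ordering argument, here is a minimal userspace sketch,
not driver code. It assumes the flag is cleared while a lock is held, and
it uses a pthread mutex as a stand-in for the kernel spinlock. The names
fake_evt, EVENT_PENDING, bottom_half() and lpos_seen_after_flag_clear()
are hypothetical. The point is that a reader that takes the same lock and
sees the flag cleared must also see every store made before the unlock.

```c
#include <pthread.h>

#define EVENT_PENDING 0x1u	/* stand-in for DWC3_EVENT_PENDING */

/* Hypothetical model of the event buffer state; not the real struct. */
struct fake_evt {
	pthread_mutex_t lock;	/* stand-in for the driver spinlock */
	unsigned int flags;
	unsigned int lpos;
};

/* Models the bottom-half: update state, then release the lock. */
static void *bottom_half(void *arg)
{
	struct fake_evt *evt = arg;

	pthread_mutex_lock(&evt->lock);
	evt->lpos = 0x20c;
	evt->flags &= ~EVENT_PENDING;
	/* Unlock has release semantics: both stores above are ordered before it. */
	pthread_mutex_unlock(&evt->lock);
	return NULL;
}

/*
 * Polls under the lock until the flag is seen cleared, and returns the
 * lpos value read in that same critical section.
 */
static unsigned int lpos_seen_after_flag_clear(void)
{
	struct fake_evt evt = { .flags = EVENT_PENDING, .lpos = 0 };
	pthread_t worker;
	unsigned int flags, lpos;

	pthread_mutex_init(&evt.lock, NULL);
	if (pthread_create(&worker, NULL, bottom_half, &evt) != 0) {
		pthread_mutex_destroy(&evt.lock);
		return ~0u;	/* could not start the worker */
	}

	do {
		/* Lock has acquire semantics and pairs with the unlock above. */
		pthread_mutex_lock(&evt.lock);
		flags = evt.flags;
		lpos = evt.lpos;
		pthread_mutex_unlock(&evt.lock);
	} while (flags & EVENT_PENDING);

	pthread_join(worker, NULL);
	pthread_mutex_destroy(&evt.lock);
	return lpos;
}
```

In this model an extra write barrier before the unlock changes nothing,
because the release already orders the stores. A flag that stays set
would point to the clear never having run or to the memory itself, not
to store reordering.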

> >
> >> ##DBG: dwc3_event_buffers_setup:
> >>    evt->length    :0x1000
> >>    evt->lpos      :0x20c
> > The fact that evt->lpos did not get updated tells me that there's
> > something wrong with memory access to your platform during suspend and
> > resume.
> 
> Are you expecting the evt->lpos value to be zero here? If so, this is 
> expected in our test setup because we avoid writing zero to evt->lpos as 
> part of dwc3_event_buffers_cleanup if evt->flags have a value of 1. This 

Oh ok. I did not know you made this modification.

> is simply to track the status of evt->lpos during suspend to resume when 
> evt->flags have a value of DWC3_EVENT_PENDING. The following test codes 
> for the reference.
> 
> --- a/drivers/usb/dwc3/core.c
> +++ b/drivers/usb/dwc3/core.c
> @@ -505,8 +505,20 @@ static int dwc3_alloc_event_buffers(struct dwc3 
> *dwc, unsigned int length)
>   int dwc3_event_buffers_setup(struct dwc3 *dwc)
>   {
>          struct dwc3_event_buffer        *evt;
> +       u32                             reg;
> 
>          evt = dwc->ev_buf;
> +
> +       if (evt->flags & DWC3_EVENT_PENDING) {
> +               pr_info("evt->length :%x\n", evt->length);
> +               pr_info("evt->lpos :%x\n", evt->lpos);
> +               pr_info("evt->count :%x\n", evt->count);
> +               pr_info("evt->flags :%x\n", evt->flags);
> +
> +               dwc3_exynos_reg_dump(dwc);
> +
> +       }
> +
>          evt->lpos = 0;
>          dwc3_writel(dwc->regs, DWC3_GEVNTADRLO(0),
>                          lower_32_bits(evt->dma));
> @@ -514,8 +526,10 @@ int dwc3_event_buffers_setup(struct dwc3 *dwc)
>                          upper_32_bits(evt->dma));
>          dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0),
>                          DWC3_GEVNTSIZ_SIZE(evt->length));
> -       dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), 0);
> 
> +       /* Clear any stale event */
> +       reg = dwc3_readl(dwc->regs, DWC3_GEVNTCOUNT(0));
> +       dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), reg);
>          return 0;
>   }
> 
> @@ -525,7 +539,16 @@ void dwc3_event_buffers_cleanup(struct dwc3 *dwc)
> 
>          evt = dwc->ev_buf;
> 
> -       evt->lpos = 0;
> +       if (evt->flags & DWC3_EVENT_PENDING) {
> +               pr_info("evt->length :%x\n", evt->length);
> +               pr_info("evt->lpos :%x\n", evt->lpos);
> +               pr_info("evt->count :%x\n", evt->count);
> +               pr_info("evt->flags :%x\n", evt->flags);
> +
> +               dwc3_exynos_reg_dump(dwc);
> +       } else {
> +               evt->lpos = 0;

I wasn't aware of this change.

> +       }
> 
> >
> >>    evt->count     :0x0
> >>    evt->flags     :0x1 // Still It's not clearing in during resume.
> >>
> >>    DWC3_GEVNTSIZ(0)(0xc408)       : 0x00000000
> >>    DWC3_GEVNTCOUNT(0)(0xc40c)     : 0x00000000
> >>    DWC3_DCFG(0xc700)              : 0x00080800
> >>    DWC3_DCTL(0xc704)              : 0x00f00000
> >>    DWC3_DEVTEN(0xc708)            : 0x00000000
> >>    DWC3_DSTS(0xc70c)              : 0x00d20001
> >>
> > Please help look into your platform to see what condition triggers this
> > memory access issue. If this is a hardware quirk, we can properly update
> > the change and note it to be so.
> 
> Sure I will try to figure it out. However, we are facing challenges in 
> reproducing the issue. There could be a delay in understanding the 
> conditions that trigger the memory issue if it is related to a memory issue.
> 
> >
> > Thanks,
> > Thinh
> >
> > (If possible, for future tests, please dump the dwc3 tracepoints. Many
> > thanks for the tests.)
> 
> I tried to get dwc3 traces in the failure case, but so far no instances 
> have been reported. Our testing is still in progress with enable dwc3 
> traces.
> 
> I will keep posting once I get the dwc3 traces in the failure scenario.
> 

Thanks for the update. I hope enabling the driver tracepoints will not
impact the reproduction of the issue. With the driver log, we'll get
more clues as to what was going on.
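
For reading the register dumps quoted above, a small standalone decoder
may help. This is a sketch under assumptions: the bit positions are
meant to mirror DWC3_GEVNTSIZ_INTMASK, DWC3_GEVNTSIZ_SIZE(),
DWC3_DCTL_RUN_STOP and DWC3_DSTS_DEVCTRLHLT from drivers/usb/dwc3/core.h,
so please check them against your tree. The shortened macro names,
struct dump_summary and summarize() are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match the corresponding DWC3_* definitions in core.h. */
#define GEVNTSIZ_INTMASK	(1u << 31)	/* event interrupt masked */
#define GEVNTSIZ_SIZE(n)	((n) & 0xffffu)	/* event buffer size in bytes */
#define DCTL_RUN_STOP		(1u << 31)	/* controller run/stop */
#define DSTS_DEVCTRLHLT		(1u << 22)	/* device controller halted */

/* Hypothetical summary of the fields discussed in this thread. */
struct dump_summary {
	bool irq_masked;
	uint32_t buf_size;
	bool running;
	bool halted;
};

static struct dump_summary summarize(uint32_t gevntsiz, uint32_t dctl,
				     uint32_t dsts)
{
	struct dump_summary s = {
		.irq_masked = (gevntsiz & GEVNTSIZ_INTMASK) != 0,
		.buf_size = GEVNTSIZ_SIZE(gevntsiz),
		.running = (dctl & DCTL_RUN_STOP) != 0,
		.halted = (dsts & DSTS_DEVCTRLHLT) != 0,
	};

	return s;
}
```

Fed the suspend-time values (GEVNTSIZ 0x00001000, DCTL 0x0cf00a00, DSTS
0x00d20cd1), this reports the interrupt unmasked, a 0x1000-byte buffer,
run/stop cleared and the controller halted, which matches the reading
given earlier in the thread.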

Thanks,
Thinh
