Date:   Sat, 2 Feb 2019 00:29:38 +0000
From:   Thinh Nguyen <thinh.nguyen@...opsys.com>
To:     John Stultz <john.stultz@...aro.org>,
        Felipe Balbi <balbi@...nel.org>,
        Zeng Tao <prime.zeng@...ilicon.com>,
        Jack Pham <jackp@...eaurora.org>,
        Thinh Nguyen <thinh.nguyen@...opsys.com>,
        Chen Yu <chenyu56@...wei.com>
CC:     lkml <linux-kernel@...r.kernel.org>,
        Linux USB List <linux-usb@...r.kernel.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: Re: Frequent dwc3 crashes on suspend or reboot since 5.0-rc1

Hi John,

John Stultz wrote:
> Hey all,
>   Since the 5.0 merge window opened, I've been tripping on frequent
> dwc3 crashes on reboot and suspend; I've added an example at the
> bottom of this mail.
>
> I've dug in a little bit and sort of have a sense of what's going on.
>
> In ffs_epfile_io():
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/usb/gadget/function/f_fs.c#n1065
>
> The completion done is set up on the stack:
>   DECLARE_COMPLETION_ONSTACK(done);
>
> Then later we set up a request and queue it:
>   req->context  = &done;
>   ...
>   ret = usb_ep_queue(ep->ep, req, GFP_ATOMIC);
>
> Then wait for it:
>   if (unlikely(wait_for_completion_interruptible(&done))) {
>     /*
>     * To avoid race condition with ffs_epfile_io_complete,
>     * dequeue the request first then check
>     * status. usb_ep_dequeue API should guarantee no race
>     * condition with req->complete callback.
>     */
>     usb_ep_dequeue(ep->ep, req);
>     interrupted = ep->status < 0;
>   }
>
> The problem is that we end up being interrupted, supposedly dequeue
> the request, and exit.
>
> But then (or in parallel) the irq triggers and we try calling
> complete() on the context pointer which points to now random stack
> space, which results in the panic.
>
> It seems like something is wrong with usb_ep_dequeue not really
> stopping the irq from happening?
>
> If I revert all the changes to dwc3 back to 4.20, I don't see the issue.
>
> I'll do some bisection to try to narrow things down, but I wanted to
> see if this was a known issue or if anyone had immediate ideas as to
> what might be wrong.
>
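
For reference, the ordering that the comment in ffs_epfile_io() relies on
can be sketched as a small single-threaded userspace model. This is only
an illustration under stated assumptions: every model_* name is
hypothetical, nothing here is kernel API, and it does not reproduce the
real concurrency in dwc3. It only shows the contract: once dequeue has
returned, a late interrupt must not touch the on-stack completion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All names below are hypothetical stand-ins, not kernel APIs. */
struct model_completion {
	int done;
};

struct model_request {
	struct model_completion *context; /* points into the waiter's stack */
	bool queued;                      /* true while the "controller" owns it */
	int status;
};

/* Stand-in for the req->complete callback (ffs_epfile_io_complete in the mail). */
static void model_complete(struct model_request *req)
{
	assert(req->context != NULL);
	req->context->done = 1;
}

/* Stand-in for queueing: hand the request to the "controller". */
static void model_queue(struct model_request *req, struct model_completion *c)
{
	req->context = c;
	req->queued = true;
	req->status = 0;
}

/*
 * Stand-in for a dequeue that keeps the promise: give the request back
 * while the waiter's stack frame is still alive, then drop ownership.
 */
static void model_dequeue(struct model_request *req)
{
	if (!req->queued)
		return;
	req->queued = false;
	req->status = -1;      /* placeholder for a cancellation status */
	model_complete(req);
	req->context = NULL;   /* nothing may use the stack pointer later */
}

/* Stand-in for the late interrupt; returns true if it ran the callback. */
static bool model_irq(struct model_request *req)
{
	if (!req->queued)
		return false;
	req->queued = false;
	model_complete(req);
	return true;
}

/* Stand-in for the interrupted wait path in ffs_epfile_io(). */
static int model_interrupted_io(struct model_request *req)
{
	struct model_completion done = { 0 };

	model_queue(req, &done);
	model_dequeue(req);    /* wait was interrupted, so cancel */
	return req->status;
}
```

In this model, calling model_irq() after model_interrupted_io() has
returned is a no-op because the request is no longer queued. The crash
described above corresponds to the case where that ownership check does
not hold and the callback still runs against the dead stack frame.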

I'm not sure if this is related, but can you test with Felipe's
testing/next branch? It has a fix for a race condition that occurs when
the gadget driver tries to dequeue requests.

See if you run into this issue again.

Thanks,
Thinh
