Message-ID: <1df6e9be-2233-a0b2-1ddc-76de9d62a397@metafoo.de>
Date:   Tue, 10 Mar 2020 14:45:14 +0100
From:   Lars-Peter Clausen <lars@...afoo.de>
To:     "Ardelean, Alexandru" <alexandru.Ardelean@...log.com>,
        "linux-usb@...r.kernel.org" <linux-usb@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "balbi@...nel.org" <balbi@...nel.org>
Cc:     "gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
        "bigeasy@...utronix.de" <bigeasy@...utronix.de>,
        "m.olbrich@...gutronix.de" <m.olbrich@...gutronix.de>
Subject: Re: [PATCH][RESEND] usb: dwc3: gadget: Handle dequeuing of non queued
 URB gracefully

On 3/10/20 2:22 PM, Ardelean, Alexandru wrote:
> On Thu, 2020-01-30 at 14:02 +0200, Felipe Balbi wrote:
>> [External]
>>
>>
>> Hi,
>>
>> Alexandru Ardelean <alexandru.ardelean@...log.com> writes:
>>
>>> From: Lars-Peter Clausen <lars@...afoo.de>
>>>
>>> Trying to dequeue an URB that is currently not queued should be a no-op
>>> and be handled gracefully.
>>>
>>> Use the list field of the URB to indicate whether it is queued or not by
>>> setting it to the empty list when it is not queued.
>>>
>>> Handling this gracefully allows for race condition free synchronization
>>> between the complete callback being called due to a completed transfer and
>>> trying to call usb_ep_dequeue() at the same time.
>> We need a little more information here. Can you further explain what
>> happens and how you caught this?
> Apologies for the delay [of this reply].
> It's been a while since this patch was created, and it was on a 4.14 kernel.
> Lars was trying to fix various crashes with USB DWC3 OTG + some Xilinx patches.
> I did not track the status of the OTG stuff upstream. I think it's a lot of
> patches in the Xilinx tree.
>
> The context has changed from 4.14 [obviously], and there were many things that
> could have influenced things.
> I've been trying to RFC some of these patches now.
> [ yeah I know: maybe I should have [probably] also added an RFC tag :) ]
> Some of the patches [including this one] seemed to make sense, even outside of
> the context of the crashes that were happening on 4.14.
> Atm, we're at 4.19 and we don't see issues, but we still have this patch.
> We may drop it and see what happens.
> ¯\_(ツ)_/¯
>
> But in any case, it does require a bit more re-investigation.
> Apologies for the noise that this patch created :)

The race condition is between a gadget calling usb_ep_dequeue() and the 
driver completing the URB.

Let's say a thread holds a reference to an in-flight URB and wants to
abort the request, e.g. because the application that sent the request
has been closed. Concurrently, the hardware completes the URB, and the
interrupt fires and marks the URB as complete. Your thread is suspended
while the interrupt is running. Once the interrupt has finished, the
thread wakes up and still has the reference to the URB, but the URB has
now been completed. The thread still calls usb_ep_dequeue() though, and
then undefined behavior occurs.
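
To make the interleaving concrete, here is a minimal userspace sketch of
the idea from the patch description: the request's list field is reset
to the empty list whenever the request is not queued, so a late dequeue
can detect that and do nothing. This is not the dwc3 code. The names
(demo_request, demo_queue, demo_complete, demo_dequeue) are made up for
illustration, the list helpers only imitate the kernel's list_head, and
returning 0 for the not-queued case is an assumption. Locking is left
out; the sketch assumes both paths would run under the same lock in a
real driver.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modeled on the kernel's list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* A head (or node) that points at itself is empty (or not queued). */
static int list_is_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_append(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink and reinitialize, so the node reads as "not queued" afterwards. */
static void list_del_reinit(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

/* Hypothetical request type, NOT the real struct dwc3_request. */
struct demo_request {
	struct list_head list;
	int status;
};

static void demo_request_init(struct demo_request *req)
{
	list_init(&req->list);	/* empty list == not queued */
	req->status = 0;
}

static void demo_queue(struct demo_request *req, struct list_head *pending)
{
	req->status = -EINPROGRESS;
	list_append(&req->list, pending);
}

/* Completion path; in a real driver this runs from the interrupt. */
static void demo_complete(struct demo_request *req)
{
	list_del_reinit(&req->list);
	req->status = 0;
}

/* Dequeue path; a request that is not queued is left untouched. */
static int demo_dequeue(struct demo_request *req)
{
	if (list_is_empty(&req->list))
		return 0;	/* assumption: report the no-op as success */

	list_del_reinit(&req->list);
	req->status = -ECONNRESET;
	return 0;
}

/* Replays the sequence above: queue, complete, then a late dequeue. */
static int simulate_late_dequeue(void)
{
	struct list_head pending;
	struct demo_request req;

	list_init(&pending);
	demo_request_init(&req);

	demo_queue(&req, &pending);
	demo_complete(&req);		/* "interrupt" wins the race */

	if (demo_dequeue(&req) != 0)	/* thread resumes, dequeues anyway */
		return -1;

	assert(list_is_empty(&pending));
	return req.status;		/* completion status is preserved */
}
```

Calling simulate_late_dequeue() from a small main() replays the
sequence. The point of the design is that the late dequeue never
touches the pending list again and does not overwrite the status that
the completion path already set.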

We observed the issue when using FunctionFS to create a userspace
gadget and calling aio_cancel() to abort a pending URB. But really any
gadget that aborts a transfer before it has completed or before its
timeout has expired can run into this issue.

- Lars
