lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87zgu5v8om.fsf@kernel.org>
Date:   Thu, 29 Jul 2021 11:09:57 +0300
From:   Felipe Balbi <balbi@...nel.org>
To:     Wesley Cheng <wcheng@...eaurora.org>,
        Alan Stern <stern@...land.harvard.edu>
Cc:     gregkh@...uxfoundation.org, linux-usb@...r.kernel.org,
        linux-kernel@...r.kernel.org, jackp@...eaurora.org
Subject: Re: [PATCH] usb: dwc3: gadget: Use list_replace_init() before
 traversing lists


Hi,

Wesley Cheng <wcheng@...eaurora.org> writes:

> The list_for_each_entry_safe() macro saves the current item (n) and
> the item after (n+1), so that n can be safely removed without
> corrupting the list.  However, when traversing the list and removing
> items using gadget giveback, the DWC3 lock is briefly released,
> allowing other routines to execute.  There is a situation where, while
> items are being removed from the cancelled_list using
> dwc3_gadget_ep_cleanup_cancelled_requests(), the pullup disable
> routine is running in parallel (due to UDC unbind).  As the cleanup
> routine removes n, and the pullup disable removes n+1, once the
> cleanup retakes the DWC3 lock, it references a request who was already
> removed/handled.  With list debug enabled, this leads to a panic.
> Ensure all instances of the macro are replaced where gadget giveback
> is used.
>
> Example call stack:
>
> Thread#1:
> __dwc3_gadget_ep_set_halt() - CLEAR HALT
>   -> dwc3_gadget_ep_cleanup_cancelled_requests()
>     ->list_for_each_entry_safe()
>     ->dwc3_gadget_giveback(n)
>       ->dwc3_gadget_del_and_unmap_request()- n deleted[cancelled_list]
>       ->spin_unlock
>       ->Thread#2 executes
>       ...
>     ->dwc3_gadget_giveback(n+1)
>       ->Already removed!
>
> Thread#2:
> dwc3_gadget_pullup()
>   ->waiting for dwc3 spin_lock
>   ...
>   ->Thread#1 released lock
>   ->dwc3_stop_active_transfers()
>     ->dwc3_remove_requests()
>       ->fetches n+1 item from cancelled_list (n removed by Thread#1)
>       ->dwc3_gadget_giveback()
>         ->dwc3_gadget_del_and_unmap_request()- n+1
> deleted[cancelled_list]
>         ->spin_unlock
>
> Fix this condition by utilizing list_replace_init(), and traversing
> through a local copy of the current elements in the endpoint lists.
> This will also set the parent list as empty, so if another thread is
> also looping through the list, it will be empty on the next iteration.
>
> Fixes: d4f1afe5e896 ("usb: dwc3: gadget: move requests to cancelled_list")
> Signed-off-by: Wesley Cheng <wcheng@...eaurora.org>
>
> ---
> Previous patchset:
> https://lore.kernel.org/linux-usb/1620716636-12422-1-git-send-email-wcheng@codeaurora.org/
> ---
>  drivers/usb/dwc3/gadget.c | 18 ++++++++++++++++--
>  1 file changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index a29a4ca..3ce6ed9 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -1926,9 +1926,13 @@ static void dwc3_gadget_ep_cleanup_cancelled_requests(struct dwc3_ep *dep)
>  {
>  	struct dwc3_request		*req;
>  	struct dwc3_request		*tmp;
> +	struct list_head		local;
>  	struct dwc3			*dwc = dep->dwc;
>  
> -	list_for_each_entry_safe(req, tmp, &dep->cancelled_list, list) {
> +restart:
> +	list_replace_init(&dep->cancelled_list, &local);

hmm, if the lock is held and IRQs disabled when this runs, then no other
threads will be able to append requests to the list which makes the
"restart" label unnecessary, no?

I wonder if we should release the lock and reenable interrupts after
replacing the head. The problem is that
dwc3_gadget_ep_cleanup_cancelled_requests() can run from the IRQ
handler.

Alan, could you provide your insight here? Do you think we should defer
this to a low priority tasklet or something along those lines?

> +	list_for_each_entry_safe(req, tmp, &local, list) {
>  		dwc3_gadget_ep_skip_trbs(dep, req);
>  		switch (req->status) {
>  		case DWC3_REQUEST_STATUS_DISCONNECTED:


-- 
balbi

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ