Message-ID: <CANcMJZCEVxVLyFgLwK98hqBEdc0_n4P0x_K6Gih8zNH3ouzbJQ@mail.gmail.com>
Date:   Mon, 9 Aug 2021 14:04:13 -0700
From:   John Stultz <john.stultz@...aro.org>
To:     Wesley Cheng <wcheng@...eaurora.org>
Cc:     balbi@...nel.org, Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        linux-usb@...r.kernel.org,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        jackp@...eaurora.org, Amit Pundir <amit.pundir@...aro.org>,
        YongQin Liu <yongqin.liu@...aro.org>,
        Todd Kjos <tkjos@...gle.com>
Subject: Re: [PATCH] usb: dwc3: gadget: Use list_replace_init() before
 traversing lists

On Thu, Jul 29, 2021 at 12:34 AM Wesley Cheng <wcheng@...eaurora.org> wrote:
>
> The list_for_each_entry_safe() macro saves the current item (n) and
> the item after (n+1), so that n can be safely removed without
> corrupting the list.  However, when traversing the list and removing
> items using gadget giveback, the DWC3 lock is briefly released,
> allowing other routines to execute.  There is a situation where, while
> items are being removed from the cancelled_list using
> dwc3_gadget_ep_cleanup_cancelled_requests(), the pullup disable
> routine is running in parallel (due to UDC unbind).  As the cleanup
> routine removes n, and the pullup disable removes n+1, once the
> cleanup retakes the DWC3 lock, it references a request that was already
> removed/handled.  With list debug enabled, this leads to a panic.
> Ensure all instances of the macro are replaced where gadget giveback
> is used.
>
> Example call stack:
>
> Thread#1:
> __dwc3_gadget_ep_set_halt() - CLEAR HALT
>   -> dwc3_gadget_ep_cleanup_cancelled_requests()
>     ->list_for_each_entry_safe()
>     ->dwc3_gadget_giveback(n)
>       ->dwc3_gadget_del_and_unmap_request()- n deleted[cancelled_list]
>       ->spin_unlock
>       ->Thread#2 executes
>       ...
>     ->dwc3_gadget_giveback(n+1)
>       ->Already removed!
>
> Thread#2:
> dwc3_gadget_pullup()
>   ->waiting for dwc3 spin_lock
>   ...
>   ->Thread#1 released lock
>   ->dwc3_stop_active_transfers()
>     ->dwc3_remove_requests()
>       ->fetches n+1 item from cancelled_list (n removed by Thread#1)
>       ->dwc3_gadget_giveback()
>         ->dwc3_gadget_del_and_unmap_request()- n+1 deleted[cancelled_list]
>         ->spin_unlock
>
> Fix this condition by utilizing list_replace_init(), and traversing
> through a local copy of the current elements in the endpoint lists.
> This will also set the parent list as empty, so if another thread is
> also looping through the list, it will be empty on the next iteration.
>
> Fixes: d4f1afe5e896 ("usb: dwc3: gadget: move requests to cancelled_list")
> Signed-off-by: Wesley Cheng <wcheng@...eaurora.org>
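
For anyone following the thread, the pattern described in the patch
text above can be sketched in user space roughly as follows. This is
not the dwc3 code: the list helpers are simplified re-implementations
of the kernel ones, and struct request and drain_cancelled() are
hypothetical names used only for illustration. The point is that after
list_replace_init() the shared list is empty, so anything else that
walks it while the lock is dropped cannot touch the entries being
drained from the local copy.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified user-space stand-ins for the kernel list helpers. */
struct list_head { struct list_head *next, *prev; };

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void INIT_LIST_HEAD(struct list_head *h) { h->next = h; h->prev = h; }
static int list_empty(const struct list_head *h) { return h->next == h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

static void list_del_init(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	INIT_LIST_HEAD(e);
}

/* 'repl' takes over every entry of 'old'; 'old' is left empty. */
static void list_replace_init(struct list_head *old, struct list_head *repl)
{
	repl->next = old->next;
	repl->next->prev = repl;
	repl->prev = old->prev;
	repl->prev->next = repl;
	INIT_LIST_HEAD(old);
}

/* Hypothetical request type, standing in for a driver request. */
struct request {
	int id;
	struct list_head list;
};

/*
 * Hypothetical drain routine: move all entries to a local list first,
 * then hand them back one at a time. Returns the number of requests
 * processed and stores up to 'max' ids in 'out', in list order.
 */
static int drain_cancelled(struct list_head *cancelled, int *out, int max)
{
	struct list_head local;
	int n = 0;

	assert(out != NULL);
	list_replace_init(cancelled, &local);	/* 'cancelled' is now empty */

	while (!list_empty(&local)) {
		struct request *req =
			container_of(local.next, struct request, list);

		list_del_init(&req->list);
		if (n < max)
			out[n] = req->id;
		n++;
		/*
		 * A real driver would drop its lock around the giveback
		 * here; a concurrent walker of 'cancelled' finds it empty.
		 */
	}
	return n;
}
```

Calling drain_cancelled() on a list holding requests 1, 2 and 3 should
process all three in order and leave the shared list empty, so a second
call processes nothing.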

Hey Wesley,
  Just as a heads up, since this patch just landed upstream, I've
bisected it down as causing a regression on the db845c/RB3 board.

After booting with mainline, I'm seeing attempts to connect via adb fail with:
  error: device offline

Running "adb devices" provides:
  List of devices attached
  c4e1189c        offline

After reverting this patch, I can properly connect via adb again, and
"adb devices" shows the expected output:
  List of devices attached
  c4e1189c        device


I've not been able to isolate what might be going on, as there are no
obvious errors in dmesg. Any suggestions on how to debug this further?

thanks
-john
