[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAOuDEK0o2jqqOUZVUdi9JDcyXRQHEuL9GCBrU0VQhWAfDtJnUg@mail.gmail.com>
Date: Mon, 2 Feb 2026 18:03:00 +0800
From: Guan-Yu Lin <guanyulin@...gle.com>
To: Greg KH <gregkh@...uxfoundation.org>
Cc: mathias.nyman@...el.com, stern@...land.harvard.edu,
wesley.cheng@....qualcomm.com, linux-usb@...r.kernel.org,
linux-kernel@...r.kernel.org, stable@...r.kernel.org
Subject: Re: [RFC PATCH] usb: host: xhci-sideband: fix deadlock in unregister path
On Sat, Jan 31, 2026 at 8:15 PM Greg KH <gregkh@...uxfoundation.org> wrote:
>
> On Fri, Jan 30, 2026 at 07:47:46AM +0000, Guan-Yu Lin wrote:
> > When a USB device is disconnected or a driver is unbound, the USB core
> > invokes the driver's disconnect callback while holding the udev device
> > lock. If the driver calls xhci_sideband_unregister(), it eventually
> > reaches usb_offload_put(), which attempts to acquire the same udev
> > lock, resulting in a self-deadlock.
> >
> > Introduce lockless variants __usb_offload_get() and __usb_offload_put()
> > to allow modifying the offload usage count when the device lock is
> > already held. These helpers use device_lock_assert() to ensure callers
> > meet the locking requirements.
>
> Ugh. Didn't I warn about this when the original functions were added?
>
> Adding functions with __ is a mess, please make these names, if you
> _REALLY_ need them, obvious that this is a no lock function.
>
> And now that you added the lockless functions, are there any in-kernel
> users of the locked versions? At a quick glance I didn't see them, did
> I miss it somewhere?
>
> thanks,
>
> greg k-h
Hi Greg,
You are right; xhci-sideband.c is the only in-kernel user of the
locked versions. I will rename the __ functions to usb_offload_* and
remove the locked variants in the next version to clean up the API.
Regarding the deadlock fix itself, we have analyzed two potential
solutions. The core issue is that xhci_sideband_unregister() (and
xhci_sideband_remove_interrupter()) needs to decrement the offload
usage count (which requires the USB device lock), but it is called
from the disconnect path where that lock is already held.
Option A: Fix the Callers of xhci_sideband functions
In this approach, we keep the usb_offload calls inside the
xhci_sideband functions but replace the internal usb_lock_device()
with device_lock_assert(). We then update callers in
qc_audio_offload.c to explicitly acquire the lock.
This ensures the offload count remains tightly coupled with the
interrupter's lifecycle, though it effectively changes the API
contract: calling xHCI sideband functions now requires holding the USB
device lock.
Option B: Decouple usb_offload functions from xhci_sideband functions
In this approach, we strip the usb_offload calls out of xhci_sideband
functions entirely. The client driver (qc_audio_offload) becomes
responsible for the full transaction: acquiring the lock, managing
usb_offload_get/put(), and creating/removing the interrupter. This
restores clean encapsulation (xHCI functions only handle hardware),
but it places the burden on the client driver to correctly balance the
usb_offload calls.
I lean towards Option A to ensure the offload count implies an active
interrupter by design, but please let me know if you prefer the
cleaner separation of Option B.
Regards,
Guan-Yu
Powered by blists - more mailing lists