[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <bb7e34b7-c06b-4153-ba6c-009b9f1b34d0@rowland.harvard.edu>
Date: Tue, 19 Aug 2025 11:44:52 -0400
From: Alan Stern <stern@...land.harvard.edu>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: Yunseong Kim <ysk@...lloc.com>, linux-usb@...r.kernel.org,
gregkh@...uxfoundation.org,
Andrey Konovalov <andreyknvl@...gle.com>,
Shuah Khan <skhan@...uxfoundation.org>,
Thomas Gleixner <tglx@...utronix.de>,
Clark Williams <clrkwllms@...nel.org>,
Steven Rostedt <rostedt@...dmis.org>,
linux-rt-users@...r.kernel.org, linux-kernel@...r.kernel.org,
syzkaller@...glegroups.com
Subject: Re: [BUG] usbip: vhci: Sleeping function called from invalid context
in vhci_urb_enqueue on PREEMPT_RT
On Tue, Aug 19, 2025 at 04:57:00PM +0200, Sebastian Andrzej Siewior wrote:
> > There are several places in the USB stack that disable local interrupts.
>
> But *why*? You need locking due to SMP. So it should be simply to avoid
> irqrestore()/ irqsave() during unlock/lock or to avoid deadlocks if a
> callback is invoked from IRQ and process context and the callback
> handler does simply spin_lock() (without the _irq suffix).
> The latter shouldn't be problem due to commit
> ed194d1367698 ("usb: core: remove local_irq_save() around ->complete() handler")
>
> So if completing the URB tasklet/ softirq context works for ehci/ xhci
> without creating any warning, it should also work for vhci, dummy_hcd.
dummy-hcd is different from the others; its use of local_irq_save() is
in the gadget giveback path, not the host giveback path. We would need
another audit similar to the one you did for ed194d136769, but this
time checking gadget completion handlers.
> Only RH code completes directly, everything else is shifted to softirq
> context (for ehci/HCD_BH).
Correct (except that RH code always uses softirq context; it never
completes directly -- the kerneldoc is wrong and Greg just accepted a
patch to change it).
There are other places that disable local interrupts. ehci-brcm.c does
it in order to meet a timing constraint. ohci-omap.c and ohci-s3c2410.c
do it for reasons I'm not aware of (no comment about it in the source).
gadget/legacy/inode.c does it in ep_aio_cancel() -- I can only guess
that this is somehow related to aio and not to anything in USB.
> | git grep -E 'local_irq_save|local_irq_disable' drivers/usb/ | wc -l
> | 21
> of which 10 are in pxa udc. The only one I am a bit concerned about is
> the one in usb_hcd_pci_remove() and I think we had reports and patches
> but somehow nothing did happen and I obviously forgot.
>
> > I would expect that RT already defines functions which do this, but I
> > don't know their names.
>
> We don't have anything where
> local_irq_disable()
> spin_lock()
>
> can be mapped to something equivalent other than
> spin_lock_irq()
>
> I was running around and kept changing code so that we don't end up in
> this scenario where we need to disable interrupts for some reason but on
> RT we don't.
>
> The closest thing we have is local_lock_irq() which maps to
> local_irq_disable() on !PREEMPT_RT systems. But I would prefer to avoid
> it because it serves a different purpose.
> What works is something like
> spin_lock_irqsave();
> spin_unlock();
> $SOMETHING
> spin_lock();
> spin_unlock_irqestore().
That's just silly. We should have something like this:
#ifdef CONFIG_PREEMPT_RT
static inline void local_irqsave_nonrt(unsigned long flags) {}
static inline void local_irqrestore_nonrt(unsigned long flags) {}
static inline void local_irq_disable_nonrt() {}
static inline void local_irq_enable_nonrt() {}
#else
#define local_irqsave_nonrt local_irqsave
#define local_irqrestore_nonrt local_irqrestore
#define local_irq_disable_nonrt local_irq_disable
#define local_irq_enable_nonrt local_irq_enable
#endif
> The question is why should $SOMETHING be invoked with disabled
> interrupts if the function was called from process context.
More to the point, out of all the possible reasons why $SOMETHING might
be invoked with disabled interrupts, which of these reasons remain valid
in RT builds and which don't?
> If your concern is a missing _irqsave() in the callback then this
> shouldn't be an issue. If it is the wrong context from kcov's point of
> view then making the driver complete in tasklet should fix it since it
> would match what ehci/ xhci does.
No, I'm not concerned about that.
Alan Stern
Powered by blists - more mailing lists