[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250820162621.UiUQUcp1@linutronix.de>
Date: Wed, 20 Aug 2025 18:26:21 +0200
From: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To: Alan Stern <stern@...land.harvard.edu>
Cc: Yunseong Kim <ysk@...lloc.com>, linux-usb@...r.kernel.org,
gregkh@...uxfoundation.org,
Andrey Konovalov <andreyknvl@...gle.com>,
Shuah Khan <skhan@...uxfoundation.org>,
Thomas Gleixner <tglx@...utronix.de>,
Clark Williams <clrkwllms@...nel.org>,
Steven Rostedt <rostedt@...dmis.org>,
linux-rt-users@...r.kernel.org, linux-kernel@...r.kernel.org,
syzkaller@...glegroups.com
Subject: Re: [BUG] usbip: vhci: Sleeping function called from invalid context
in vhci_urb_enqueue on PREEMPT_RT
On 2025-08-19 11:44:52 [-0400], Alan Stern wrote:
> dummy-hcd is different from the others; its use of local_irq_save() is
> in the gadget giveback path, not the host giveback path. We would need
> another audit similar to the one you did for ed194d136769, but this
> time checking gadget completion handlers.
ach right.
> > Only RH code completes directly, everything else is shifted to softirq
> > context (for ehci/HCD_BH).
>
> Correct (except that RH code always uses softirq context; it never
> completes directly -- the kerneldoc is wrong and Greg just accepted a
> patch to change it).
Ach okay. I assumed it because the completion handler skips it. But that
might be a shortcut because it already is in softirq context.
> There are other places that disable local interrupts. ehci-brcm.c does
> it in order to meet a timing constraint. ohci-omap.c and ohci-s3c2410.c
> do it for reasons I'm not aware of (no comment about it in the source).
> gadget/legacy/inode.c does it in ep_aio_cancel() -- I can only guess
> that this is somehow related to aio and not to anything in USB.
the inode.c looks interesting.c. It got there in 75787d943ab37 ("[PATCH]
USB: gadgetfs AIO support") which is 2004. So it might assume !SMP in
terms of locking. Anyway…
> > The closest thing we have is local_lock_irq() which maps to
> > local_irq_disable() on !PREEMPT_RT systems. But I would prefer to avoid
> > it because it serves a different purpose.
> > What works is something like
> > spin_lock_irqsave();
> > spin_unlock();
> > $SOMETHING
> > spin_lock();
> > spin_unlock_irqestore().
>
> That's just silly. We should have something like this:
>
> #ifdef CONFIG_PREEMPT_RT
> static inline void local_irqsave_nonrt(unsigned long flags) {}
> static inline void local_irqrestore_nonrt(unsigned long flags) {}
> static inline void local_irq_disable_nonrt() {}
> static inline void local_irq_enable_nonrt() {}
> #else
> #define local_irqsave_nonrt local_irqsave
> #define local_irqrestore_nonrt local_irqrestore
> #define local_irq_disable_nonrt local_irq_disable
> #define local_irq_enable_nonrt local_irq_enable
> #endif
We managed over the years to get rid of each one of this instances/
requirements. The RT tree used to have
|#ifdef CONFIG_PREEMPT_RT_FULL
|# define local_irq_disable_nort() do { } while (0)
|# define local_irq_disable_rt() local_irq_disable()
|#else
|# define local_irq_disable_nort() local_irq_disable()
|# define local_irq_disable_rt() do { } while (0)
|#endif
which was removed as of v4.16.7-rt1.
The problem is usually that nobody knows why exactly interrupts are
disabled an what purpose it serves. Often the reasons is no longer there
but the code still does it.
> > The question is why should $SOMETHING be invoked with disabled
> > interrupts if the function was called from process context.
>
> More to the point, out of all the possible reasons why $SOMETHING might
> be invoked with disabled interrupts, which of these reasons remain valid
> in RT builds and which don't?
None (in most cases) because interrupt handler are threaded. So
interrupts are never truly disabled.
Adding the macros as you suggest would gain probably three users:
- inode
- dummy_hcd
- vhci-hcd
Instead I would:
- vhci I would suggest to remove the disabling and move its completion
to BH.
- dummy_hcd I would suggest to either do the thing you called silly or
audit the gadgets and remove it.
- inode I would suggest to keep it as-is and audit it properly later
once someone intends to use it. It would also give the opportunity to
clean up the commented locking statement.
> Alan Stern
Sebastian
Powered by blists - more mailing lists