linux-kernel - Re: [BUG] usbip: vhci: Sleeping function called from invalid context in vhci_urb_enqueue on PREEMPT

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20250820162621.UiUQUcp1@linutronix.de>
Date: Wed, 20 Aug 2025 18:26:21 +0200
From: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To: Alan Stern <stern@...land.harvard.edu>
Cc: Yunseong Kim <ysk@...lloc.com>, linux-usb@...r.kernel.org,
	gregkh@...uxfoundation.org,
	Andrey Konovalov <andreyknvl@...gle.com>,
	Shuah Khan <skhan@...uxfoundation.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Clark Williams <clrkwllms@...nel.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	linux-rt-users@...r.kernel.org, linux-kernel@...r.kernel.org,
	syzkaller@...glegroups.com
Subject: Re: [BUG] usbip: vhci: Sleeping function called from invalid context
 in vhci_urb_enqueue on PREEMPT_RT

On 2025-08-19 11:44:52 [-0400], Alan Stern wrote:
> dummy-hcd is different from the others; its use of local_irq_save() is 
> in the gadget giveback path, not the host giveback path.  We would need 
> another audit similar to the one you did for ed194d136769, but this 
> time checking gadget completion handlers.

ach right. 

> > Only RH code completes directly, everything else is shifted to softirq
> > context (for ehci/HCD_BH).
> 
> Correct (except that RH code always uses softirq context; it never 
> completes directly -- the kerneldoc is wrong and Greg just accepted a 
> patch to change it).

Ach okay. I assumed it because the completion handler skips it. But that
might be a shortcut because it already is in softirq context.

> There are other places that disable local interrupts.  ehci-brcm.c does 
> it in order to meet a timing constraint.  ohci-omap.c and ohci-s3c2410.c 
> do it for reasons I'm not aware of (no comment about it in the source).  
> gadget/legacy/inode.c does it in ep_aio_cancel() -- I can only guess 
> that this is somehow related to aio and not to anything in USB.

the inode.c looks interesting.c. It got there in 75787d943ab37 ("[PATCH]
USB: gadgetfs AIO support") which is 2004. So it might assume !SMP in
terms of locking. Anyway…

> > The closest thing we have is local_lock_irq() which maps to
> > local_irq_disable() on !PREEMPT_RT systems. But I would prefer to avoid
> > it because it serves a different purpose.
> > What works is something like
> > 	spin_lock_irqsave();
> > 	spin_unlock();
> > 	$SOMETHING
> > 	spin_lock();
> > 	spin_unlock_irqestore().
> 
> That's just silly.  We should have something like this:
> 
> #ifdef CONFIG_PREEMPT_RT
> static inline void local_irqsave_nonrt(unsigned long flags) {}
> static inline void local_irqrestore_nonrt(unsigned long flags) {}
> static inline void local_irq_disable_nonrt() {}
> static inline void local_irq_enable_nonrt() {}
> #else
> #define local_irqsave_nonrt	local_irqsave
> #define local_irqrestore_nonrt	local_irqrestore
> #define local_irq_disable_nonrt	local_irq_disable
> #define local_irq_enable_nonrt	local_irq_enable
> #endif

We managed over the years to get rid of each one of this instances/
requirements. The RT tree used to have
|#ifdef CONFIG_PREEMPT_RT_FULL
|# define local_irq_disable_nort()     do { } while (0)
|# define local_irq_disable_rt()               local_irq_disable()
|#else
|# define local_irq_disable_nort()     local_irq_disable()
|# define local_irq_disable_rt()               do { } while (0)
|#endif

which was removed as of v4.16.7-rt1.
The problem is usually that nobody knows why exactly interrupts are
disabled an what purpose it serves. Often the reasons is no longer there
but the code still does it.

> > The question is why should $SOMETHING be invoked with disabled
> > interrupts if the function was called from process context.
> 
> More to the point, out of all the possible reasons why $SOMETHING might 
> be invoked with disabled interrupts, which of these reasons remain valid 
> in RT builds and which don't?

None (in most cases) because interrupt handler are threaded. So
interrupts are never truly disabled.
Adding the macros as you suggest would gain probably three users:
- inode
- dummy_hcd
- vhci-hcd

Instead I would:
- vhci I would suggest to remove the disabling and move its completion
   to BH.
- dummy_hcd I would suggest to either do the thing you called silly or
  audit the gadgets and remove it.
- inode I would suggest to keep it as-is and audit it properly later
  once someone intends to use it. It would also give the opportunity to
  clean up the commented locking statement.

> Alan Stern

Sebastian