Message-ID: <c6c17f0d-b71d-4a44-bcef-2b65e4d634f7@kzalloc.com>
Date: Sat, 16 Aug 2025 10:29:34 +0900
From: Yunseong Kim <ysk@...lloc.com>
To: linux-usb@...r.kernel.org, gregkh@...uxfoundation.org,
 stern@...land.harvard.edu
Cc: Andrey Konovalov <andreyknvl@...gle.com>,
 Shuah Khan <skhan@...uxfoundation.org>, Thomas Gleixner
 <tglx@...utronix.de>, Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
 Clark Williams <clrkwllms@...nel.org>, Steven Rostedt <rostedt@...dmis.org>,
 linux-rt-users@...r.kernel.org, linux-kernel@...r.kernel.org,
 syzkaller@...glegroups.com
Subject: [BUG] usbip: vhci: Sleeping function called from invalid context in
 vhci_urb_enqueue on PREEMPT_RT

While testing a PREEMPT_RT-enabled kernel (based on v6.17.0-rc1),
I encountered a "BUG: sleeping function called from invalid context"
error originating from the USB/IP VHCI driver.

On PREEMPT_RT configurations, standard spin_lock() calls are replaced by
rt_spin_lock(). Since rt_spin_lock() may sleep when contended, it must not
be called from an atomic context (e.g., with interrupts disabled).
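
To make the rule above concrete, here is a minimal user-space sketch of the
check. It is a model only, not kernel code: every model_* name is
hypothetical, and the real check is the __might_resched() call reached from
__rt_spin_lock() in the trace further down.

```c
/* User-space model only: these are NOT the kernel's implementations. */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static bool model_irqs_disabled;   /* stands in for irqs_disabled() */
static int  model_bug_count;       /* number of "invalid context" reports */

static void model_local_irq_disable(void) { model_irqs_disabled = true; }
static void model_local_irq_enable(void)  { model_irqs_disabled = false; }

/* Mimics the might-sleep check performed on entry to rt_spin_lock(). */
static void model_rt_spin_lock(void)
{
	if (model_irqs_disabled) {
		model_bug_count++;
		fprintf(stderr,
			"BUG: sleeping function called from invalid context\n");
	}
	/* A real rt_spin_lock() would acquire an rtmutex-based lock here. */
}

/* Returns how many reports one lock attempt produces in the given IRQ state. */
static int model_lock_with_irqs(bool disabled)
{
	int before = model_bug_count;

	if (disabled)
		model_local_irq_disable();
	model_rt_spin_lock();
	if (disabled)
		model_local_irq_enable();

	return model_bug_count - before;
}
```

In this model, model_lock_with_irqs(false) produces no report, while
model_lock_with_irqs(true) produces exactly one, regardless of whether the
lock would actually have been contended.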

The issue occurs within the vhci_urb_enqueue() function. This function
explicitly disables local interrupts using local_irq_disable() immediately
before calling usb_hcd_giveback_urb(), adhering to HCD requirements.

This error was reported after the following commit:
Link: https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git/commit/?h=usb-linus&id=9528d32873b38281ae105f2f5799e79ae9d086c2

  kworker (hub_event)
      |
      v
  vhci_urb_enqueue() [drivers/usb/usbip/vhci_hcd.c]
      |
      |---> spin_unlock_irqrestore(&vhci->lock, flags);
      |     (Context: IRQs Enabled, Process Context)
      |---> local_irq_disable();
      |
      |     *** STATE CHANGE: IRQs Disabled (Atomic Context) ***
      |
      +-----> usb_hcd_giveback_urb() [drivers/usb/core/hcd.c]
              |
              v
              __usb_hcd_giveback_urb()
              |
              v
              mon_complete() [drivers/usb/mon/mon_main.c]
              |
              |---> spin_lock()  <--- Attempts to acquire lock
                    |
                    | [On PREEMPT_RT]
                    v
                    rt_spin_lock() [kernel/locking/spinlock_rt.c]
                    |
                    v
                    [May Sleep if contended]
                    |
      X <----------- BUG: Sleeping in atomic context (IRQs are disabled!)

      |
      |---> local_irq_enable();
            (Context: IRQs Enabled)

Stack trace excerpt:

 BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
 in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 11063, name: kworker/0:5
 preempt_count: 0, expected: 0
 RCU nest depth: 0, expected: 0
 CPU: 0 UID: 0 PID: 11063 Comm: kworker/0:5 Not tainted 6.17.0-rc1-00001-g1149a5db27c8-dirty #55 PREEMPT_RT 
 Hardware name: QEMU KVM Virtual Machine, BIOS 2025.02-8ubuntu1 06/11/2025
 Workqueue: usb_hub_wq hub_event
 Call trace:
  show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:499 (C)
  __dump_stack+0x30/0x40 lib/dump_stack.c:94
  dump_stack_lvl+0x148/0x1d8 lib/dump_stack.c:120
  dump_stack+0x1c/0x3c lib/dump_stack.c:129
  __might_resched+0x2e4/0x52c kernel/sched/core.c:8957
  __rt_spin_lock kernel/locking/spinlock_rt.c:48 [inline]
  rt_spin_lock+0xa8/0x1bc kernel/locking/spinlock_rt.c:57
  spin_lock include/linux/spinlock_rt.h:44 [inline]
  mon_bus_complete drivers/usb/mon/mon_main.c:134 [inline]
  mon_complete+0x5c/0x1fc drivers/usb/mon/mon_main.c:147
  usbmon_urb_complete include/linux/usb/hcd.h:738 [inline]
  __usb_hcd_giveback_urb+0x1e4/0x59c drivers/usb/core/hcd.c:1647
  usb_hcd_giveback_urb+0x100/0x364 drivers/usb/core/hcd.c:1745
  vhci_urb_enqueue+0x86c/0xc08 drivers/usb/usbip/vhci_hcd.c:818
  usb_hcd_submit_urb+0x2ec/0x1790 drivers/usb/core/hcd.c:1546
  usb_submit_urb+0xd3c/0x13ec drivers/usb/core/urb.c:581
  usb_start_wait_urb+0xf0/0x3c8 drivers/usb/core/message.c:59
  usb_internal_control_msg drivers/usb/core/message.c:103 [inline]
  usb_control_msg+0x1d0/0x350 drivers/usb/core/message.c:154
  hub_set_address drivers/usb/core/hub.c:4769 [inline]
  hub_port_init+0xbac/0x2094 drivers/usb/core/hub.c:5074
  hub_port_connect drivers/usb/core/hub.c:5495 [inline]
  hub_port_connect_change drivers/usb/core/hub.c:5706 [inline]
  port_event drivers/usb/core/hub.c:5870 [inline]
  hub_event+0x1de4/0x3c44 drivers/usb/core/hub.c:5952
  process_one_work kernel/workqueue.c:3236 [inline]
  process_scheduled_works+0x68c/0x1118 kernel/workqueue.c:3319
  worker_thread+0x834/0xc1c kernel/workqueue.c:3400
  kthread+0x5f4/0x754 kernel/kthread.c:463
  ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:844

The BUG is triggered along the following code path:

 static int vhci_urb_enqueue(struct usb_hcd *hcd, struct urb *urb, gfp_t mem_flags)
 {
 
 	...
 
 no_need_unlink:
 	spin_unlock_irqrestore(&vhci->lock, flags);
 	if (!ret) {
 		/* usb_hcd_giveback_urb() should be called with
 		 * irqs disabled
 		 */
 		local_irq_disable(); // <--- Entering atomic context (IRQs disabled)
 		usb_hcd_giveback_urb(hcd, urb, urb->status);
 		local_irq_enable();
 	}
 	return ret;
 }

 static void mon_bus_complete(struct mon_bus *mbus, struct urb *urb, int status)
 {
 	...
 	spin_lock_irqsave(&mbus->lock, flags);
 	...
 }

When usbmon is enabled, usb_hcd_giveback_urb(), called here with interrupts
disabled, reaches mon_complete() in the USB monitoring code via
__usb_hcd_giveback_urb().

mon_complete() calls the inlined mon_bus_complete(), which acquires
mbus->lock with spin_lock_irqsave(). On PREEMPT_RT this maps to spin_lock()
and then rt_spin_lock(), which is what appears in the trace.

Because vhci_urb_enqueue() has already disabled interrupts, calling the
potentially sleeping rt_spin_lock() in this atomic context is invalid
and triggers the kernel BUG.
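
The call path described above can be sketched as a small user-space model.
This is an illustration under stated assumptions, not the driver code: the
model_* functions and the usbmon_active flag are hypothetical stand-ins, and
the only thing modelled is whether a sleeping-lock acquisition happens inside
the IRQ-disabled window.

```c
/* User-space model of the reported call path; not kernel code. */
#include <assert.h>
#include <stdbool.h>

static bool irqs_off;          /* models the local IRQ state */
static bool usbmon_active;     /* assumption: true when usbmon monitors the bus */
static int  invalid_ctx_hits;  /* lock attempts made while IRQs are off */

/* Models mon_bus_complete(): takes mbus->lock, a sleeping lock on PREEMPT_RT. */
static void model_mon_complete(void)
{
	if (irqs_off)
		invalid_ctx_hits++;
}

/* Models usb_hcd_giveback_urb() -> __usb_hcd_giveback_urb() -> usbmon hook. */
static void model_giveback_urb(void)
{
	if (usbmon_active)
		model_mon_complete();
}

/* Models the tail of vhci_urb_enqueue(); returns the number of hits seen. */
static int model_vhci_enqueue_tail(bool monitored)
{
	invalid_ctx_hits = 0;
	usbmon_active = monitored;

	irqs_off = true;        /* local_irq_disable() */
	model_giveback_urb();
	irqs_off = false;       /* local_irq_enable() */

	return invalid_ctx_hits;
}
```

In this model the invalid acquisition is counted only when the usbmon hook
runs, which matches the observation that the path goes through mon_complete().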

I request a review and correction of this locking mechanism to ensure
stability on PREEMPT_RT configurations.  Kernel config, full logs, and
reproduction steps can be provided on request.


Thanks,

Best Regards,
Yunseong Kim

