Date:	Fri, 30 Mar 2007 14:04:16 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Adrian Bunk <bunk@...sta.de>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel@...r.kernel.org, Greg Kroah-Hartman <gregkh@...e.de>
Subject: [bug] hung bootup in various drivers, was: "2.6.21-rc5: known regressions"


i just found a new category of driver regressions in 2.6.21, doing 
allyesconfig bzImage bootup tests: the init methods of various drivers 
hang in driver_unregister().

It is caused by this problem: the semantics of driver_unregister() [also 
implicitly called in pci_unregister_driver()] have apparently changed 
recently. If a driver does:

	pci_register_driver(&my_driver);
	...
	if (some_failure) {
		pci_unregister_driver(&my_driver);
		...
	}

it will hang the bootup in the following piece of code:

 drivers/base/driver.c:

  void driver_unregister(struct device_driver * drv)
  {
         bus_remove_driver(drv);
         wait_for_completion(&drv->unloaded);	/* init hangs here */
  }

the completion is never done - because nobody removes the bus while the 
init is still happening, obviously. (and bootup is serialized anyway)
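
to make the failure mode easier to see outside the kernel, here is a 
minimal userspace sketch in C. all names in it (model_driver, 
model_completion, model_put, ...) are hypothetical stand-ins, not kernel 
types, and it assumes that the completion is only signalled once the last 
reference to the driver goes away. it reports the would-block case instead 
of actually sleeping, much like the debug patch further below does:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
struct model_completion {
	unsigned int done;		/* nonzero once the completion has fired */
};

struct model_driver {
	unsigned int refs;		/* references still held on the driver */
	struct model_completion unloaded;
};

/* Registration takes the initial reference and arms the completion. */
static void model_register(struct model_driver *drv)
{
	drv->refs = 1;
	drv->unloaded.done = 0;
}

/* Assumption: the completion fires only when the last reference is dropped. */
static void model_put(struct model_driver *drv)
{
	assert(drv->refs > 0);
	if (--drv->refs == 0)
		drv->unloaded.done = 1;
}

/*
 * Non-blocking version of the unregister path: drop our own reference,
 * then report whether a real wait_for_completion() would have blocked.
 */
static bool model_unregister_would_block(struct model_driver *drv)
{
	model_put(drv);
	return drv->unloaded.done == 0;
}
```

if nothing else holds a reference, model_unregister_would_block() returns 
false. if some other holder still has one and cannot drop it until init 
has finished, it returns true, which corresponds to the hang above.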

now, the majority of drivers do the driver unregistration from their 
module-cleanup function, so they are not affected by this problem. But if 
you apply the debug patch attached further below, and do an allyesconfig 
bzImage bootup, there are 3 hits already:

 BUG: at drivers/base/driver.c:187 driver_unregister()
  [<c0105ff9>] show_trace_log_lvl+0x19/0x2e
  [<c01063e2>] show_trace+0x12/0x14
  [<c01063f8>] dump_stack+0x14/0x16
  [<c063f7e6>] driver_unregister+0x3d/0x43
  [<c0488048>] pci_unregister_driver+0x10/0x5f
  [<c1b5f7c7>] slgt_init+0x9b/0x1ca
  [<c1b31a2d>] init+0x15d/0x2bd
  [<c0105bc3>] kernel_thread_helper+0x7/0x10

 BUG: at drivers/base/driver.c:187 driver_unregister()
  [<c0105ff9>] show_trace_log_lvl+0x19/0x2e
  [<c01063e2>] show_trace+0x12/0x14
  [<c01063f8>] dump_stack+0x14/0x16
  [<c063f7e6>] driver_unregister+0x3d/0x43
  [<c0488048>] pci_unregister_driver+0x10/0x5f
  [<c0619505>] init_ipmi_si+0x70a/0x738
  [<c1b31a2d>] init+0x15d/0x2bd
  [<c0105bc3>] kernel_thread_helper+0x7/0x10

 BUG: at drivers/base/driver.c:187 driver_unregister()
  [<c0105ff9>] show_trace_log_lvl+0x19/0x2e
  [<c01063e2>] show_trace+0x12/0x14
  [<c01063f8>] dump_stack+0x14/0x16
  [<c063f7e6>] driver_unregister+0x3d/0x43
  [<c0488048>] pci_unregister_driver+0x10/0x5f
  [<c1b6d2d8>] tlan_probe+0x2dd/0x30e
  [<c1b31a2d>] init+0x15d/0x2bd
  [<c0105bc3>] kernel_thread_helper+0x7/0x10

possibly more could trigger. Each of these 3 places caused an actual 
bootup hang on my testbox, so these are real regressions and need to be 
fixed.

because there are a good number of drivers that call 
pci_unregister_driver() from their init function, and because i cannot 
see anything obviously wrong in doing an unregister call after a 
failure, i think it's driver_unregister() that needs to be fixed. Greg, 
what do you think?

	Ingo

Index: linux/drivers/base/driver.c
===================================================================
--- linux.orig/drivers/base/driver.c
+++ linux/drivers/base/driver.c
@@ -183,7 +183,8 @@ int driver_register(struct device_driver
 void driver_unregister(struct device_driver * drv)
 {
 	bus_remove_driver(drv);
-	wait_for_completion(&drv->unloaded);
+	if (!drv->unloaded.done)
+		WARN_ON(1);
 }
 
 /**