lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Fri, 30 Mar 2007 07:16:40 -0700
From:	Greg KH <gregkh@...e.de>
To:	Ingo Molnar <mingo@...e.hu>, Kay Sievers <kay.sievers@...y.org>
Cc:	Adrian Bunk <bunk@...sta.de>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel@...r.kernel.org
Subject: Re: [bug] hung bootup in various drivers, was: "2.6.21-rc5: known
	regressions"

On Fri, Mar 30, 2007 at 02:04:16PM +0200, Ingo Molnar wrote:
> 
> i just found a new category of driver regressions in 2.6.21, doing 
> allyesconfig bzImage bootup tests: the init methods of various drivers 
> hangs in driver_unregister().
> 
> It is caused by this problem: the semantics of driver_unregister() [also 
> implicitly called in pci_driver_unregister()] has apparently changed 
> recently. If a driver does:
> 
> 	pci_register_driver(&my_driver);
> 	...
> 	if (some_failure) {
> 		pci_unregister_driver(&my_driver);
> 		...
> 	}
> 
> it will hang the bootup in the following piece of code:
> 
>  drivers/base/driver.c:
> 
>   void driver_unregister(struct device_driver * drv)
>   {
>          bus_remove_driver(drv);
>          wait_for_completion(&drv->unloaded);
> 
> the completion is never done - because nobody removes the bus while the 
> init is still happening, obviously. (and bootup is serialized anyway)
> 
> now, the majority of drivers does the driver unregistry from its 
> module-cleanup function, so it's not affected by this problem. But if 
> you apply the debug patch attached further below, and do an allyesconfig 
> bzImage bootup, there's 3 hits already:
> 
>  BUG: at drivers/base/driver.c:187 driver_unregister()
>   [<c0105ff9>] show_trace_log_lvl+0x19/0x2e
>   [<c01063e2>] show_trace+0x12/0x14
>   [<c01063f8>] dump_stack+0x14/0x16
>   [<c063f7e6>] driver_unregister+0x3d/0x43
>   [<c0488048>] pci_unregister_driver+0x10/0x5f
>   [<c1b5f7c7>] slgt_init+0x9b/0x1ca
>   [<c1b31a2d>] init+0x15d/0x2bd
>   [<c0105bc3>] kernel_thread_helper+0x7/0x10
> 
>  BUG: at drivers/base/driver.c:187 driver_unregister()
>   [<c0105ff9>] show_trace_log_lvl+0x19/0x2e
>   [<c01063e2>] show_trace+0x12/0x14
>   [<c01063f8>] dump_stack+0x14/0x16
>   [<c063f7e6>] driver_unregister+0x3d/0x43
>   [<c0488048>] pci_unregister_driver+0x10/0x5f
>   [<c0619505>] init_ipmi_si+0x70a/0x738
>   [<c1b31a2d>] init+0x15d/0x2bd
>   [<c0105bc3>] kernel_thread_helper+0x7/0x10
> 
>  BUG: at drivers/base/driver.c:187 driver_unregister()
>   [<c0105ff9>] show_trace_log_lvl+0x19/0x2e
>   [<c01063e2>] show_trace+0x12/0x14
>   [<c01063f8>] dump_stack+0x14/0x16
>   [<c063f7e6>] driver_unregister+0x3d/0x43
>   [<c0488048>] pci_unregister_driver+0x10/0x5f
>   [<c1b6d2d8>] tlan_probe+0x2dd/0x30e
>   [<c1b31a2d>] init+0x15d/0x2bd
>   [<c0105bc3>] kernel_thread_helper+0x7/0x10
> 
> possibly more could trigger. Each of these 3 places caused an actual 
> bootup hang on my testbox, so these are real regressions and need to be 
> fixed.
> 
> because there are a good number of drivers that do 
> pci_unregister_device() from their init function, and because i cannot 
> see anything obviously wrong in doing an unregister call after a 
> failure, i think it's driver_unregister() that needs to be fixed. Greg, 
> what do you think?

Yes, we should allow the ability to call unregister_driver from within
the module_init function.

But I don't understand what is causing you to see this problem.  Who is
holding the reference on the struct device at this point in time?  Is it
the fact that userspace has some files open and it hasn't released them
yet?

I don't see anything implicit in the driver_unregister() path that
should not work from within the module_init() path.  Kay, am I missing
anything here?

(patch left below for Kay's benefit)

thanks,

greg k-h

> Index: linux/drivers/base/driver.c
> ===================================================================
> --- linux.orig/drivers/base/driver.c
> +++ linux/drivers/base/driver.c
> @@ -183,7 +183,8 @@ int driver_register(struct device_driver
>  void driver_unregister(struct device_driver * drv)
>  {
>  	bus_remove_driver(drv);
> -	wait_for_completion(&drv->unloaded);
> +	if (!drv->unloaded.done)
> +		WARN_ON(1);
>  }
>  
>  /**
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ