
Message-ID: <20120105180123.GA8035@core.coreip.homeip.net>
Date:	Thu, 5 Jan 2012 10:01:24 -0800
From:	Dmitry Torokhov <dmitry.torokhov@...il.com>
To:	Alan Stern <stern@...land.harvard.edu>
Cc:	Greg KH <greg@...ah.com>, Kay Sievers <kay.sievers@...y.org>,
	USB list <linux-usb@...r.kernel.org>,
	Kernel development list <linux-kernel@...r.kernel.org>
Subject: Re: Problems with get_driver() and driver_attach() (and new_id too)

Hi Alan,

On Thu, Jan 05, 2012 at 11:31:00AM -0500, Alan Stern wrote:
> Greg and Kay:
> 
> There are some nasty problems connected with the driver core's
> get_driver(), put_driver(), and driver_attach().  Not just
> implementation bugs, but deeper conceptual difficulties.
> 
> Let's start with get_driver().  Its comment says that it increments the 
> driver's refcount, just like get_device() and a lot of other utility 
> routines.
> 
> But a struct driver is _not_ like a struct device!  It resembles a
> piece of code more than a piece of data -- it acts as an encapsulation
> of a driver.  Incrementing its refcount doesn't have much meaning
> because a driver's lifetime isn't determined by the structure's
> refcount; it's determined by when the driver's module gets unloaded.
> 
> What really matters for a driver is whether or not it is registered.  
> Drivers expect, for example, that none of their methods will be called
> after driver_unregister() returns.  It doesn't matter if some other
> thread still holds a reference to the driver structure; that reference
> mustn't be used for accessing the driver code after unregistration.  
> And of course, driver_attach() does access the driver code, by calling 
> the probe routine.

Agree here.

> 
> An example where this is violated occurs in the usb-serial core.  Each
> serial driver module registers (at least) two driver structures, one on
> the usb_serial_bus and one on the usb_bus.  The usb_serial_driver
> structure contains a pointer to the usb_driver structure, and this
> pointer is passed to get_driver() when the serial driver's new_id sysfs
> attribute is written to.
> 
> Now, udev scripts are capable of writing to sysfs attributes very soon
> after the attribute is created.  In the case of USB serial drivers, we
> have a bug report of a situation where this write took place after the
> usb_serial_driver was registered but before the usb_driver was
> registered.  Thus, get_driver() was handed a pointer to a driver
> structure that had not even been initialized, let alone registered, and
> so naturally it crashed.
> 
> Almost as bad is what can happen when a driver is unregistered while
> some thread is holding a reference obtained from get_driver().  The
> reference prevents the driver structure from being freed, but it
> doesn't prevent the thread from calling driver_attach() after the
> unregistration is complete, at which time the driver code does not
> expect to be invoked.
> 
> To fix these problems, we need to change the semantics of get_driver()  
> and put_driver().  Instead of taking a reference to the driver
> structure, get_driver() should check whether the driver is currently
> registered.  If not, return NULL; otherwise, pin the driver (i.e.,
> block it from being unregistered) until put_driver() is called.

Or maybe we should drop get_driver() and put_driver() altogether and just
make sure that driver_attach() does not race with driver_unregister()?
I think pinning a driver so that it can't be unregistered (and
consequently module unload hangs) is a misfeature.
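
A minimal userspace sketch of that alternative, assuming a single lock
serializes attach against unregister. This is not kernel code: the
model_* names, the struct, and the pthread mutex are hypothetical
stand-ins for illustration, not the driver core's actual API or locking.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a driver; not the kernel's struct device_driver. */
struct model_driver {
	bool registered;	/* true between register and unregister */
	int probe_calls;	/* counts how often "probe" was invoked */
};

/* Assumed: one lock covers register, unregister, and attach. */
static pthread_mutex_t model_bus_lock = PTHREAD_MUTEX_INITIALIZER;

void model_driver_register(struct model_driver *drv)
{
	pthread_mutex_lock(&model_bus_lock);
	drv->registered = true;
	pthread_mutex_unlock(&model_bus_lock);
}

/* Once this returns, no attach can call into the driver any more. */
void model_driver_unregister(struct model_driver *drv)
{
	pthread_mutex_lock(&model_bus_lock);
	drv->registered = false;
	pthread_mutex_unlock(&model_bus_lock);
}

/*
 * Returns 0 if the probe ran, -1 if the driver was not registered.
 * The check and the probe happen under the same lock, so unregister
 * cannot complete in between them.
 */
int model_driver_attach(struct model_driver *drv)
{
	int ret = -1;

	pthread_mutex_lock(&model_bus_lock);
	if (drv->registered) {
		drv->probe_calls++;	/* stands in for calling ->probe() */
		ret = 0;
	}
	pthread_mutex_unlock(&model_bus_lock);
	return ret;
}
```

In this model nothing is pinned: unregister waits at most for a probe
already in progress, and a late attach fails instead of calling into the
driver.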

> 
> This will require some code auditing, because there are places where
> get_driver() is called without checking the return value (see
> drivers/pci/pci_driver.c:pci_add_dynid() for an example; there are
> others).  It should be marked __must_check.
> 
> Also, there are places that call driver_attach() without first calling
> get_driver() (see drivers/input/gameport/gameport.c,
> drivers/input/serio/serio.c, and drivers/char/agp/amd64-agp.c).  They
> may or may not be safe; I don't know.

Serio and gameport are safe, as everything is protected by serio_mutex, so
it is not possible to yank the driver out while we are trying to attach
it to a device.

> 
> One more thing.  The new_id sysfs attribute can cause problems of its 
> own.  Writes to it cause a dynamic ID structure to be allocated, and 
> these structures will leak unless they are properly deallocated.  
> Normally they are freed when the driver is unregistered.  But what if 
> registration fails to begin with?  It might fail at a point after the 
> new_id attribute was created, which means the attribute could have been 
> written to.  The dynamic IDs need to be freed after registration fails, 
> but nobody does this currently.
> 

Don't we create the corresponding sysfs attributes only after the driver
has successfully registered? And attributes are the only way to add (and
thus allocate) new IDs, so I do not see why we'd be leaking here.

Thanks.

-- 
Dmitry