Message-ID: <Zp5wOjhgK7HdPqsS@kbusch-mbp.dhcp.thefacebook.com>
Date: Mon, 22 Jul 2024 08:44:10 -0600
From: Keith Busch <kbusch@...nel.org>
To: Greg KH <gregkh@...uxfoundation.org>
Cc: Keith Busch <kbusch@...a.com>, rafael@...nel.org,
	linux-kernel@...r.kernel.org, linux-pci@...r.kernel.org,
	bhelgaas@...gle.com, lukas@...ner.de
Subject: Re: [PATCH] driver core: get kobject ref when accessing dev_attrs

On Sat, Jul 20, 2024 at 07:17:55AM +0200, Greg KH wrote:
> On Fri, Jul 19, 2024 at 11:55:13AM -0700, Keith Busch wrote:
> > From: Keith Busch <kbusch@...nel.org>
> > 
> > Get a reference to the device's kobject while storing and showing device
> > attributes so that the device is valid for the lifetime of the sysfs access.
> > Without this, the device may be released and use-after-free will occur.
> > 
> > This is an easy problem to recreate with pci switches. Basic topology on my
> > qemu test machine:
> > 
> > -[0000:00]-+-00.0  Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller
> >            +-01.0-[01-04]----00.0-[02-04]--+-00.0-[03]--
> >                                            \-01.0-[04]----00.0  Red Hat, Inc. Virtio block device
> > 
> > Simultaneously remove devices 04:00.0 and 01:00.0 and you'll hit it:
> > 
> >  # echo 1 > /sys/bus/pci/devices/0000\:04\:00.0/remove &
> >  # echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/remove
> 
> So you remove the parent before the child and also want to remove the
> child at the same time?  You are going to have bad problems here :)

The example I provided is surely a user error, but it just demonstrates
the issue. The parent device can be removed at any time without user
action: hotplug and error handling take devices down automatically. And
it's not just a problem when concurrently requesting removal of the
child device; merely accessing the child's attributes is still a
use-after-free.
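
For anyone following along, here is a rough userspace sketch of the
"get a reference unless the count is already zero" idea that the quoted
hunk relies on. It is only an illustration: struct obj, obj_init(),
obj_get_unless_zero(), obj_put() and attr_show() are hypothetical names
made up for this example, not the kernel's kobject API, and the sketch
never frees the object's memory, so it models only the counting and not
the memory lifetime.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted object; not struct kobject. */
struct obj {
	atomic_int refcount;
	bool released;
	int value;
};

static void obj_init(struct obj *o, int value)
{
	atomic_init(&o->refcount, 1);
	o->released = false;
	o->value = value;
}

/* Take a reference only if the count has not already dropped to zero. */
static bool obj_get_unless_zero(struct obj *o)
{
	int old = atomic_load(&o->refcount);

	while (old != 0) {
		/* On failure, 'old' is updated to the current count. */
		if (atomic_compare_exchange_weak(&o->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; the last put marks the object as released. */
static void obj_put(struct obj *o)
{
	int old = atomic_fetch_sub(&o->refcount, 1);

	assert(old > 0);
	if (old == 1)
		o->released = true; /* real code would run a release callback */
}

/* Models an attribute read that pins the object for its duration. */
static int attr_show(struct obj *o, int *out)
{
	if (!obj_get_unless_zero(o))
		return -ENXIO;

	*out = o->value;
	obj_put(o);
	return 0;
}
```

With this model, attr_show() succeeds while the initial reference is
held and returns -ENXIO once the final obj_put() has run, instead of
reading from an object whose release has already started.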
 
> > @@ -2433,12 +2433,15 @@ static ssize_t dev_attr_show(struct kobject *kobj, struct attribute *attr,
> >  	struct device *dev = kobj_to_dev(kobj);
> >  	ssize_t ret = -EIO;
> >  
> > +	if (!kobject_get_unless_zero(kobj))
> > +		return -ENXIO;
> 
> We've been down this path before, and it doesn't end well from what I
> recall.  Attributes that when written to remove themselves need to call
> the correct function to do so (look at how scsi does it).  I think this
> change will now break that functionality.  Look in the email archives a
> long time ago for more details, I can't recall them at the moment,
> sorry.

Thanks for the suggestion. I'll try to figure out what scsi does and see
if that strategy can work here.
