lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 10 Mar 2009 21:41:51 -0700
From:	Greg KH <gregkh@...e.de>
To:	Alex Chiang <achiang@...com>,
	Vegard Nossum <vegard.nossum@...il.com>,
	Pekka Enberg <penberg@...helsinki.fi>,
	Ingo Molnar <mingo@...e.hu>, jbarnes@...tuousgeek.org,
	tj@...nel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH, RFC] sysfs: only allow one scheduled removal callback
	per kobj

On Tue, Mar 10, 2009 at 05:20:27PM -0600, Alex Chiang wrote:
> Hi Vegard, sysfs folks,
> 
> Vegard was nice enough to test my PCI remove/rescan patches under
> kmemcheck. Maybe "torture" is a more appropriate term. ;)
> 
> My patch series introduces a sysfs "remove" attribute for PCI
> devices, which will remove that device (and child devices).
> 
> 	http://thread.gmane.org/gmane.linux.kernel.pci/3495
> 
> Vegard decided that he wanted to do something like:
> 
> 	# while true ; do echo 1 > /sys/bus/pci/devices/.../remove ; done
> 
> which caused a nasty oops in my code. You can see the results of
> his testing in the thread I referenced above.
> 
> After looking at my code for a bit, I decided that maybe it
> wasn't completely my fault. ;) See, I'm using device_schedule_callback()

why?  Are you really in interrupt context here to need to do the remove
at a later time?

> which really is a wrapper around sysfs_schedule_callback() which
> is the way that a sysfs attribute is supposed to remove itself to
> prevent deadlock.

Yeah, it's the scsi code that needed this mess :(

If at all possible, I would recommend not using it.

> The problem that Vegard's test exposed is that if you repeatedly
> call a sysfs attribute that's supposed to remove itself using
> device_schedule_callback, we'll keep scheduling work queue tasks
> with a kobj that we really want to release.
> 
> [nb, I bet that /sys/bus/scsi/devices/.../delete will exhibit the
> same problems]

I think it's harder to remove devices multiple times, as it isn't done
through sysfs, but from another external request.  But I could be wrong.

> This is very racy, and at some point, whatever remove handler
> we've scheduled with device_schedule_callback will end up
> referencing a freed kobj.
> 
> I came up with the below patch which changes the semantics of
> device/sysfs_schedule_callback. We now only allow one in-flight
> callback per kobj, and return -EBUSY if that kobj already has a
> callback scheduled for it.
> 
> This patch, along with my updated 07/11 patch in my series,
> prevents at least the first oops that Vegard reported, and I
> suspect it prevents the second kmemcheck error too, although I
> haven't tested under kmemcheck (yet*).
> 
> I'm looking for comments on the approach I took, specifically:
> 
> 	- are we ok with the new restriction I imposed?
> 	- is it ok to return -EBUSY to our callers?
> 	- is the simple linked list proof of concept
> 	  implementation going to scale too poorly?
> 
> To answer my own first two questions, I checked for callers of
> both device_ and sysfs_schedule_callback, and it looks like
> everyone is using it the same way: to schedule themselves for
> removal. That is, although the interface could be used to
> schedule any sort of callback, the only use case is the removal
> use case. I don't think it will be a problem to limit ourselves
> to one remove callback per kobj.

I agree.

> Maybe this patch really wants to be a new interface called
> sysfs_schedule_callback_once or _single, where we check for an
> already-scheduled callback for a kobj, and if we pass, then we
> simply continue on to the existing, unchanged
> sysfs_schedule_callback. I don't feel too strongly about creating
> a new interface, but my belief is that changing the semantics of
> the existing interface is probably the better solution.
> 
> My opinion on my own third question is that removing a device is
> not in the performance path, so a simple linked list is
> sufficient.
> 
> Depending on the feedback here, I'll resend this patch with a
> full changelog (and giving credit to Vegard/kmemcheck as Ingo
> requested I do) or I can rework it.

I have no objection to this change.  But I really would recommend not
using this interface at all if possible.

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists