Date:	Wed, 11 Mar 2009 18:27:37 -0600
From:	Alex Chiang <achiang@...com>
To:	Tejun Heo <tj@...nel.org>
Cc:	Greg KH <gregkh@...e.de>, Vegard Nossum <vegard.nossum@...il.com>,
	Pekka Enberg <penberg@...helsinki.fi>,
	Ingo Molnar <mingo@...e.hu>, jbarnes@...tuousgeek.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH, RFC] sysfs: only allow one scheduled removal callback
	per kobj

* Tejun Heo <tj@...nel.org>:
> Alex Chiang wrote:
> > * Greg KH <gregkh@...e.de>:
> >> On Tue, Mar 10, 2009 at 05:20:27PM -0600, Alex Chiang wrote:
> >>> Hi Vegard, sysfs folks,
> >>>
> >>> Vegard was nice enough to test my PCI remove/rescan patches under
> >>> kmemcheck. Maybe "torture" is a more appropriate term. ;)
> >>>
> >>> My patch series introduces a sysfs "remove" attribute for PCI
> >>> devices, which will remove that device (and child devices).
> >>>
> >>> 	http://thread.gmane.org/gmane.linux.kernel.pci/3495
> >>>
> >>> Vegard decided that he wanted to do something like:
> >>>
> >>> 	# while true ; do echo 1 > /sys/bus/pci/devices/.../remove ; done
> >>>
> >>> which caused a nasty oops in my code. You can see the results of
> >>> his testing in the thread I referenced above.
> >>>
> >>> After looking at my code for a bit, I decided that maybe it
> >>> wasn't completely my fault. ;) See, I'm using device_schedule_callback()
> >> why?  Are you really in interrupt context here to need to do the remove
> >> at a later time?
> > 
> > What other interface can I use to remove objects from sysfs?
> 
> I haven't read your code yet but I seem to recall doing something
> similar.  Ah.. okay, this one didn't get in and I forgot about this.
> 
>   http://thread.gmane.org/gmane.linux.kernel/582130
> 
> But, yeah, committing suicide is currently quite hairy.  I thought SCSI
> did it correctly with all the grab/release dances.  Does SCSI have the
> problem too?

I haven't dived into the SCSI code yet, but they are doing some
sort of magic that I don't understand with their state machine.

Regardless, I think we have two issues.

	1. The existing callback mechanism that everyone hates
	has a "bug".

	2. Your suicide patches haven't made it into mainline yet.

The reason that I think that the "bug" is with the callback
mechanism is because any caller can repeatedly schedule suicide
over and over again, and the callback handler will eventually get
a stale pointer. Rather than make all the callsites handle the
locking, doesn't it make more sense for the infrastructure to do
it?
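
To make the idea concrete, here is a rough userspace sketch of "only one
scheduled callback per object". It is not the sysfs code and not the
actual patch: schedule_callback_once(), run_pending_callbacks(),
count_removal(), MAX_PENDING and the -EBUSY return value are all
hypothetical names and choices made up for illustration. It is
single-threaded and does no locking, which real infrastructure would
need.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_PENDING 16	/* arbitrary limit for this sketch */

typedef void (*callback_fn)(void *data);

struct pending_cb {
	void *obj;		/* object the callback will act on */
	callback_fn fn;
	void *data;
};

static struct pending_cb pending[MAX_PENDING];
static size_t npending;

/* Queue fn for obj, unless obj already has a callback pending. */
int schedule_callback_once(void *obj, callback_fn fn, void *data)
{
	assert(obj != NULL && fn != NULL);

	for (size_t i = 0; i < npending; i++)
		if (pending[i].obj == obj)
			return -EBUSY;	/* second request is refused */

	if (npending == MAX_PENDING)
		return -ENOMEM;

	pending[npending++] = (struct pending_cb){ obj, fn, data };
	return 0;
}

/*
 * Run every queued callback (most recently queued first) and return
 * how many ran. Each entry is dropped before its callback runs.
 */
size_t run_pending_callbacks(void)
{
	size_t ran = 0;

	while (npending > 0) {
		struct pending_cb cb = pending[--npending];

		cb.fn(cb.data);
		ran++;
	}
	return ran;
}

/* Stand-in for the real removal work: just counts invocations. */
void count_removal(void *data)
{
	(*(int *)data)++;
}
```

With this, a caller looping on the "remove" attribute would get an
error on every request after the first until the queued callback has
run, so the handler never runs twice for the same object.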

I realize we're trying to fix something that everyone wants to go
away, but the PCI rescan patches add some pretty useful
functionality and are pretty much ready to go except for this. I
could add the bookkeeping into my suicide path, but that's
actually a slightly bigger patch, because now I have to malloc my
own callback structs. And again, I think it's more appropriate to
put that sort of code into the core.

Can we fix 1 in the short term and move towards 2 as the real
solution?

Thanks.

/ac
