Message-ID: <20121026152544.GC15840@kroah.com>
Date:	Fri, 26 Oct 2012 08:25:44 -0700
From:	Greg Kroah-Hartman <gregkh@...uxfoundation.org>
To:	Yasuaki Ishimatsu <isimatu.yasuaki@...fujitsu.com>
Cc:	"Rafael J. Wysocki" <rjw@...k.pl>, linux-acpi@...r.kernel.org,
	linux-kernel@...r.kernel.org, toshi.kani@...com, lenb@...nel.org,
	wency@...fujitsu.com, vasilis.liaskovitis@...fitbricks.com
Subject: Re: [PATCH v2] acpi : acpi_bus_trim() stops removing devices when
 failing to remove the device

On Fri, Oct 26, 2012 at 04:33:49PM +0900, Yasuaki Ishimatsu wrote:
> Hi Greg,
> 
> Sorry for the late reply.
> 
> 2012/10/20 2:59, Greg Kroah-Hartman wrote:
> >On Fri, Oct 19, 2012 at 06:29:52AM +0200, Rafael J. Wysocki wrote:
> >>On Thursday 11 of October 2012 19:12:28 Yasuaki Ishimatsu wrote:
> >>>acpi_bus_trim() is supposed to stop removing devices when
> >>>acpi_bus_remove() returns an error. But acpi_bus_remove() does not report
> >>>errors correctly: it only returns -EINVAL when the dev argument is NULL.
> >>>Thus, even if a device cannot be removed, acpi_bus_trim() ignores the
> >>>failure and continues to remove devices. acpi_bus_hot_remove_device() uses
> >>>acpi_bus_trim() to remove devices, so it can send "_EJ0" to the firmware
> >>>even if the device is still running on the system. In that case, the system
> >>>cannot work correctly.
> >>>
> >>>Vasilis hit the bug with memory hotplug and reported it as follows:
> >>>https://lkml.org/lkml/2012/9/26/318
> >>>
> >>>So acpi_bus_trim() should check whether each device was actually removed.
> >>>The patch adds error checks to some of the functions that remove the device.
> >>>
> >>>With the patch applied, acpi_bus_trim() stops removing devices when it fails
> >>>to remove a device. But I think there is no impact outside the CPU and
> >>>memory hotplug paths, because other devices fail only in irregular cases,
> >>>such as the device being NULL.
> >>>
> >>>v1->v2
> >>>- add a rollback for reinstalling a notify handler.
> >>>
> >>>Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@...fujitsu.com>
> >>
> >>Greg, do you think there may be any problems with the changes in dd.c?
> >
> >Yes, I don't like it.
> >
> >remove should always work, just like the exit call in a module.  It
> >means that the core wants to remove the driver, so it is going to
> >happen, a driver can't refuse it.
> >
> >Which brings me to the larger question, why would this solve anything?
> 
> We are currently developing physical memory hotplug.
> 
> https://lkml.org/lkml/2012/10/23/213
> 
> So if we apply the patch set, we can hot-remove physical memory
> in the following way.
> 
> "echo 1 > /sys/bus/acpi/devices/PNP/eject"
> 
> In this case, acpi_bus_hot_remove_device() tries to remove the memory
> device with acpi_bus_trim(). But if the device contains memory that cannot
> be removed, memory hot-remove fails and the memory remains in the kernel.
> However, acpi_bus_trim() cannot notice that memory hot-remove failed and
> returns 0. So acpi_bus_hot_remove_device() continues to remove memory
> devices and sends the _EJ0 method to the firmware. The memory device can
> then no longer be used, but the memory still remains in the kernel.
> So if someone accesses that memory, a kernel panic occurs.

Why can't you check to find out if you can do the remove operation
before you enter the driver core asking to actually remove the devices?
That would allow you to "know" if you can do this before having to go
through the whole operation.  What happens if you can complete half of
the removal, and do that, but not the whole thing?  Don't you end up
with half of the memory chunk gone from the system now?

In other words, please solve this at a higher level than the driver
core if at all possible.

greg k-h
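
The check-before-remove approach suggested above can be illustrated with a
minimal sketch. It is not kernel code and not taken from any patch in this
thread: struct fake_dev, can_remove_all(), remove_all() and hot_remove() are
hypothetical names invented for illustration. The point is only the ordering:
a side-effect-free check over the whole set comes first, and the removal step,
which is not allowed to fail, runs only if that check passed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a hot-removable device; not a kernel structure. */
struct fake_dev {
    bool removable;   /* false models memory that cannot be removed */
    bool present;     /* true while the device is still in the system */
};

/* Phase 1: check the whole set without changing anything. */
static bool can_remove_all(const struct fake_dev *devs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (!devs[i].removable)
            return false;
    return true;
}

/* Phase 2: unconditional removal; it must not fail once we get here. */
static void remove_all(struct fake_dev *devs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(devs[i].removable);  /* guaranteed by phase 1 */
        devs[i].present = false;
    }
}

/* Returns 0 if everything was removed, -1 if refused with nothing changed. */
static int hot_remove(struct fake_dev *devs, size_t n)
{
    if (!can_remove_all(devs, n))
        return -1;  /* refuse early: no device touched, no eject requested */
    remove_all(devs, n);
    /* The eject request to firmware (_EJ0 in this thread) would go here. */
    return 0;
}
```

With this ordering a refused request leaves every device present, so the
half-removed state asked about above cannot arise in the sketch. A real
implementation would also have to keep the answer from phase 1 valid until
phase 2 finishes (for example by serializing against anything that could make
a device non-removable in between); the sketch ignores that.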