lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 18 Oct 2011 20:34:16 +0800
From:	"canquan.shen" <shencanquan@...wei.com>
To:	Bjorn Helgaas <bhelgaas@...gle.com>
Cc:	Chen Gong <gong.chen@...ux.intel.com>, len.brown@...el.com,
	"shemminger@...tta.com" <shemminger@...tta.com>,
	"yakui.zhao@...el.com" <yakui.zhao@...el.com>,
	"xiaowei.yang@...wei.com" <xiaowei.yang@...wei.com>,
	hanweidong <hanweidong@...wei.com>, linqiangmin@...wei.com,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-acpi@...r.kernel.org" <linux-acpi@...r.kernel.org>,
	khalid.aziz@...com
Subject: Re: [PATCH v5] acpi: Fix CPU hot removal problem

Hi Len and Chen Gong,
    What's you opinion on this patch?

    Thanks.

---
canquan.shen
2011-10-18

On 2011/9/30 13:20, Bjorn Helgaas wrote:
> On Wed, Sep 28, 2011 at 9:41 PM, canquan.shen<shencanquan@...wei.com>  wrote:
>>
>> On 2011/9/29 11:14, Chen Gong wrote:
>>>
>>> 于 2011/9/24 14:10, canquan.shen 写道:
>>>>
>>>> We run linux as a guest in Xen environment. When we used the xen tools
>>>> (xm vcpu-set<n>) to hot add and remove vcpu to and from the guest, we
>>>> encountered the failure on vcpu removal. We found the reason is that it
>>>> did't go to really remove cpu in the cpu removal code path.
>>>>
>>>> This patch adds acpi_bus_hot_remove_device in
>>>> acpi_process_hotplug_notify to
>>>> fix this issue. With this patch, it works fine for us.
>>>>
>>>> Signed-off-by: Canquan Shen<shencanquan@...wei.com>
>>>> ---
>>>> drivers/acpi/processor_driver.c | 13 +------------
>>>> drivers/acpi/scan.c | 4 ++--
>>>> include/acpi/acpi_bus.h | 1 +
>>>> 3 files changed, 4 insertions(+), 14 deletions(-)
>>>>
>>>> diff --git a/drivers/acpi/processor_driver.c
>>>> b/drivers/acpi/processor_driver.c
>>>> index a4e0f1b..8429688 100644
>>>> --- a/drivers/acpi/processor_driver.c
>>>> +++ b/drivers/acpi/processor_driver.c
>>>> @@ -665,18 +665,7 @@ static void
>>>> acpi_processor_hotplug_notify(acpi_handle handle,
>>>> case ACPI_NOTIFY_EJECT_REQUEST:
>>>> ACPI_DEBUG_PRINT((ACPI_DB_INFO,
>>>> "received ACPI_NOTIFY_EJECT_REQUEST\n"));
>>>> -
>>>> - if (acpi_bus_get_device(handle,&device)) {
>>>> - printk(KERN_ERR PREFIX
>>>> - "Device don't exist, dropping EJECT\n");
>>>> - break;
>>>> - }
>>>> - pr = acpi_driver_data(device);
>>>> - if (!pr) {
>>>> - printk(KERN_ERR PREFIX
>>>> - "Driver data is NULL, dropping EJECT\n");
>>>> - return;
>>>> - }
>>>> + acpi_bus_hot_remove_device(handle);
>>>
>>> As the description in __acpi_os_execute(in acpi_os_hotplug_execute),
>>> /*
>>> * We can't run hotplug code in keventd_wq/kacpid_wq/kacpid_notify_wq
>>> * because the hotplug code may call driver .remove() functions,
>>> * which invoke flush_scheduled_work/acpi_os_wait_events_complete
>>> * to flush these workqueues.
>>> */
>>> If so, why not using following call:
>>>
>>> acpi_os_hotplug_execute(acpi_bus_hot_remove_device, acpi_device->handle);
>>>
>>>
>>
>> It is ok if using the acpi_os_hotplug_execute. but it is complex and more time for removal cpu because it is add to queue and some time the work will be called.
>> I think that it is clear to call directly acpi_bus_hot_remove_device function in acpi_processor_hotplug_notify.
>
> Chen, you're right that the CPU hot-remove notifier is running in a
> workqueue, we are proposing  that the notifier call
> acpi_bus_hot_remove_device(), which calls the .remove() method, and
> that can cause a deadlock if .remove() waits for the workqueue to be
> flushed.
>
> The usual way this deadlock happens is when the .remove() method uses
> acpi_remove_notify_handler().  But the processor_driver .remove()
> method doesn't do anything with notify handler registration; the
> handler is registered/unregistered by the module init/exit functions.
> So I don't think we need to use acpi_os_hotplug_execute() in this case
> because I don't think there's a risk of deadlock.
>
> However, if new CPU devices appear in the namespace after module-init,
> I don't think processor_driver will handle them correctly.  This looks
> like just another artifact of our screwed-up ACPI hotplug handling.
>
> Bjorn
>
> .
>


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ