Message-ID: <51BB3704.2050708@gmail.com>
Date:	Fri, 14 Jun 2013 23:30:12 +0800
From:	Jiang Liu <liuj97@...il.com>
To:	"Rafael J. Wysocki" <rjw@...k.pl>
CC:	Jiang Liu <jiang.liu@...wei.com>,
	Bjorn Helgaas <bhelgaas@...gle.com>,
	Yinghai Lu <yinghai@...nel.org>,
	"Alexander E . Patrakov" <patrakov@...il.com>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Yijing Wang <wangyijing@...wei.com>, linux-pci@...r.kernel.org,
	linux-kernel@...r.kernel.org,
	"Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
	stable@...r.kernel.org
Subject: Re: [BUGFIX 2/9] ACPIPHP: fix device destroying order issue when
 handling dock notification

On 06/14/2013 10:12 PM, Rafael J. Wysocki wrote:
> On Friday, June 14, 2013 09:57:15 PM Jiang Liu wrote:
>> On 06/14/2013 08:23 PM, Rafael J. Wysocki wrote:
>>> On Thursday, June 13, 2013 09:59:44 PM Rafael J. Wysocki wrote:
>>>> On Friday, June 14, 2013 12:32:25 AM Jiang Liu wrote:
>>>>> Current ACPI glue logic expects that physical devices are destroyed
>>>>> before destroying companion ACPI devices, otherwise it will break the
>>>>> ACPI unbind logic and cause following warning messages:
>>>>> [  185.026073] usb usb5: Oops, 'acpi_handle' corrupt
>>>>> [  185.035150] pci 0000:1b:00.0: Oops, 'acpi_handle' corrupt
>>>>> [  185.035515] pci 0000:18:02.0: Oops, 'acpi_handle' corrupt
>>>>> [  180.013656]  port1: Oops, 'acpi_handle' corrupt
>>>>> Please refer to https://bugzilla.kernel.org/attachment.cgi?id=104321
>>>>> for full log message.
>>>>
>>>> So my question is, did we have this problem before commit 3b63aaa70e1?
>>>>
>>>> If we did, then when did it start?  Or was it present forever?
>>>>
>>>>> Above warning messages are caused by following scenario:
>>>>> 1) acpi_dock_notifier_call() queues a task (T1) onto kacpi_hotplug_wq
>>>>> 2) kacpi_hotplug_wq handles T1, which invokes acpi_dock_deferred_cb()
>>>>>    ->dock_notify()-> handle_eject_request()->hotplug_dock_devices()
>>>>> 3) hotplug_dock_devices() first invokes registered hotplug callbacks to
>>>>>    destroy physical devices, then destroys all affected ACPI devices.
>>>>>    Everything seems perfect until now. But the acpiphp dock notification
>>>>>    handler will queue another task (T2) onto kacpi_hotplug_wq to really
>>>>>    destroy affected physical devices.
>>>>
>>>> Would not the solution be to modify it so that it didn't spawn the other
>>>> task (T2), but removed the affected physical devices synchronously?
>>>>
>>>>> 4) kacpi_hotplug_wq finishes T1, and all affected ACPI devices have
>>>>>    been destroyed.
>>>>> 5) kacpi_hotplug_wq handles T2, which destroys all affected physical
>>>>>    devices.
>>>>>
>>>>> So it breaks ACPI glue logic's expectation because ACPI devices are destroyed
>>>>> in step 3 and physical devices are destroyed in step 5.
>>>>>
>>>>> Signed-off-by: Jiang Liu <jiang.liu@...wei.com>
>>>>> Reported-by: Alexander E. Patrakov <patrakov@...il.com>
>>>>> Cc: Bjorn Helgaas <bhelgaas@...gle.com>
>>>>> Cc: Yinghai Lu <yinghai@...nel.org>
>>>>> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>
>>>>> Cc: linux-pci@...r.kernel.org
>>>>> Cc: linux-kernel@...r.kernel.org
>>>>> Cc: stable@...r.kernel.org
>>>>> ---
>>>>> Hi Bjorn and Rafael,
>>>>>      The recursive lock changes haven't been tested yet, need help
>>>>> from Alexander for testing.
>>>>
>>>> Well, let's just say I'm not a fan of recursive locks.  Is that unavoidable
>>>> here?
>>>
>>> What about the appended patch (on top of [1/9], untested)?
>>>
>>> Rafael
>> It should have a similar effect to patch 2/9, and it will hit the
>> same deadlock scenario as 2/9 too.
> 
> And why exactly?
> 
> I'm looking at acpiphp_disable_slot() and I'm not seeing where the
> problematic lock is taken.  Similarly for power_off_slot().
> 
> It should take the ACPI scan lock, but that's a different matter.
> 
> Thanks,
> Rafael
The deadlock scenario is the same:
        hotplug_dock_devices()
                mutex_lock(&ds->hp_lock)
                        dd->ops->handler()
                                destroy pci bus
                                        unregister_hotplug_dock_device()
                                                mutex_lock(&ds->hp_lock)
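
To make the call chain above concrete, here is a rough userspace sketch of the same lock pattern. It is not kernel code: it uses a POSIX error-checking mutex in place of the kernel mutex so that the second lock attempt reports EDEADLK instead of hanging, and every model_* name and struct dock_station_model are hypothetical stand-ins for the real functions in the trace.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the dock station; only the lock matters here. */
struct dock_station_model {
    pthread_mutex_t hp_lock;
};

/* Set up hp_lock as an error-checking mutex so a relock is reported. */
static int model_init(struct dock_station_model *ds)
{
    pthread_mutexattr_t attr;
    int ret = pthread_mutexattr_init(&attr);

    if (ret)
        return ret;
    ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (!ret)
        ret = pthread_mutex_init(&ds->hp_lock, &attr);
    pthread_mutexattr_destroy(&attr);
    return ret;
}

/* Models unregister_hotplug_dock_device(): it takes hp_lock itself. */
static int model_unregister(struct dock_station_model *ds)
{
    int ret = pthread_mutex_lock(&ds->hp_lock);

    if (ret)
        return ret; /* EDEADLK if this thread already holds hp_lock */
    /* ... the real function would drop the device from a list here ... */
    pthread_mutex_unlock(&ds->hp_lock);
    return 0;
}

/* Models dd->ops->handler(): destroying the bus ends up unregistering. */
static int model_handler(struct dock_station_model *ds)
{
    return model_unregister(ds);
}

/* Models hotplug_dock_devices(): calls the handler with hp_lock held. */
static int model_hotplug_dock_devices(struct dock_station_model *ds)
{
    int ret = pthread_mutex_lock(&ds->hp_lock);

    if (ret)
        return ret;
    ret = model_handler(ds); /* handler tries to take hp_lock again */
    pthread_mutex_unlock(&ds->hp_lock);
    return ret;
}
```

After model_init(), calling model_hotplug_dock_devices() returns EDEADLK in this model, while calling model_unregister() on its own succeeds. With a plain non-recursive mutex, as in the kernel, the inner lock attempt would block forever instead.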


> 
> 
>>> ---
>>>  drivers/pci/hotplug/acpiphp_glue.c |   13 ++++++++++++-
>>>  1 file changed, 12 insertions(+), 1 deletion(-)
>>>
>>> Index: linux-pm/drivers/pci/hotplug/acpiphp_glue.c
>>> ===================================================================
>>> --- linux-pm.orig/drivers/pci/hotplug/acpiphp_glue.c
>>> +++ linux-pm/drivers/pci/hotplug/acpiphp_glue.c
>>> @@ -145,9 +145,20 @@ static int post_dock_fixups(struct notif
>>>  	return NOTIFY_OK;
>>>  }
>>>  
>>> +static void handle_dock_event_func(acpi_handle handle, u32 event, void *context)
>>> +{
>>> +	if (event == ACPI_NOTIFY_EJECT_REQUEST) {
>>> +		struct acpiphp_func *func = context;
>>> +
>>> +		if (!acpiphp_disable_slot(func->slot))
>>> +			acpiphp_eject_slot(func->slot);
>>> +	} else {
>>> +		handle_hotplug_event_func(handle, event, context);
>>> +	}
>>> +}
>>>  
>>>  static const struct acpi_dock_ops acpiphp_dock_ops = {
>>> -	.handler = handle_hotplug_event_func,
>>> +	.handler = handle_dock_event_func,
>>>  };
>>>  
>>>  /* Check whether the PCI device is managed by native PCIe hotplug driver */
>>>
>>
