[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <51C07E92.6090309@gmail.com>
Date: Tue, 18 Jun 2013 23:36:50 +0800
From: Jiang Liu <liuj97@...il.com>
To: "Rafael J. Wysocki" <rjw@...k.pl>
CC: Bjorn Helgaas <bhelgaas@...gle.com>,
Yinghai Lu <yinghai@...nel.org>,
"Alexander E . Patrakov" <patrakov@...il.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Yijing Wang <wangyijing@...wei.com>,
linux-acpi@...r.kernel.org, linux-pci@...r.kernel.org,
linux-kernel@...r.kernel.org, Len Brown <lenb@...nel.org>,
stable@...r.kernel.org, Jiang Liu <jiang.liu@...wei.com>
Subject: Re: [BUGFIX v2 2/4] ACPI, DOCK: resolve possible deadlock scenarios
On 06/17/2013 07:39 PM, Rafael J. Wysocki wrote:
> On Monday, June 17, 2013 01:01:51 AM Jiang Liu wrote:
>> On 06/16/2013 05:20 AM, Rafael J. Wysocki wrote:
>>> On Saturday, June 15, 2013 10:17:42 PM Rafael J. Wysocki wrote:
>>>> On Saturday, June 15, 2013 09:44:28 AM Jiang Liu wrote:
>> [...]
>>>> When it returns from unregister_hotplug_dock_device(), nothing prevents it
>>>> from accessing whatever it wants, because ds->hp_lock is not used outside
>>>> of the add/del and hotplug_dock_devices(). So, the actual role of
>>>> ds->hp_lock (not the one that it is supposed to play, but the real one)
>>>> is to prevent addition/deletion from happening when hotplug_dock_devices()
>>>> is running. [Yes, it does protect the list, but since the list is in fact
>>>> unnecessary, that doesn't matter.]
>>>>
>>>>> If we simply use a flag to mark presence of registered callback, we
>>>>> can't achieve the second goal.
>>>>
>>>> I don't mean using the flag *alone*.
>>>>
>>>>> Take the sony laptop as an example. It has several PCI
>>>>> hotplug
>>>>> slot associated with the dock station:
>>>>> [ 28.829316] acpiphp_glue: _handle_hotplug_event_func: Bus check
>>>>> notify on \_SB_.PCI0.RP07.LPMB
>>>>> [ 30.174964] acpiphp_glue: _handle_hotplug_event_func: Bus check
>>>>> notify on \_SB_.PCI0.RP07.LPMB.LPM0
>>>>> [ 30.174973] acpiphp_glue: _handle_hotplug_event_func: Bus check
>>>>> notify on \_SB_.PCI0.RP07.LPMB.LPM1
>>>>> [ 30.174979] acpiphp_glue: _handle_hotplug_event_func: Bus check
>>>>> notify on \_SB_.PCI0.RP07.LPMB.LPM2
>>>>> [ 30.174985] acpiphp_glue: _handle_hotplug_event_func: Bus check
>>>>> notify on \_SB_.PCI0.RP07.LPMB.LPM2.LPRI.LPR0.GFXA
>>>>> [ 30.175020] acpiphp_glue: _handle_hotplug_event_func: Bus check
>>>>> notify on \_SB_.PCI0.RP07.LPMB.LPM2.LPRI.LPR0.GHDA
>>>>> [ 30.175040] acpiphp_glue: _handle_hotplug_event_func: Bus check
>>>>> notify on \_SB_.PCI0.RP07.LPMB.LPM2.LPRI.LPR1.LPCI.LPC0.DLAN
>>>>> [ 30.175050] acpiphp_glue: _handle_hotplug_event_func: Bus check
>>>>> notify on \_SB_.PCI0.RP07.LPMB.LPM2.LPRI.LPR1.LPCI.LPC1.DODD
>>>>> [ 30.175060] acpiphp_glue: _handle_hotplug_event_func: Bus check
>>>>> notify on \_SB_.PCI0.RP07.LPMB.LPM2.LPRI.LPR1.LPCI.LPC2.DUSB
>>>>>
>>>>> So it still has some race windows if we undock the station while
>>>>> repeatedly rescanning/removing
>>>>> the PCI bus for \_SB_.PCI0.RP07.LPMB.LPM0 through sysfs interfaces.
>>>
>>> Which sysfs interfaces do you mean, by the way?
>>>
>>> If you mean "eject", then it takes acpi_scan_lock and hotplug_dock_devices()
>>> should always be run under acpi_scan_lock too. It isn't at the moment,t
>>> because write_undock() doesn't take acpi_scan_lock(), but this is an obvious
>>> bug (so I'm going to send a patch to fix it in a while).
>>>
>>> With that bug fixed, the possible race between acpi_eject_store() and
>>> hotplug_dock_devices() should be prevented from happening, so perhaps we're
>>> worrying about something that cannot happen?
>> Hi Rafael,
>> I mean the "remove" method of each PCI device, and the "power" method
>> of PCI hotplug slot here.
>> These methods may be used to remove P2P bridges with associated ACPIPHP
>> hotplug slots, which in turn will cause invoking of
>> unregister_hotplug_dock_device().
>> So theoretical we may trigger the bug by undocking while repeatedly
>> adding/removing P2P bridges with ACPIPHP hotplug slot through PCI
>> "rescan" and "remove" sysfs interface,
>
> Why don't we make these things take acpi_scan_lock upfront, then?
Hi Rafael,
Seems we can't rely on acpi_scan_lock here, it may cause another
deadlock scenario:
1) thread 1 acquired the acpi_scan_lock and tries to destroy all sysfs
interfaces for PCI devices.
2) thread 2 opens a PCI sysfs which then tries to acquire the
acpi_scan_lock.
Regards!
Gerry
>
> Rafael
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists