[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <8974656.5GmDyxHW7F@vostro.rjw.lan>
Date: Sat, 15 Jun 2013 22:17:42 +0200
From: "Rafael J. Wysocki" <rjw@...k.pl>
To: Jiang Liu <liuj97@...il.com>
Cc: Bjorn Helgaas <bhelgaas@...gle.com>,
Yinghai Lu <yinghai@...nel.org>,
"Alexander E . Patrakov" <patrakov@...il.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Yijing Wang <wangyijing@...wei.com>,
linux-acpi@...r.kernel.org, linux-pci@...r.kernel.org,
linux-kernel@...r.kernel.org, Len Brown <lenb@...nel.org>,
stable@...r.kernel.org, Jiang Liu <jiang.liu@...wei.com>
Subject: Re: [BUGFIX v2 2/4] ACPI, DOCK: resolve possible deadlock scenarios
On Saturday, June 15, 2013 09:44:28 AM Jiang Liu wrote:
> On Sat 15 Jun 2013 06:21:02 AM CST, Rafael J. Wysocki wrote:
> > On Saturday, June 15, 2013 03:27:59 AM Jiang Liu wrote:
> >> This is a preparation for next patch to avoid breaking bisecting.
> >> If next patch is applied without this one, it will cause deadlock
> >> as below:
> >>
> >> Case 1:
> >> [ 31.015593] Possible unsafe locking scenario:
> >>
> >> [ 31.018350] CPU0 CPU1
> >> [ 31.019691] ---- ----
> >> [ 31.021002] lock(&dock_station->hp_lock);
> >> [ 31.022327] lock(&slot->crit_sect);
> >> [ 31.023650] lock(&dock_station->hp_lock);
> >> [ 31.025010] lock(&slot->crit_sect);
> >> [ 31.026342]
> >>
> >> Case 2:
> >> hotplug_dock_devices()
> >> mutex_lock(&ds->hp_lock)
> >> dd->ops->handler()
> >> register_hotplug_dock_device()
> >> mutex_lock(&ds->hp_lock)
> >> [ 34.316570] [ INFO: possible recursive locking detected ]
> >> [ 34.316573] 3.10.0-rc4 #6 Tainted: G C
> >> [ 34.316575] ---------------------------------------------
> >> [ 34.316577] kworker/0:0/4 is trying to acquire lock:
> >> [ 34.316579] (&dock_station->hp_lock){+.+.+.}, at:
> >> [<ffffffff813c766b>] register_hotplug_dock_device+0x6a/0xbf
> >> [ 34.316588]
> >> but task is already holding lock:
> >> [ 34.316590] (&dock_station->hp_lock){+.+.+.}, at:
> >> [<ffffffff813c7270>] hotplug_dock_devices+0x2c/0xda
> >> [ 34.316595]
> >> other info that might help us debug this:
> >> [ 34.316597] Possible unsafe locking scenario:
> >>
> >> [ 34.316599] CPU0
> >> [ 34.316601] ----
> >> [ 34.316602] lock(&dock_station->hp_lock);
> >> [ 34.316605] lock(&dock_station->hp_lock);
> >> [ 34.316608]
> >> *** DEADLOCK ***
> >>
> >> So fix this deadlock by not taking ds->hp_lock in function
> >> register_hotplug_dock_device(). This patch also fixes a possible
> >> race conditions in function dock_event() because previously it
> >> accesses ds->hotplug_devices list without holding ds->hp_lock.
> >>
> >> Signed-off-by: Jiang Liu <jiang.liu@...wei.com>
> >> Cc: Len Brown <lenb@...nel.org>
> >> Cc: "Rafael J. Wysocki" <rjw@...k.pl>
> >> Cc: Bjorn Helgaas <bhelgaas@...gle.com>
> >> Cc: Yinghai Lu <yinghai@...nel.org>
> >> Cc: Yijing Wang <wangyijing@...wei.com>
> >> Cc: linux-acpi@...r.kernel.org
> >> Cc: linux-kernel@...r.kernel.org
> >> Cc: linux-pci@...r.kernel.org
> >> Cc: <stable@...r.kernel.org> # 3.8+
> >> ---
> >> drivers/acpi/dock.c | 109 ++++++++++++++++++++++---------------
> >> drivers/pci/hotplug/acpiphp_glue.c | 15 +++++
> >> include/acpi/acpi_drivers.h | 2 +
> >> 3 files changed, 82 insertions(+), 44 deletions(-)
> >>
> >> diff --git a/drivers/acpi/dock.c b/drivers/acpi/dock.c
> >> index 02b0563..602bce5 100644
> >> --- a/drivers/acpi/dock.c
> >> +++ b/drivers/acpi/dock.c
> >> @@ -66,7 +66,7 @@ struct dock_station {
> >> spinlock_t dd_lock;
> >> struct mutex hp_lock;
> >> struct list_head dependent_devices;
> >> - struct list_head hotplug_devices;
> >> + struct klist hotplug_devices;
> >>
> >> struct list_head sibling;
> >> struct platform_device *dock_device;
> >> @@ -76,12 +76,18 @@ static int dock_station_count;
> >>
> >> struct dock_dependent_device {
> >> struct list_head list;
> >> - struct list_head hotplug_list;
> >> + acpi_handle handle;
> >> +};
> >> +
> >> +struct dock_hotplug_info {
> >> + struct klist_node node;
> >> acpi_handle handle;
> >> const struct acpi_dock_ops *ops;
> >> void *context;
> >> };
> >
> > Can we please relax a bit and possibly take a step back?
> >
> > So since your last reply to me wasn't particularly helpful, I went through the
> > code in dock.c and acpiphp_glue.c and I simply think that the whole
> > hotplug_list thing is simply redundant.
> >
> > It looks like instead of using it (or the klist in this patch), we can add a
> > "hotlpug_device" flag to dock_dependent_device and set that flag instead of
> > adding dd to hotplug_devices or clear it instead of removing dd from that list.
> >
> > That would allow us to avoid the deadlock, because we wouldn't need the hp_lock
> > any more and perhaps we could make the code simpler instead of making it more
> > complex.
> >
> > How does that sound?
> >
> > Rafael
> Hi Rafael,
> Thanks for comments! It would be great if we could kill the
> hotplug_devices
> list so thing gets simple. But there are still some special cases:(
>
> As you have mentioned, ds->hp_lock is used to make both addition and
> removal
> of hotplug devices wait for us to complete walking ds->hotplug_devices.
> So it acts as two roles:
> 1) protect the hotplug_devices list,
> 2) serialize unregister_hotplug_dock_device() and
> hotplug_dock_devices() so
> the dock driver doesn't access registered handler and associated data
> structure
> once returing from unregister_hotplug_dock_device().
When it returns from unregister_hotplug_dock_device(), nothing prevents it
from accessing whatever it wants, because ds->hp_lock is not used outside
of the add/del and hotplug_dock_devices(). So, the actual role of
ds->hp_lock (not the one that it is supposed to play, but the real one)
is to prevent addition/deletion from happening when hotplug_dock_devices()
is running. [Yes, it does protect the list, but since the list is in fact
unnecessary, that doesn't matter.]
> If we simply use a flag to mark presence of registered callback, we
> can't achieve the second goal.
I don't mean using the flag *alone*.
> Take the sony laptop as an example. It has several PCI
> hotplug
> slot associated with the dock station:
> [ 28.829316] acpiphp_glue: _handle_hotplug_event_func: Bus check
> notify on \_SB_.PCI0.RP07.LPMB
> [ 30.174964] acpiphp_glue: _handle_hotplug_event_func: Bus check
> notify on \_SB_.PCI0.RP07.LPMB.LPM0
> [ 30.174973] acpiphp_glue: _handle_hotplug_event_func: Bus check
> notify on \_SB_.PCI0.RP07.LPMB.LPM1
> [ 30.174979] acpiphp_glue: _handle_hotplug_event_func: Bus check
> notify on \_SB_.PCI0.RP07.LPMB.LPM2
> [ 30.174985] acpiphp_glue: _handle_hotplug_event_func: Bus check
> notify on \_SB_.PCI0.RP07.LPMB.LPM2.LPRI.LPR0.GFXA
> [ 30.175020] acpiphp_glue: _handle_hotplug_event_func: Bus check
> notify on \_SB_.PCI0.RP07.LPMB.LPM2.LPRI.LPR0.GHDA
> [ 30.175040] acpiphp_glue: _handle_hotplug_event_func: Bus check
> notify on \_SB_.PCI0.RP07.LPMB.LPM2.LPRI.LPR1.LPCI.LPC0.DLAN
> [ 30.175050] acpiphp_glue: _handle_hotplug_event_func: Bus check
> notify on \_SB_.PCI0.RP07.LPMB.LPM2.LPRI.LPR1.LPCI.LPC1.DODD
> [ 30.175060] acpiphp_glue: _handle_hotplug_event_func: Bus check
> notify on \_SB_.PCI0.RP07.LPMB.LPM2.LPRI.LPR1.LPCI.LPC2.DUSB
>
> So it still has some race windows if we undock the station while
> repeatedly rescanning/removing
> the PCI bus for \_SB_.PCI0.RP07.LPMB.LPM0 through sysfs interfaces. For
> example, thread 1 is
> handling undocking event, walking the dependent device list and
> invoking registered callback
> handler with associated data. While that, thread 2 may step in to
> unregister the callback for
> \_SB_.PCI0.RP07.LPMB.LPM0. Then thread 1 may cause access-after-free
> issue.
That should be handled in acpiphp_glue instead of dock. What you're trying to
do is to make dock work around synchronization issues in the acpiphp driver.
> The klist patch solves this issue by adding a "put" callback method to
> explicitly notify
> dock client that the dock core has done with previously registered
> handler and associated
> data.
Honestly, don't you think this is overly compilcated?
Rafael
--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists