Message-ID: <12669324.bPBpI0mOPP@vostro.rjw.lan>
Date: Thu, 03 Sep 2015 02:58:16 +0200
From: "Rafael J. Wysocki" <rjw@...ysocki.net>
To: Tejun Heo <tj@...nel.org>
Cc: Jiang Liu <jiang.liu@...ux.intel.com>,
linux hotplug mailing <linux-hotplug@...r.kernel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
ACPI Devel Maling List <linux-acpi@...r.kernel.org>
Subject: Re: Possible deadlock related to CPU hotplug and kernfs
On Wednesday, September 02, 2015 12:14:45 PM Tejun Heo wrote:
> On Tue, Sep 01, 2015 at 03:12:34PM +0800, Jiang Liu wrote:
> > Hi Rafael and Tejun,
> > Running CPU hotplug tests triggers a lockdep warning, as
> > follows. The two possible deadlock paths are:
> > 1) echo x > /sys/devices/system/cpu/cpux/online
> > ->kernfs_fop_write()
> > ->kernfs_get_active()
> > 1.a) ->rwsem_acquire_read(&kn->dep_map, 0, 1, _RET_IP_);
> > ->cpu_up()
> > 1.b) ->cpu_hotplug_begin()[lock_map_acquire(&cpu_hotplug.dep_map)]
> > 2) hardware triggers hotplug events
> > ->acpi_device_hotplug()
> > ->acpi_processor_remove()
> > 2.a) ->cpu_hotplug_begin()[lock_map_acquire(&cpu_hotplug.dep_map)]
> > ->unregister_cpu()
> > ->device_del()
> > ->kernfs_remove_by_name_ns()
> > ->__kernfs_remove()
> > ->kernfs_drain()
> > 2.b) ->rwsem_acquire(&kn->dep_map, 0, 0, _RET_IP_)
> >
> > So there is a possible deadlock scenario among 1.a, 1.b, 2.a and 2.b.
> > I'm not familiar with kernfs, so could you please help to comment:
> > 1) whether this is a real deadlock issue?
>
> Yes, it seems to be. It's highly unlikely but still possible.
Hmm.

So acpi_device_hotplug() calls lock_device_hotplug() which simply
acquires device_hotplug_lock. It is held throughout the entire
hot-add/hot-remove code path.

Writing anything to /sys/devices/system/cpu/cpux/online goes through
online_store() in drivers/base/core.c and that does
lock_device_hotplug_sysfs() which then attempts to acquire
device_hotplug_lock using mutex_trylock(). And it only calls
either device_online() or device_offline() if it ends up with the
lock held.
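
A rough userspace sketch of that locking pattern, for illustration only.
It is not the code from drivers/base/core.c: every name ending in
_model is hypothetical, a pthread mutex stands in for
device_hotplug_lock, and returning -EAGAIN on contention is an
assumption of the model (the kernel handles the contended case its own
way).

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the kernel's device_hotplug_lock. */
static pthread_mutex_t hotplug_lock_model = PTHREAD_MUTEX_INITIALIZER;

/* Counts how often the online/offline work actually ran. */
static int online_calls;

/* Models the hot-add/hot-remove path: blocks until the lock is free. */
static void lock_hotplug_model(void)
{
	pthread_mutex_lock(&hotplug_lock_model);
}

static void unlock_hotplug_model(void)
{
	pthread_mutex_unlock(&hotplug_lock_model);
}

/*
 * Models the sysfs write path: it never sleeps on the lock.
 * Returns 0 with the lock held, or -EAGAIN if it is contended
 * (the error code is an assumption made for this sketch).
 */
static int lock_hotplug_sysfs_model(void)
{
	if (pthread_mutex_trylock(&hotplug_lock_model) == 0)
		return 0;
	return -EAGAIN;
}

/*
 * Models the store callback: the online/offline work runs only when
 * the trylock succeeded, so it is serialized against the hotplug path.
 */
static int online_store_model(int online)
{
	int ret = lock_hotplug_sysfs_model();

	if (ret)
		return ret;

	(void)online;		/* a real callback would online or offline here */
	online_calls++;
	unlock_hotplug_model();
	return 0;
}
```

While the hotplug path holds the lock, online_store_model() backs off
without doing any work, so in this model the writer never waits on the
lock while the remover is in the middle of its work.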

Quite frankly, I don't see how these particular two code paths can
deadlock in any way.

So either a third code path is involved which is not executed
under device_hotplug_lock, or lockdep needs to be told to actually
take device_hotplug_lock into account in this case IMO.

> > 2) any recommended way to get it fixed?
>
> This usually happens with "delete" files and it's worked around by
> performing special self-removal on the file before actually removing
> the device. I suppose on/offline files would need to turn off
> active_protection with kernfs_[un]break_active_protection() which
> should probably grow sysfs and device layer wrappers.
Thanks,
Rafael