lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5256328.ZVnrTeLrH1@vostro.rjw.lan>
Date:	Thu, 12 Feb 2015 23:27:56 +0100
From:	"Rafael J. Wysocki" <rjw@...ysocki.net>
To:	KY Srinivasan <kys@...rosoft.com>
Cc:	Vitaly Kuznetsov <vkuznets@...hat.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Haiyang Zhang <haiyangz@...rosoft.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Yasuaki Ishimatsu <isimatu.yasuaki@...fujitsu.com>,
	Tang Chen <tangchen@...fujitsu.com>,
	Vlastimil Babka <vbabka@...e.cz>,
	David Rientjes <rientjes@...gle.com>,
	Fabian Frederick <fabf@...net.be>,
	Zhang Zhen <zhenzhang.zhang@...wei.com>,
	Vladimir Davydov <vdavydov@...allels.com>,
	Wang Nan <wangnan0@...wei.com>,
	"devel@...uxdriverproject.org" <devel@...uxdriverproject.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>
Subject: Re: [PATCH RESEND 0/3] memory_hotplug: hyperv: fix deadlock between memory adding and onlining

On Thursday, February 12, 2015 10:01:48 PM KY Srinivasan wrote:
> 
> > -----Original Message-----
> > From: Rafael J. Wysocki [mailto:rjw@...ysocki.net]
> > Sent: Thursday, February 12, 2015 2:13 PM
> > To: KY Srinivasan
> > Cc: Vitaly Kuznetsov; linux-kernel@...r.kernel.org; Greg Kroah-Hartman;
> > Haiyang Zhang; Andrew Morton; Yasuaki Ishimatsu; Tang Chen; Vlastimil
> > Babka; David Rientjes; Fabian Frederick; Zhang Zhen; Vladimir Davydov;
> > Wang Nan; devel@...uxdriverproject.org; linux-mm@...ck.org
> > Subject: Re: [PATCH RESEND 0/3] memory_hotplug: hyperv: fix deadlock
> > between memory adding and onlining
> > 
> > On Thursday, February 12, 2015 03:38:42 PM KY Srinivasan wrote:
> > >
> > > > -----Original Message-----
> > > > From: Vitaly Kuznetsov [mailto:vkuznets@...hat.com]
> > > > Sent: Thursday, February 12, 2015 2:24 AM
> > > > To: linux-kernel@...r.kernel.org
> > > > Cc: Greg Kroah-Hartman; KY Srinivasan; Haiyang Zhang; Andrew Morton;
> > > > Yasuaki Ishimatsu; Tang Chen; Vlastimil Babka; David Rientjes;
> > > > Fabian Frederick; Zhang Zhen; Vladimir Davydov; Wang Nan; Rafael J.
> > > > Wysocki; devel@...uxdriverproject.org; linux-mm@...ck.org
> > > > Subject: [PATCH RESEND 0/3] memory_hotplug: hyperv: fix deadlock
> > > > between memory adding and onlining
> > > >
> > > > RESEND (with no changes) because Rafael J. Wysocki was missing in
> > > > recepients.
> > > >
> > > > If newly added memory is brought online with e.g. udev rule:
> > > > SUBSYSTEM=="memory", ACTION=="add", ATTR{state}="online"
> > > > the following deadlock is observed (and easily reproducable):
> > > >
> > > > First participant, worker thread doing add_memory():
> > > >
> > > > [  724.948846] kworker/0:1     D ffff88000412f9c8 13248    27      2
> > 0x00000000
> > > > [  724.973543] Workqueue: events hot_add_req [hv_balloon] [
> > > > 724.991736]
> > > > ffff88000412f9c8 0000000000000000 ffff88003fa1dc30 00000000000151c0
> > > > [ 725.019725]  0000000000000246 ffff88000412ffd8 00000000000151c0
> > > > ffff88003a77a4e0 [  725.046486]  ffff88003fa1dc30 00000001032a6000
> > > > ffff88003a7ca838 ffff88003a7ca898 [  725.072969] Call Trace:
> > > > [  725.082690]  [<ffffffff81aac0a9>]
> > > > schedule_preempt_disabled+0x29/0x70
> > > > [  725.103799]  [<ffffffff81aae33b>] mutex_lock_nested+0x14b/0x470 [
> > > > 725.122367]  [<ffffffff815ed773>] ? device_attach+0x23/0xb0 [
> > > > 725.140992] [<ffffffff815ed773>] device_attach+0x23/0xb0 [
> > > > 725.159131] [<ffffffff815ecba0>] bus_probe_device+0xb0/0xe0 [
> > > > 725.177055] [<ffffffff815ea693>] device_add+0x443/0x650 [
> > > > 725.195558] [<ffffffff815ea8be>] device_register+0x1e/0x30 [
> > > > 725.213133] [<ffffffff81601790>] init_memory_block+0xd0/0xf0 [
> > > > 725.231533] [<ffffffff816018f1>] register_new_memory+0xb1/0xd0 [
> > > > 725.250769] [<ffffffff81a961cf>] __add_pages+0x13f/0x250 [
> > > > 725.269642] [<ffffffff81063770>] ? arch_add_memory+0x70/0xf0 [
> > > > 725.288764] [<ffffffff81063770>] arch_add_memory+0x70/0xf0 [
> > > > 725.306117] [<ffffffff81a95f8f>] add_memory+0xef/0x1f0 [
> > > > 725.322466] [<ffffffffa00293af>] hot_add_req+0x33f/0xf90
> > > > [hv_balloon] [  725.342777] [<ffffffff8109509f>]
> > > > process_one_work+0x1df/0x4e0 [  725.361459] [<ffffffff8109502d>] ?
> > > > process_one_work+0x16d/0x4e0 [  725.380390] [<ffffffff810954bb>]
> > > > worker_thread+0x11b/0x450 [  725.397684] [<ffffffff810953a0>] ?
> > > > process_one_work+0x4e0/0x4e0 [  725.416533] [<ffffffff8109ac33>]
> > > > kthread+0xf3/0x110 [  725.433372]  [<ffffffff8109ab40>] ?
> > > > kthread_create_on_node+0x240/0x240
> > > > [  725.453749]  [<ffffffff81ab1dfc>] ret_from_fork+0x7c/0xb0 [
> > > > 725.470994] [<ffffffff8109ab40>] ?
> > > > kthread_create_on_node+0x240/0x240
> > > > [  725.491469] 6 locks held by kworker/0:1/27:
> > > > [  725.505037]  #0:  ("events"){......}, at: [<ffffffff8109502d>]
> > > > process_one_work+0x16d/0x4e0 [  725.533370]  #1:
> > > > ((&dm_device.ha_wrk.wrk)){......}, at: [<ffffffff8109502d>]
> > > > process_one_work+0x16d/0x4e0 [  725.565580]  #2:
> > > > (mem_hotplug.lock){......}, at: [<ffffffff811e6525>]
> > > > mem_hotplug_begin+0x5/0x80 [  725.594369]  #3:
> > > > (mem_hotplug.lock#2){......}, at: [<ffffffff811e656f>]
> > > > mem_hotplug_begin+0x4f/0x80 [  725.628554]  #4:
> > > > (mem_sysfs_mutex){......}, at: [<ffffffff81601873>]
> > > > register_new_memory+0x33/0xd0 [  725.658519]  #5:
> > > > (&dev->mutex){......},
> > > > at: [<ffffffff815ed773>] device_attach+0x23/0xb0
> > > >
> > > > Second participant, udev:
> > > >
> > > > [  725.750889] systemd-udevd   D ffff88003b94fc68 14016   888    530
> > > > 0x00000004
> > > > [  725.773767]  ffff88003b94fc68 0000000000000000 ffff8800034949c0
> > > > 00000000000151c0 [  725.798332]  ffffffff8210d980 ffff88003b94ffd8
> > > > 00000000000151c0 ffff880037a69270 [  725.822841]  ffff8800034949c0
> > > > 0000000100000001 ffff8800034949c0 ffffffff81ff2b48 [  725.849184] Call
> > Trace:
> > > > [  725.858987]  [<ffffffff81aac0a9>]
> > > > schedule_preempt_disabled+0x29/0x70
> > > > [  725.879231]  [<ffffffff81aae33b>] mutex_lock_nested+0x14b/0x470 [
> > > > 725.897860]  [<ffffffff811e656f>] ? mem_hotplug_begin+0x4f/0x80 [
> > > > 725.916698]  [<ffffffff811e656f>] mem_hotplug_begin+0x4f/0x80 [
> > > > 725.935064]  [<ffffffff811e6525>] ? mem_hotplug_begin+0x5/0x80 [
> > > > 725.953464]  [<ffffffff81a9631b>] online_pages+0x3b/0x520 [
> > > > 725.971542] [<ffffffff815eb0b3>] ? device_online+0x23/0xa0 [
> > > > 725.989207] [<ffffffff81601524>] memory_subsys_online+0x64/0xc0 [
> > > > 726.008513] [<ffffffff815eb0fd>] device_online+0x6d/0xa0 [
> > > > 726.025579] [<ffffffff816012eb>] store_mem_state+0x5b/0xe0 [
> > > > 726.043400] [<ffffffff815e8258>] dev_attr_store+0x18/0x30 [
> > > > 726.060506] [<ffffffff8127a808>] sysfs_kf_write+0x48/0x60 [
> > > > 726.077940] [<ffffffff81279d1b>] kernfs_fop_write+0x13b/0x1a0 [
> > > > 726.099416] [<ffffffff811f9f67>] vfs_write+0xb7/0x1f0 [  726.115748]
> > > > [<ffffffff811fabf8>]
> > > > SyS_write+0x58/0xd0 [  726.131933]  [<ffffffff81ab1ea9>]
> > > > system_call_fastpath+0x12/0x17 [  726.150691] 7 locks held by
> > > > systemd-
> > > > udevd/888:
> > > > [  726.165044]  #0:  (sb_writers#3){......}, at:
> > > > [<ffffffff811fa063>]
> > > > vfs_write+0x1b3/0x1f0 [  726.192422]  #1:  (&of->mutex){......}, at:
> > > > [<ffffffff81279c46>] kernfs_fop_write+0x66/0x1a0 [  726.220289]  #2:
> > > > (s_active#60){......}, at: [<ffffffff81279c4e>]
> > > > kernfs_fop_write+0x6e/0x1a0 [ 726.249382]  #3:
> > > > (device_hotplug_lock){......}, at: [<ffffffff815e9c15>]
> > > > lock_device_hotplug_sysfs+0x15/0x50
> > > > [  726.281901]  #4:  (&dev->mutex){......}, at: [<ffffffff815eb0b3>]
> > > > device_online+0x23/0xa0 [  726.308619]  #5:  (mem_hotplug.lock){......},
> > at:
> > > > [<ffffffff811e6525>] mem_hotplug_begin+0x5/0x80 [  726.337994]  #6:
> > > > (mem_hotplug.lock#2){......}, at: [<ffffffff811e656f>]
> > > > mem_hotplug_begin+0x4f/0x80
> > > >
> > > > In short: onlining grabs device lock and then tries to do
> > > > mem_hotplug_begin() while add_memory() is between
> > > > mem_hotplug_begin() and mem_hotplug_done() and it tries grabbing
> > > > device lock.
> > > >
> > > > To my understanding ACPI memory hotplug doesn't have the same issue
> > > > as device_hotplug_lock is being grabbed when the ACPI device is added.
> > > >
> > > > Solve the issue by grabbing device_hotplug_lock before doing
> > > > add_memory(). If we do that, lock_device_hotplug_sysfs() will cause
> > > > syscall retry which will eventually succeed. To support the change
> > > > we need to export lock_device_hotplug/ unlock_device_hotplug. This
> > > > approach can be completely wrong though.
> > >
> > > This issue was first discovered by Andy Whitcroft:
> > > https://lkml.org/lkml/2014/3/14/451
> > > I had sent patches based on Andy's analysis that did not affect the
> > > users of the kernel hot-add memory APIs:
> > > https://lkml.org/lkml/2014/12/2/662
> > >
> > > This patch puts the burden where it needs to be and can address the issue
> > for all clients.
> > 
> > That seems to mean that this series is not needed.  Is that correct?
> 
> This patch was never committed upstream and so the issue still is there.

Well, I'm not sure what to do now to be honest.

Is this series regarded as the right way to address the problem that
everybody is comfortable with?  Or is it still under discussion?

Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ