linux-kernel - Re: latest -git: hibernate: possible circular locking dependency detected

Message-ID: <b647ffbd0808211038h686f9df3ud335efbc46242952@mail.gmail.com>
Date:	Thu, 21 Aug 2008 19:38:23 +0200
From:	"Dmitry Adamushko" <dmitry.adamushko@...il.com>
To:	"Vegard Nossum" <vegard.nossum@...il.com>
Cc:	"Oleg Nesterov" <oleg@...sign.ru>,
	"Peter Zijlstra" <a.p.zijlstra@...llo.nl>,
	"Rafael J. Wysocki" <rjw@...k.pl>,
	"Max Krasnyanskiy" <maxk@...lcomm.com>,
	"Linux Kernel Mailing List" <linux-kernel@...r.kernel.org>
Subject: Re: latest -git: hibernate: possible circular locking dependency detected

Hi,

[ cc: Peter and Oleg ]

heh, my mind might have been also 'hibernated' by the evening but I
still dare to speculate :-)


> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.27-rc4-00003-ga798564 #28
> -------------------------------------------------------
> events/0/10 is trying to acquire lock:
>  (cpu_add_remove_lock){--..}, at: [<c013bd9f>] cpu_maps_update_begin+0xf/0x20
> but task is already holding lock:
>  (poweroff_work){--..}, at: [<c014ae17>] run_workqueue+0x107/0x200
> which lock already depends on the new lock.
> the existing dependency chain (in reverse order) is:
> -> #2 (poweroff_work){--..}:
>       [<c015f9e6>] validate_chain+0x976/0xe90
>       [<c0160159>] __lock_acquire+0x259/0xa00
>       [<c0160989>] lock_acquire+0x89/0xc0
>       [<c014ae75>] run_workqueue+0x165/0x200
>       [<c014b9bd>] worker_thread+0x7d/0xe0
>       [<c014e252>] kthread+0x42/0x70
>       [<c0105cf3>] kernel_thread_helper+0x7/0x14
>       [<ffffffff>] 0xffffffff
> -> #1 (events){--..}:
>       [<c015f9e6>] validate_chain+0x976/0xe90
>       [<c0160159>] __lock_acquire+0x259/0xa00
>       [<c0160989>] lock_acquire+0x89/0xc0
>       [<c014b09e>] cleanup_workqueue_thread+0x3e/0x70
>       [<c0663bfc>] workqueue_cpu_callback+0x7c/0x130
>       [<c0152f97>] notifier_call_chain+0x37/0x70
>       [<c0153009>] __raw_notifier_call_chain+0x19/0x20
>       [<c015302a>] raw_notifier_call_chain+0x1a/0x20
>       [<c0661e9a>] _cpu_down+0x1ea/0x270
>       [<c013bf18>] disable_nonboot_cpus+0x58/0xe0
>       [<c0168897>] hibernation_snapshot+0x117/0x220
>       [<c0168a80>] hibernate+0xe0/0x180
>       [<c01675ef>] state_store+0xbf/0xd0
>       [<c0375f84>] kobj_attr_store+0x24/0x30
>       [<c01fa4c2>] sysfs_write_file+0xa2/0x100
>       [<c01bbf96>] vfs_write+0x96/0x130
>       [<c01bc4dd>] sys_write+0x3d/0x70
>       [<c0104f3b>] sysenter_do_call+0x12/0x3f

this path is triggered as a result of "echo disk > /sys/power/state"

disable_nonboot_cpus() calls cpu_maps_update_begin() which takes
"cpu_add_remove_lock" (lock-1).

Further down the road, cleanup_workqueue_thread() ->
flush_cpu_workqueue() takes "cwq->lock" (lock-2).
So this should be the second lock.


>       [<ffffffff>] 0xffffffff
> -> #0 (cpu_add_remove_lock){--..}:
>       [<c015f5ba>] validate_chain+0x54a/0xe90
>       [<c0160159>] __lock_acquire+0x259/0xa00
>       [<c0160989>] lock_acquire+0x89/0xc0
>       [<c06859cb>] mutex_lock_nested+0xab/0x300
>       [<c013bd9f>] cpu_maps_update_begin+0xf/0x20
>       [<c013bed3>] disable_nonboot_cpus+0x13/0xe0
>       [<c0148990>] kernel_power_off+0x20/0x40
>       [<c016c0a8>] do_poweroff+0x8/0x10
>       [<c014ae7a>] run_workqueue+0x16a/0x200
>       [<c014b9bd>] worker_thread+0x7d/0xe0
>       [<c014e252>] kthread+0x42/0x70
>       [<c0105cf3>] kernel_thread_helper+0x7/0x14

hmm, did you somehow hit "Sysrq + o"?

'cause I don't see any other places (say, with handle_sysrq(k, ...)
where "k" might be 'o') from where do_poweroff() might have been
triggered...

however, I think there are 2 problems with handle_poweroff()
[ kernel/power/poweroff.c ]

(1) it doesn't ensure that the 'cpu' it gets via
first_cpu(cpu_online_map) can't disappear (race with cpu_down()) on
the way to schedule_work_on()

[ I presume neither generic sysrq nor the console layer takes care of
it. They shouldn't, of course ]
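
[ To make the window in (1) concrete, here is a toy userspace model, not
kernel code. CpuMap, pick_then_schedule() and the in_between hook are
hypothetical names made up for the illustration; only the idea of
"pick the first online cpu, then schedule work on it later" comes from
the description above. ]

```python
class CpuMap:
    """Toy stand-in for cpu_online_map (hypothetical, not kernel code)."""

    def __init__(self, cpus):
        self.online = set(cpus)

    def first_cpu(self):
        # Lowest-numbered online CPU, in the spirit of first_cpu(cpu_online_map).
        return min(self.online)

    def cpu_down(self, cpu):
        # Models hotplug taking a CPU offline.
        self.online.discard(cpu)


def pick_then_schedule(cpu_map, in_between=lambda: None):
    """Pick a CPU, let something else run, then 'schedule' on that CPU.

    Returns (cpu, still_online). The second value is what a later
    schedule_work_on(cpu, ...) would silently be relying on.
    """
    cpu = cpu_map.first_cpu()
    in_between()  # the unprotected window between the pick and the use
    return cpu, cpu in cpu_map.online


quiet = CpuMap([0, 1])
print(pick_then_schedule(quiet))

racy = CpuMap([0, 1])
print(pick_then_schedule(racy, lambda: racy.cpu_down(0)))
```

[ In the second call the chosen cpu has already gone away by the time it
is used, which is the race suspected in (1). Whether cpu 0 can really be
unplugged on a given architecture is a separate question the model
ignores. ]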

(2) run_workqueue() [ which in the end calls do_poweroff() ] takes the
"cwq->lock" (which is lock-2 in our terminology)

well, actually it releases it before calling "work->func()" but is the
'lockdep' annotation right here? Peter?

(I admit, I never looked at lockdep and do make assumptions on its syntax here).

Lock-1 will then be taken as a result of

do_poweroff() -> kernel_power_off() -> disable_nonboot_cpus()

which calls cpu_maps_update_begin() and takes "cpu_add_remove_lock"

and this looks dangerous, for the same reason that the use of
get_online_cpus() by workqueue handlers was dangerous before
CPU_POST_DEAD was introduced
(http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=3da1c84c00c7e5fa8348336bd8c342f9128b0f14)

I guess it may deadlock: on the hibernate path lock-1 has already been
taken before calling cleanup_workqueue_thread() -> flush_cpu_workqueue(),
and completion of the poweroff work chain depends in turn on being able
to acquire the very same lock.

hm?
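
[ The ordering argument can be sketched as a tiny lock-order graph. This
is only a rough userspace illustration of the idea behind the report and
not how lockdep is actually implemented. LockOrderGraph, record() and
find_cycle() are hypothetical names; the lock names are copied from the
report, where "events" plays the role of lock-2 in my terminology. ]

```python
from collections import defaultdict


class LockOrderGraph:
    """Records 'B was taken while A was held' edges and looks for a cycle."""

    def __init__(self):
        self.edges = defaultdict(set)

    def record(self, chain):
        # chain lists locks in acquisition order on one code path;
        # every earlier lock is considered held when a later one is taken.
        for i, outer in enumerate(chain):
            for inner in chain[i + 1:]:
                self.edges[outer].add(inner)

    def find_cycle(self):
        # Depth-first search; returns one cycle as a list of lock names
        # (first name repeated at the end), or None if the order is consistent.
        state = {}
        path = []

        def visit(node):
            state[node] = "active"
            path.append(node)
            for nxt in sorted(self.edges.get(node, ())):
                if state.get(nxt) == "active":
                    return path[path.index(nxt):] + [nxt]
                if nxt not in state:
                    found = visit(nxt)
                    if found:
                        return found
            path.pop()
            state[node] = "done"
            return None

        for node in sorted(self.edges):
            if node not in state:
                found = visit(node)
                if found:
                    return found
        return None


graph = LockOrderGraph()
# Hibernate path: disable_nonboot_cpus() holds cpu_add_remove_lock, then
# cleanup_workqueue_thread() touches the "events" workqueue.
graph.record(["cpu_add_remove_lock", "events"])
# Poweroff path: events/0 runs poweroff_work, whose handler ends up in
# cpu_maps_update_begin().
graph.record(["events", "poweroff_work", "cpu_add_remove_lock"])

cycle = graph.find_cycle()
print(" -> ".join(cycle) if cycle else "no cycle")
```

[ With only the hibernate path recorded there is no cycle; the poweroff
path is what closes the loop. ]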


>       [<ffffffff>] 0xffffffff
> other info that might help us debug this:
> 2 locks held by events/0/10:
>  #0:  (events){--..}, at: [<c014ae17>] run_workqueue+0x107/0x200
>  #1:  (poweroff_work){--..}, at: [<c014ae17>] run_workqueue+0x107/0x200
> stack backtrace:
> Pid: 10, comm: events/0 Not tainted 2.6.27-rc4-00003-ga798564 #28
>  [<c015f029>] print_circular_bug_tail+0x79/0xc0
>  [<c015c193>] ? print_circular_bug_entry+0x43/0x50
>  [<c015f5ba>] validate_chain+0x54a/0xe90
>  [<c015b565>] ? add_lock_to_list+0x45/0xb0
>  [<c039b7dd>] ? acpi_os_acquire_lock+0x8/0xa
>  [<c0160159>] __lock_acquire+0x259/0xa00
>  [<c015b67b>] ? trace_hardirqs_off+0xb/0x10
>  [<c0160989>] lock_acquire+0x89/0xc0
>  [<c013bd9f>] ? cpu_maps_update_begin+0xf/0x20
>  [<c06859cb>] mutex_lock_nested+0xab/0x300
>  [<c013bd9f>] ? cpu_maps_update_begin+0xf/0x20
>  [<c013bd9f>] ? cpu_maps_update_begin+0xf/0x20
>  [<c013bd9f>] cpu_maps_update_begin+0xf/0x20
>  [<c013bed3>] disable_nonboot_cpus+0x13/0xe0
>  [<c0148990>] kernel_power_off+0x20/0x40
>  [<c016c0a8>] do_poweroff+0x8/0x10
>  [<c014ae7a>] run_workqueue+0x16a/0x200
>  [<c014ae17>] ? run_workqueue+0x107/0x200
>  [<c016c0a0>] ? do_poweroff+0x0/0x10
>  [<c014b9bd>] worker_thread+0x7d/0xe0
>  [<c014e560>] ? autoremove_wake_function+0x0/0x50
>  [<c014b940>] ? worker_thread+0x0/0xe0
>  [<c014e252>] kthread+0x42/0x70
>  [<c014e210>] ? kthread+0x0/0x70
>  [<c0105cf3>] kernel_thread_helper+0x7/0x14
>  =======================
> Disabling non-boot CPUs ...
> Unmapping cpu 1 from all nodes
> Unmapping cpu 1 from all nodes
> CPU 1 is now offline
> lockdep: fixing up alternatives.
> SMP alternatives: switching to UP code
> CPU0 attaching NULL sched-domain.
> CPU1 attaching NULL sched-domain.
> CPU0 attaching sched-domain:
>  domain 0: span 0 level CPU
>  groups: 0
> CPU1 is down
> Power down.
> acpi_power_off called
>  hwsleep-0326 [00] enter_sleep_state     : Entering sleep state [S5]
>
> ...the machine would shut down, but not resume:
>
> Trying to resume from /dev/VolGroup00/LogVol01
> No suspend signature on swap, not resuming.
> Creating root device.
> Mounting root filesystem.
>
> I previously also saw some SLUB errors on resuming (but those didn't
> make it to the serial console, unfortunately); will post follow-up if
> I can manage to get a capture.
>
> I don't know if it's related, but I was reading some files from /sys
> while doing the "echo disk > /sys/power/state".
>
>
> Vegard
>
> --
> "The animistic metaphor of the bug that maliciously sneaked in while
> the programmer was not looking is intellectually dishonest as it
> disguises that the error is the programmer's own creation."
>        -- E. W. Dijkstra, EWD1036
>



-- 
Best regards,
Dmitry Adamushko