Date:	Thu, 22 Nov 2007 11:46:06 +0100
From:	Jiri Slaby <jirislaby@...il.com>
To:	davej@...emonkey.org.uk
CC:	cpufreq@...ts.linux.org.uk,
	Linux kernel mailing list <linux-kernel@...r.kernel.org>,
	Greg KH <gregkh@...e.de>
Subject: cpufreq deadlock (?sysfs related?)

Hi,

some people have hit this bug; I'm able to reproduce it too, but I'm out of
ideas as to what could have caused it. Here are some traces of processes stuck
in the D state:

tee           D ffff8100041e3c28     0 15798  13503
 ffff8100041e3bc8 0000000000000086 0000000000000000 ffffffff804a007b
 ffff8100041e3be8 ffff8100029d8000 ffff810001fc9560 ffffffff8828e094
 ffff810008af1ba0 ffff8100029d8000 ffff810001fc9560 ffff81002ae74e40
Call Trace:
 [<ffffffff804a007b>] thread_return+0x0/0x5a5
 [<ffffffff8828e094>] :cpufreq_stats:cpufreq_stats_update+0x54/0x70 /*
_spin_unlock */
 [<ffffffff804a0731>] wait_for_completion+0xa1/0xf0
 [<ffffffff8022ee90>] default_wake_function+0x0/0x10
 [<ffffffff802d583f>] sysfs_addrm_finish+0x1ef/0x270 /* wait_for_completion */
 [<ffffffff802d3fb6>] sysfs_hash_and_remove+0xa6/0xc0
 [<ffffffff880120cc>] :cpufreq_userspace:cpufreq_governor_userspace+0xac/0x220
 [<ffffffff803c4cd2>] __cpufreq_governor+0x32/0x110
 [<ffffffff803c56f3>] __cpufreq_set_policy+0x113/0x180
 [<ffffffff803c5852>] store_scaling_governor+0xf2/0x200
 [<ffffffff803c5f50>] handle_update+0x0/0x10

the scheduled work started here ^^^

 [<ffffffff8026a2c3>] __alloc_pages+0x73/0x320 /* get_page_from_freelist */
 [<ffffffff803c662b>] store+0x7b/0x90 /* fattr->store(), holding write sem */
 [<ffffffff802d46ef>] sysfs_write_file+0xcf/0x150
 [<ffffffff8028c778>] vfs_write+0xc8/0x170
 [<ffffffff8028ce53>] sys_write+0x53/0x90
 [<ffffffff8020bd6e>] system_call+0x7e/0x83

tee           D 0000000000000000     0 15800  22179
 ffff8100058b7e28 0000000000000086 0000000000000000 001bdcd0000f32a0
 0000000000000000 ffff810005f8ce40 ffff810001fd6e40 0000000000000001
 0000000101281d48 000000008057bec0 0000000000000003 000280d000000000
Call Trace:
 [<ffffffff804a1ed9>] __down_write_nested+0x79/0xc0
 [<ffffffff803c5e21>] lock_policy_rwsem_write+0x41/0x80
 [<ffffffff803c660c>] store+0x5c/0x90
 [<ffffffff802d46ef>] sysfs_write_file+0xcf/0x150
 [<ffffffff8028c778>] vfs_write+0xc8/0x170
 [<ffffffff8028ce53>] sys_write+0x53/0x90
 [<ffffffff8020bd6e>] system_call+0x7e/0x83

cat           D 0000000000000000     0 15801  13587
 ffff810008f69e38 0000000000000082 0000000000000000 ffffffff8026a119
 ffffffff80624fa0 ffff810005f8d560 ffffffff805703a0 ffffffff8057bec0
 0000000101282286 0000000000000000 0000000000000003 000000010697e4a0
Call Trace:
 [<ffffffff8026a119>] get_page_from_freelist+0x339/0x470
 [<ffffffff804a1fa9>] __down_read+0x79/0xb2 /* schedule */
 [<ffffffff803c6681>] lock_policy_rwsem_read+0x41/0x80
 [<ffffffff803c67cd>] show+0x4d/0x80
 [<ffffffff802d4bed>] sysfs_read_file+0x9d/0x150
 [<ffffffff8028c8e5>] vfs_read+0xc5/0x160
 [<ffffffff8028cdc3>] sys_read+0x53/0x90
 [<ffffffff8020bd6e>] system_call+0x7e/0x83

I reproduced it by executing these 3 scripts:
#!/bin/bash

while true
do
echo userspace | tee /sys/devices/system/cpu/*/cpufreq/scaling_governor
cat /sys/devices/system/cpu/*/cpufreq/scaling_governor
sleep 2
echo performance | tee /sys/devices/system/cpu/*/cpufreq/scaling_governor
cat /sys/devices/system/cpu/*/cpufreq/scaling_governor
sleep 4
echo ondemand | tee /sys/devices/system/cpu/*/cpufreq/scaling_governor
cat /sys/devices/system/cpu/*/cpufreq/scaling_governor
sleep 10
done

#!/bin/bash

while true
do
echo userspace | tee /sys/devices/system/cpu/*/cpufreq/scaling_governor
cat /sys/devices/system/cpu/*/cpufreq/scaling_governor
for (( a = 0; a < 10000; a++ )); do echo aaa >/dev/null; done
echo performance | tee /sys/devices/system/cpu/*/cpufreq/scaling_governor
cat /sys/devices/system/cpu/*/cpufreq/scaling_governor
for (( a = 0; a < 11500; a++ )); do echo aaa >/dev/null; done
echo ondemand | tee /sys/devices/system/cpu/*/cpufreq/scaling_governor
cat /sys/devices/system/cpu/*/cpufreq/scaling_governor
sleep 1
done

#!/bin/bash

while true
do cat /sys/devices/system/cpu/*/cpufreq/*
sleep 10
done

This was on 2.6.23; however, it is known to exist in older kernels as well. I
suspect a race between wait_for_completion() and complete() in sysfs/dir.c --
complete() wakes only one process, not the other (would complete_all() help?).
Any ideas?

Here are the events workers (they seem OK):
events/1      S 0000000000000000     0 13126      2
 ffff81003fe71ec0 0000000000000046 0000000000000000 0000000000000202
 ffff810001e120c0 ffff81003d821560 ffff810001fd6e40 00000000ffffffff
 00000001012df7e1 0000000080248f2a 0000000000000003 ffff810001e120c8
Call Trace:
 [<ffffffff80249060>] worker_thread+0x0/0x130
 [<ffffffff80249165>] worker_thread+0x105/0x130
 [<ffffffff8024cb20>] autoremove_wake_function+0x0/0x30
 [<ffffffff80249060>] worker_thread+0x0/0x130
 [<ffffffff80249060>] worker_thread+0x0/0x130
 [<ffffffff8024c71b>] kthread+0x4b/0x80
 [<ffffffff8020cbb8>] child_rip+0xa/0x12
 [<ffffffff8024c6d0>] kthread+0x0/0x80
 [<ffffffff8020cbae>] child_rip+0x0/0x12

events/0      S 0000000000000000     0     7      2
 ffff81003ff0bec0 0000000000000046 0000000000000000 0000000000000202
 ffff810001e0a0c0 ffff810001fd7560 ffffffff805703a0 00000000ffffffff
 00000001012df7c6 0000000080248f2a 0000000000000003 ffff810001e0a0c8
Call Trace:
 [<ffffffff80249060>] worker_thread+0x0/0x130
 [<ffffffff80249165>] worker_thread+0x105/0x130
 [<ffffffff8024cb20>] autoremove_wake_function+0x0/0x30
 [<ffffffff80249060>] worker_thread+0x0/0x130
 [<ffffffff80249060>] worker_thread+0x0/0x130
 [<ffffffff8024c71b>] kthread+0x4b/0x80
 [<ffffffff8020cbb8>] child_rip+0xa/0x12
 [<ffffffff8024c6d0>] kthread+0x0/0x80
 [<ffffffff8020cbae>] child_rip+0x0/0x12

I'm now trying to reproduce this with lockdep enabled, but it's not easy to
get into that state.

thanks,
-- 
http://www.fi.muni.cz/~xslaby/            Jiri Slaby
faculty of informatics, masaryk university, brno, cz

