linux-kernel - Re: [PATCH] cpufreq, store_scaling_governor requires policy->rwsem to be held for duration of changing governors [v2]

Message-ID: <CAKohpom0KycrB8qJDM7yEOHOn76J1sEe90CG+JXERJMcxKnOJA@mail.gmail.com>
Date:	Mon, 4 Aug 2014 16:06:44 +0530
From:	Viresh Kumar <viresh.kumar@...aro.org>
To:	Stephen Boyd <sboyd@...eaurora.org>
Cc:	Prarit Bhargava <prarit@...hat.com>,
	Saravana Kannan <skannan@...eaurora.org>,
	"Rafael J. Wysocki" <rjw@...ysocki.net>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Lenny Szubowicz <lszubowi@...hat.com>,
	"linux-pm@...r.kernel.org" <linux-pm@...r.kernel.org>,
	Robert Schöne <robert.schoene@...dresden.de>
Subject: Re: [PATCH] cpufreq, store_scaling_governor requires policy->rwsem to
 be held for duration of changing governors [v2]

Sorry for the delay guys, was away :(

Adding Robert as well; he reported something similar, so it is better to discuss it here.

On 1 August 2014 22:48, Stephen Boyd <sboyd@...eaurora.org> wrote:
> This was with conservative as the default, and switching to ondemand
>
> # cd /sys/devices/system/cpu/cpu2/cpufreq
> # ls
> affected_cpus                  scaling_available_governors
> conservative                   scaling_cur_freq
> cpuinfo_cur_freq               scaling_driver
> cpuinfo_max_freq               scaling_governor
> cpuinfo_min_freq               scaling_max_freq
> cpuinfo_transition_latency     scaling_min_freq
> related_cpus                   scaling_setspeed
> scaling_available_frequencies  stats
> # cat conservative/down_threshold
> 20
> # echo ondemand > scaling_governor
>
>  ======================================================
>  [ INFO: possible circular locking dependency detected ]
>  3.16.0-rc3-00039-ge1e38f124d87 #47 Not tainted
>  -------------------------------------------------------
>  sh/75 is trying to acquire lock:
>   (s_active#9){++++..}, at: [<c0358a94>] kernfs_remove_by_name_ns+0x3c/0x84
>
>  but task is already holding lock:
>   (&policy->rwsem){+++++.}, at: [<c05ab1f0>] store+0x68/0xb8
>
>  which lock already depends on the new lock.
>
>
>  the existing dependency chain (in reverse order) is:
>
> -> #1 (&policy->rwsem){+++++.}:
>         [<c0359234>] kernfs_fop_open+0x138/0x298
>         [<c02fa3f4>] do_dentry_open.isra.12+0x1b0/0x2f0
>         [<c02fa604>] finish_open+0x20/0x38
>         [<c0308d34>] do_last.isra.37+0x5ac/0xb68
>         [<c03093a4>] path_openat+0xb4/0x5d8
>         [<c0309bcc>] do_filp_open+0x2c/0x80
>         [<c02fb558>] do_sys_open+0x10c/0x1c8
>         [<c020f0a0>] ret_fast_syscall+0x0/0x48
>
> -> #0 (s_active#9){++++..}:
>         [<c0357d18>] __kernfs_remove+0x250/0x300
>         [<c0358a94>] kernfs_remove_by_name_ns+0x3c/0x84
>         [<c035aa78>] remove_files+0x34/0x78
>         [<c035aee0>] sysfs_remove_group+0x40/0x98
>         [<c05b0560>] cpufreq_governor_dbs+0x4c0/0x6ec
>         [<c05abebc>] __cpufreq_governor+0x118/0x200
>         [<c05ac0fc>] cpufreq_set_policy+0x158/0x2ac
>         [<c05ad5e4>] store_scaling_governor+0x6c/0x94
>         [<c05ab210>] store+0x88/0xb8
>         [<c035a00c>] sysfs_kf_write+0x4c/0x50
>         [<c03594d4>] kernfs_fop_write+0xc0/0x180
>         [<c02fc5c8>] vfs_write+0xa0/0x1a8
>         [<c02fc9d4>] SyS_write+0x40/0x8c
>         [<c020f0a0>] ret_fast_syscall+0x0/0x48
>
>  other info that might help us debug this:
>
>   Possible unsafe locking scenario:
>
>         CPU0                    CPU1
>         ----                    ----
>    lock(&policy->rwsem);
>                                 lock(s_active#9);
>                                 lock(&policy->rwsem);
>    lock(s_active#9);
>
>   *** DEADLOCK ***
>
>  6 locks held by sh/75:
>   #0:  (sb_writers#4){.+.+..}, at: [<c02fc6a8>] vfs_write+0x180/0x1a8
>   #1:  (&of->mutex){+.+...}, at: [<c0359498>] kernfs_fop_write+0x84/0x180
>   #2:  (s_active#10){.+.+..}, at: [<c03594a0>] kernfs_fop_write+0x8c/0x180
>   #3:  (cpu_hotplug.lock){++++++}, at: [<c0221ef8>] get_online_cpus+0x38/0x9c
>   #4:  (cpufreq_rwsem){.+.+.+}, at: [<c05ab1d8>] store+0x50/0xb8
>   #5:  (&policy->rwsem){+++++.}, at: [<c05ab1f0>] store+0x68/0xb8
>
>  stack backtrace:
>  CPU: 0 PID: 75 Comm: sh Not tainted 3.16.0-rc3-00039-ge1e38f124d87 #47
>  [<c0214de8>] (unwind_backtrace) from [<c02123f8>] (show_stack+0x10/0x14)
>  [<c02123f8>] (show_stack) from [<c0709e5c>] (dump_stack+0x70/0xbc)
>  [<c0709e5c>] (dump_stack) from [<c070722c>] (print_circular_bug+0x280/0x2d4)
>  [<c070722c>] (print_circular_bug) from [<c02629cc>] (__lock_acquire+0x18d0/0x1abc)
>  [<c02629cc>] (__lock_acquire) from [<c026310c>] (lock_acquire+0x9c/0x138)
>  [<c026310c>] (lock_acquire) from [<c0357d18>] (__kernfs_remove+0x250/0x300)
>  [<c0357d18>] (__kernfs_remove) from [<c0358a94>] (kernfs_remove_by_name_ns+0x3c/0x84)
>  [<c0358a94>] (kernfs_remove_by_name_ns) from [<c035aa78>] (remove_files+0x34/0x78)
>  [<c035aa78>] (remove_files) from [<c035aee0>] (sysfs_remove_group+0x40/0x98)
>  [<c035aee0>] (sysfs_remove_group) from [<c05b0560>] (cpufreq_governor_dbs+0x4c0/0x6ec)
>  [<c05b0560>] (cpufreq_governor_dbs) from [<c05abebc>] (__cpufreq_governor+0x118/0x200)
>  [<c05abebc>] (__cpufreq_governor) from [<c05ac0fc>] (cpufreq_set_policy+0x158/0x2ac)
>  [<c05ac0fc>] (cpufreq_set_policy) from [<c05ad5e4>] (store_scaling_governor+0x6c/0x94)
>  [<c05ad5e4>] (store_scaling_governor) from [<c05ab210>] (store+0x88/0xb8)
>  [<c05ab210>] (store) from [<c035a00c>] (sysfs_kf_write+0x4c/0x50)
>  [<c035a00c>] (sysfs_kf_write) from [<c03594d4>] (kernfs_fop_write+0xc0/0x180)
>  [<c03594d4>] (kernfs_fop_write) from [<c02fc5c8>] (vfs_write+0xa0/0x1a8)
>  [<c02fc5c8>] (vfs_write) from [<c02fc9d4>] (SyS_write+0x40/0x8c)
>  [<c02fc9d4>] (SyS_write) from [<c020f0a0>] (ret_fast_syscall+0x0/0x48)

Thanks for coming to my rescue, Stephen :). I was quite sure I got this
with ondemand as well.

I will be looking very closely at the code now to see what's going wrong.
And btw, does anybody here have an exact understanding of why this
lockdep warning happens? I mean, what was the real problem for which we
just dropped the rwsems? I understood that earlier but can't recall it
now :)
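
For reference, the two chains in the quoted report can be modeled with a
toy lock-order graph. This is only a sketch of the general idea (record
"B taken while A is held" edges and flag an acquisition that closes a
cycle); it is not how lockdep is implemented, and the class and method
names are made up for illustration. The lock names are copied from the
report.

```python
from collections import defaultdict


class LockOrderGraph:
    """Hypothetical toy model: edges mean 'held -> acquired while held'."""

    def __init__(self):
        self.edges = defaultdict(set)

    def _reachable(self, src, dst):
        # Depth-first search over the recorded ordering edges.
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges[node])
        return False

    def acquire(self, held, new):
        """Record that `new` was taken while `held` was held.

        Returns True if an ordering new -> ... -> held was already
        recorded, i.e. this acquisition closes a cycle.
        """
        circular = self._reachable(new, held)
        self.edges[held].add(new)
        return circular


graph = LockOrderGraph()

# Chain #1 in the report: policy->rwsem taken while s_active#9 is held.
first = graph.acquire("s_active#9", "policy->rwsem")

# Chain #0 in the report: the governor switch removes the sysfs group
# (taking s_active#9) while store() already holds policy->rwsem.
second = graph.acquire("policy->rwsem", "s_active#9")

print(first, second)
```

The second call is the one that gets flagged, which matches the "Possible
unsafe locking scenario" table in the report: each side holds one lock
and waits for the other.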

Thanks all for your work on getting this fixed.

--
viresh