lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Tue, 23 Dec 2014 22:10:15 -0800 From: Doug Anderson <dianders@...omium.org> To: Viresh Kumar <viresh.kumar@...aro.org> Cc: "Rafael J. Wysocki" <rjw@...ysocki.net>, Heiko Stuebner <heiko@...ech.de>, Dmitry Torokhov <dmitry.torokhov@...il.com>, Chris Zhong <zyw@...k-chips.com>, "linux-arm-kernel@...ts.infradead.org" <linux-arm-kernel@...ts.infradead.org>, Sonny Rao <sonnyrao@...omium.org>, Dylan Reid <dgreid@...omium.org>, "linux-pm@...r.kernel.org" <linux-pm@...r.kernel.org>, Linux Kernel Mailing List <linux-kernel@...r.kernel.org> Subject: Re: [PATCH] cpufreq: suspend cpufreq governors on shutdown Virseh, On Tue, Dec 23, 2014 at 5:46 PM, Viresh Kumar <viresh.kumar@...aro.org> wrote: > On 24 December 2014 at 05:41, Doug Anderson <dianders@...omium.org> wrote: >> We should stop cpufreq governors when we shut down the system. If we >> don't do this, we can end up with this deadlock: > > Can you also add what exactly happens in such deadlock? Some lockdeps? > Or just a hang ? It only fails 1 in ~50 calls to halt. When it fails, most times I get a hang. Sometimes I actually got a hard lockup: [ 23.511813] reboot: Power down [ 40.418425] Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 1 I'm slightly amazed that the hard lockup detector even fired (and I'm not sure why it sometimes doesn't fire). ...when I use kdb to get a trace at this point: Stack traceback for pid 6564 0xecd2ee00 6564 1 0 0 D 0xecd2f1e8 halt [<c065ab64>] (__schedule) from [<c065ae9c>] (schedule+0xa4/0xa8) [<c065ae9c>] (schedule) from [<c065b21c>] (schedule_preempt_disabled+0x18/0x1c) [<c065b21c>] (schedule_preempt_disabled) from [<c065bfe8>] (mutex_lock_nested+0x294/0x3e4) [<c065bfe8>] (mutex_lock_nested) from [<c041c0d0>] (regmap_lock_mutex+0x1c/0x20) [<c041c0d0>] (regmap_lock_mutex) from [<c041e618>] (regmap_update_bits+0x34/0x6c) [<c041e618>] (regmap_update_bits) from [<c042a9bc>] (rk808_device_shutdown+0x54/0x7c) [<c042a9bc>] (rk808_device_shutdown) from [<c01070b8>] (machine_power_off+0x34/0x3c) [<c01070b8>] (machine_power_off) from [<c0147168>] (kernel_power_off+0x68/0x7c) [<c0147168>] (kernel_power_off) from [<c0147360>] (SyS_reboot+0x154/0x1fc) [<c0147360>] (SyS_reboot) from [<c01062a0>] (ret_fast_syscall+0x0/0x48) I'm not used to using lockdep. Trying now. OK, no lockdep warning. I'm relatively confident that I understand the problem because I added a bunch of other debug info. You can see my patch at <https://chromium-review.googlesource.com/#/c/237455/>. In the case I captured I confirmed that I was stopping CPU1 while it held the mutex. >> 1. cpufreq governor may be running on a CPU other than CPU0. >> 2. In machine_restart() we call smp_send_stop() which stops CPUs. >> If one of these CPUs was actively running a cpufreq governor >> then it may have the mutex / spinlock needed to access the main >> PMIC in the system (perhaps over I2C) >> 3. If a machine needs access to the main PMIC in order to shutdown >> then it will never get it since the mutex was lost when the other >> CPU stopped. >> >> Let's avoid the race by stopping the cpufreq governor at shutdown, >> which is a sensible thing to do anyway. >> >> Signed-off-by: Doug Anderson <dianders@...omium.org> >> --- >> NOTE: This was developed / tested / on a 3.14 kernel with backports. >> I have confirmed that it compiles on a mainline kernel and doesn't >> crash, but I haven't verified that there isn't some other fix in >> mainline that also fixes this problem. If you are aware of such a fix >> then please drop this patch. > > No, its a new problem. OK, good to know. >> drivers/cpufreq/cpufreq.c | 12 ++++++++++++ >> 1 file changed, 12 insertions(+) >> >> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c >> index a09a29c..bd89721 100644 >> --- a/drivers/cpufreq/cpufreq.c >> +++ b/drivers/cpufreq/cpufreq.c >> @@ -27,6 +27,7 @@ >> #include <linux/mutex.h> >> #include <linux/slab.h> >> #include <linux/suspend.h> >> +#include <linux/syscore_ops.h> >> #include <linux/tick.h> >> #include <trace/events/power.h> >> >> @@ -2550,6 +2551,15 @@ int cpufreq_unregister_driver(struct cpufreq_driver *driver) >> } >> EXPORT_SYMBOL_GPL(cpufreq_unregister_driver); >> >> +static void cpufreq_shutdown(void) >> +{ >> + cpufreq_suspend(); >> +} >> + >> +static struct syscore_ops cpufreq_syscore_ops = { >> + .shutdown = cpufreq_shutdown, > > directly pass cpufreq_suspend() here and add a note over the > cpufreq_syscore_ops on why it exists here. Done. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@...r.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists