lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:   Thu,  3 Oct 2019 17:08:28 +0300
From:   Ville Syrjala <ville.syrjala@...ux.intel.com>
To:     linux-kernel@...r.kernel.org
Cc:     stable@...r.kernel.org,
        "Paul E . McKenney" <paulmck@...ux.vnet.ibm.com>,
        Andi Kleen <ak@...ux.intel.com>,
        "Rafael J. Wysocki" <rjw@...ysocki.net>,
        Viresh Kumar <viresh.kumar@...aro.org>,
        linux-pm@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        "H. Peter Anvin" <hpa@...or.com>,
        Ville Syrjälä 
        <ville.syrjala@...ux.intel.com>
Subject: [PATCH] cpufreq: Fix RCU reboot regression on x86 PIC machines

From: Ville Syrjälä <ville.syrjala@...ux.intel.com>

Since 4.20-rc1 my PIC machines no longer reboot/shutdown.
I bisected this down to commit 45975c7d21a1 ("rcu: Define RCU-sched
API in terms of RCU for Tree RCU PREEMPT builds").

I traced the hang into
-> cpufreq_suspend()
 -> cpufreq_stop_governor()
  -> cpufreq_dbs_governor_stop()
   -> gov_clear_update_util()
    -> synchronize_sched()
     -> synchronize_rcu()

Only PREEMPT=y is affected for obvious reasons. The problem
is limited to PIC machines since they mask off interrupts
in i8259A_shutdown() (syscore_ops.shutdown() registered
from device_initcall()).

I reported this long ago but no better fix has surfaced,
hence sending out my initial workaround which I've been
carrying around ever since. I just move cpufreq_core_init()
to late_initcall() so the syscore_ops get registered in the
oppsite order and thus the .shutdown() hooks get executed
in the opposite order as well. Not 100% convinced this is
safe (especially moving the cpufreq_global_kobject creation
to late_initcall()) but I've not had any problems with it
at least.

Here's the resulting change in initcall_debug:
+ PM: Calling cpufreq_suspend+0x0/0x100
  PM: Calling mce_syscore_shutdown+0x0/0x10
  PM: Calling i8259A_shutdown+0x0/0x10
- PM: Calling cpufreq_suspend+0x0/0x100
+ reboot: Restarting system
+ reboot: machine restart

Cc: stable@...r.kernel.org
Cc: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
Cc: Andi Kleen <ak@...ux.intel.com>
Cc: "Rafael J. Wysocki" <rjw@...ysocki.net>
Cc: Viresh Kumar <viresh.kumar@...aro.org>
Cc: linux-pm@...r.kernel.org
Cc: Thomas Gleixner <tglx@...utronix.de>
Cc: Ingo Molnar <mingo@...hat.com>
Cc: Borislav Petkov <bp@...en8.de>
Cc: "H. Peter Anvin" <hpa@...or.com>
Fixes: 45975c7d21a1 ("rcu: Define RCU-sched API in terms of RCU for Tree RCU PREEMPT builds")
Signed-off-by: Ville Syrjälä <ville.syrjala@...ux.intel.com>
---
 drivers/cpufreq/cpufreq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index c52d6fa32aac..6a8fb9b08e33 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -2761,4 +2761,4 @@ static int __init cpufreq_core_init(void)
 	return 0;
 }
 module_param(off, int, 0444);
-core_initcall(cpufreq_core_init);
+late_initcall(cpufreq_core_init);
-- 
2.21.0

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ