Message-ID: <20181113135453.GW9144@intel.com>
Date: Tue, 13 Nov 2018 15:54:53 +0200
From: Ville Syrjälä <ville.syrjala@...ux.intel.com>
To: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc: linux-kernel@...r.kernel.org, Andi Kleen <ak@...ux.intel.com>,
"Rafael J. Wysocki" <rjw@...ysocki.net>,
Viresh Kumar <viresh.kumar@...aro.org>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
"H. Peter Anvin" <hpa@...or.com>
Subject: [REGRESSION 4.20-rc1] 45975c7d21a1 ("rcu: Define RCU-sched API in
terms of RCU for Tree RCU PREEMPT builds")
Hi Paul,
After 4.20-rc1, some of my 32-bit UP machines no longer reboot or shut down.
I bisected this down to commit 45975c7d21a1 ("rcu: Define RCU-sched
API in terms of RCU for Tree RCU PREEMPT builds").
I traced the hang to the following call chain:
-> cpufreq_suspend()
-> cpufreq_stop_governor()
-> cpufreq_dbs_governor_stop()
-> gov_clear_update_util()
-> synchronize_sched()
-> synchronize_rcu()
Only PREEMPT=y is affected for obvious reasons, but that couldn't
explain why the same UP kernel booted on an SMP machine worked fine.
Eventually I realized that the difference between working and
non-working machine was IOAPIC vs. PIC. With initcall_debug I saw
that we mask everything in the PIC before cpufreq is shut down,
and came up with the following fix:
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 7aa3dcad2175..f88bf3c77fc0 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -2605,4 +2605,4 @@ static int __init cpufreq_core_init(void)
 	return 0;
 }
 module_param(off, int, 0444);
-core_initcall(cpufreq_core_init);
+late_initcall(cpufreq_core_init);
Here's the resulting change in the initcall_debug output:
  pci 0000:00:00.1: shutdown
  hub 4-0:1.0: hub_ext_port_status failed (err = -110)
  agpgart-intel 0000:00:00.0: shutdown
+ PM: Calling cpufreq_suspend+0x0/0x100
  PM: Calling mce_syscore_shutdown+0x0/0x10
  PM: Calling i8259A_shutdown+0x0/0x10
- PM: Calling cpufreq_suspend+0x0/0x100
+ reboot: Restarting system
+ reboot: machine restart
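The reordering in the log is consistent with shutdown callbacks being run in the reverse of the order in which they were registered, so a hook registered from a later initcall level is called earlier at shutdown. The following is only a user-space toy model of that assumption, not kernel code: struct toy_ops, toy_register(), toy_shutdown_order() and toy_reset() are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_OPS 8

/* Hypothetical stand-in for a shutdown hook; NOT the kernel's syscore_ops. */
struct toy_ops {
	const char *name;
};

static const struct toy_ops *registered[MAX_OPS];
static size_t nr_registered;

/* Forget all hooks so a new registration order can be tried. */
static void toy_reset(void)
{
	nr_registered = 0;
}

/* Append a hook, in the order the (simulated) initcalls run. */
static int toy_register(const struct toy_ops *ops)
{
	assert(ops != NULL);
	if (nr_registered >= MAX_OPS)
		return -1;
	registered[nr_registered++] = ops;
	return 0;
}

/*
 * Assumption being modelled: hooks are called last-registered-first.
 * Writes the hook names into out[] in call order and returns the count.
 */
static size_t toy_shutdown_order(const char **out, size_t max)
{
	size_t n = 0;

	for (size_t i = nr_registered; i > 0 && n < max; i--)
		out[n++] = registered[i - 1]->name;
	return n;
}
```

In this model, registering a "cpufreq_suspend" hook before an "i8259A_shutdown" hook makes it run after the PIC hook, as in the "-" line of the log. Registering it last makes it run first, as in the "+" line.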
I didn't really look into what other ramifications the cpufreq
initcall change might have. cpufreq_global_kobject worries
me a bit. Maybe that one has to remain in core_initcall() and
we could just move the suspend to late_initcall()? Anyways,
I figured I'd leave this for someone more familiar with the
code to figure out ;)
--
Ville Syrjälä
Intel