Message-ID: <CAKohpokEdnmR4wsAqZgxnLOd5MAn6RNEFYwJQ8Xv1EdRbX1tkQ@mail.gmail.com>
Date: Mon, 20 May 2013 19:13:08 +0530
From: Viresh Kumar <viresh.kumar@...aro.org>
To: Borislav Petkov <bp@...en8.de>
Cc: Michael Wang <wangyun@...ux.vnet.ibm.com>,
Tejun Heo <tj@...nel.org>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Jiri Kosina <jkosina@...e.cz>,
Frederic Weisbecker <fweisbec@...il.com>,
Tony Luck <tony.luck@...el.com>, linux-kernel@...r.kernel.org,
x86@...nel.org, Thomas Gleixner <tglx@...utronix.de>, rjw@...k.pl,
cpufreq@...r.kernel.org, linux-pm@...r.kernel.org
Subject: Re: NOHZ: WARNING: at arch/x86/kernel/smp.c:123 native_smp_send_reschedule, round 2
On 20 May 2013 18:53, Borislav Petkov <bp@...en8.de> wrote:
> I just confirmed that policy->cpus contains offlined cores with this:
>
> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> index 5af40ad82d23..e8c25f71e9b6 100644
> --- a/drivers/cpufreq/cpufreq_governor.c
> +++ b/drivers/cpufreq/cpufreq_governor.c
> @@ -169,6 +169,9 @@ static inline void __gov_queue_work(int cpu, struct dbs_data *dbs_data,
> {
> struct cpu_dbs_common_info *cdbs = dbs_data->cdata->get_cpu_cdbs(cpu);
>
> + if (WARN_ON(!cpu_online(cpu)))
> + return;
> +
> mod_delayed_work_on(cpu, system_wq, &cdbs->work, delay);
> }
Hmm, so there is definitely a locking issue there.
Have you tried my patch? I am not sure it will fix everything, but it
may help.
> see splats collection below.
>
> And I don't think your fix above addresses the issue for the simple
> reason that if cpus go offline *before* you do get_online_cpus(), then
> policy->cpus will already contain offlined cpus.
>
> Rather, a better fix would be, IMHO, to do this (it works here, of course):
>
> ---
> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> index 5af40ad82d23..58541b164494 100644
> --- a/drivers/cpufreq/cpufreq_governor.c
> +++ b/drivers/cpufreq/cpufreq_governor.c
> @@ -17,6 +17,7 @@
> #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>
> #include <asm/cputime.h>
> +#include <linux/cpu.h>
> #include <linux/cpufreq.h>
> #include <linux/cpumask.h>
> #include <linux/export.h>
> @@ -169,7 +170,15 @@ static inline void __gov_queue_work(int cpu, struct dbs_data *dbs_data,
> {
> struct cpu_dbs_common_info *cdbs = dbs_data->cdata->get_cpu_cdbs(cpu);
>
> + get_online_cpus();
> +
> + if (!cpu_online(cpu))
> + goto out;
> +
> mod_delayed_work_on(cpu, system_wq, &cdbs->work, delay);
> +
> + out:
> + put_online_cpus();
> }
>
> void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
This looks fine, but I want to fix the locking rather than just
hiding the issue. :)