[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <52304F5A.8070509@linux.vnet.ibm.com>
Date: Wed, 11 Sep 2013 16:39:14 +0530
From: "Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com>
To: "Rafael J. Wysocki" <rjw@...k.pl>
CC: Stephen Warren <swarren@...dotorg.org>,
Viresh Kumar <viresh.kumar@...aro.org>,
"linux-pm@...r.kernel.org" <linux-pm@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
cpufreq <cpufreq@...r.kernel.org>
Subject: Re: cpufreq_stats NULL deref on second system suspend
On 09/11/2013 03:51 PM, Srivatsa S. Bhat wrote:
> On 09/11/2013 04:04 AM, Rafael J. Wysocki wrote:
>> On Tuesday, September 10, 2013 02:53:01 PM Stephen Warren wrote:
>>> On 09/09/2013 05:14 PM, Rafael J. Wysocki wrote:
>>>> On Monday, September 09, 2013 03:29:06 PM Stephen Warren wrote:
>>>>> On 09/09/2013 02:24 PM, Rafael J. Wysocki wrote:
>>>>>> On Monday, September 09, 2013 02:01:32 PM Stephen Warren wrote:
>>>>>>> On 09/09/2013 02:01 PM, Rafael J. Wysocki wrote:
>>>>>>>> On Monday, September 09, 2013 01:22:23 PM Stephen Warren wrote:
>>>>>>>>> Viresh,
>>>>>>>>>
>>>>>>>>> I'm seeing the crash below when suspending my system for the second time.
>>>>>>>>>
>>>>>>>>> I can avoid this with the following patch, which adds a check which
>>>>>>>>> already exists in all-but-one other places that the same lookup is made:
>>>>>>>>
>>>>>>>> Which kernel did you test?
>>>>>>>
>>>>>>> next-20130909.
>>>>>>
>>>>>> Is it reproducible with the current mainline?
>>>>>
>>>>> This does not affect v3.11, but does affect current HEAD; 300893b "Merge
>>>>> tag 'xfs-for-linus-v3.12-rc1' of git://oss.sgi.com/xfs/xfs".
>>>>
>>>> What system does it break on?
>>>
>>> A dual-core ARM system (NVIDIA Tegra20 SoC, Harmony board).
>>>
>>>> Any chance to bisect cpufreq changes between 3.11 and the current HEAD?
>>>
>>> Sure, it's due to 5302c3f "cpufreq: Perform light-weight init/teardown
>>> during suspend/resume".
>>
>> Thanks!
>>
>> Srivatsa, any chance to look into this?
>>
>
> Sure, Rafael. Thanks for CC'ing me.
>
> Stephen, I went through the code and I think I found out what is going wrong.
> Can you please try the following patch?
>
> Regards,
> Srivatsa S. Bhat
>
> ----------------------------------------------------------------------------
>
>
> From: Srivatsa S. Bhat <srivatsa.bhat@...ux.vnet.ibm.com>
> Subject: [PATCH] cpufreq: Fix crash in cpufreq-stats during suspend/resume
>
And here is a follow-up patch, on top of this one. I hit the bug while working
on this patch, but it doesn't occur in practice, since none of the existing
callers call update_policy_cpu() with new == old. (I had called it like that
by mistake while working on the fix and was surprised by the level of problems
it caused; hence thought of adding a check).
Regards,
Srivatsa S. Bhat
-----------------------------------------------------------------------------
From: Srivatsa S. Bhat <srivatsa.bhat@...ux.vnet.ibm.com>
Subject: [PATCH] cpufreq: Prevent problems in update_policy_cpu() if last_cpu == new_cpu
If update_policy_cpu() is invoked with the existing policy->cpu itself
as the new-cpu parameter, then a lot of things can go terribly wrong.
In its present form, update_policy_cpu() always assumes that the new-cpu
is different from policy->cpu and invokes other functions to perform their
respective updates. And those functions implement the actual update like
this:
per_cpu(..., new_cpu) = per_cpu(..., last_cpu);
per_cpu(..., last_cpu) = NULL;
Thus, when new_cpu == last_cpu, the final NULL assignment makes the per-cpu
references vanish into thin air! (memory leak). From there, it leads to more
problems: cpufreq_stats_create_table() now doesn't find the per-cpu reference
and hence tries to create a new sysfs-group; but sysfs already had created
the group earlier, so it complains that it cannot create a duplicate filename.
In short, the repercussions of a rather innocuous invocation of
update_policy_cpu() can turn out to be pretty nasty.
Ideally update_policy_cpu() should handle this situation (new == last)
gracefully, and not lead to such severe problems. So fix it by adding an
appropriate check.
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@...ux.vnet.ibm.com>
---
drivers/cpufreq/cpufreq.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 62bdb95..a61b7a1 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -949,6 +949,9 @@ static void cpufreq_policy_free(struct cpufreq_policy *policy)
static void update_policy_cpu(struct cpufreq_policy *policy, unsigned int cpu)
{
+ if (cpu == policy->cpu)
+ return;
+
policy->last_cpu = policy->cpu;
policy->cpu = cpu;
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists