linux-kernel - Re: cpufreq_stats NULL deref on second system suspend

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAKohpo=8mqRmmk17D52yiR9w1V4Z+2UzZ2kzABtrxHbYR+sQzw@mail.gmail.com>
Date:	Thu, 12 Sep 2013 11:22:52 +0530
From:	Viresh Kumar <viresh.kumar@...aro.org>
To:	"Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com>
Cc:	"Rafael J. Wysocki" <rjw@...k.pl>,
	Stephen Warren <swarren@...dotorg.org>,
	"linux-pm@...r.kernel.org" <linux-pm@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	cpufreq <cpufreq@...r.kernel.org>
Subject: Re: cpufreq_stats NULL deref on second system suspend

Let me fix my mail first.. I was running out of time yesterday and so couldn't
frame things correctly :)

On 11 September 2013 17:29, Viresh Kumar <viresh.kumar@...aro.org> wrote:
> Okay.. There are two different ways in which cpufreq_add_dev() work
> currently..
>
> Boot cluster (i.e. policy with boot CPU)
> ---------------
>
> Here cpufreq_remove_dev() is never called for boot cpu but all others.
> And similarly cpufreq_add_dev() is never called for boot cpu but all others.
>
> Now policy->cpu contains meaningful cpu at beginning of resume and
> we don't need to modify that at all.. For all the remaining CPUs we
> better call cpufreq_add_policy_cpu() rather..

And this should be done without your patch. Or actually we will simply
return from this place. Atleast for systems with single cluster, like Tegra.

policy->related_cpus is still valid after resume and we haven't removed
policy from the cpufreq_policy_list (though there is a bug which I have
fixed separately and sent it to you..).. So no change required for a single
cluster system..

> Non-boot Cluster
> ---------------------
>
> All CPUs here are removed and at the end policy->cpu contains the last
> cpu removed.. So, for a cluster with cpu 2 and 3.... it will contain 3..
>
> Now at resume we will add cpu2 first and so need to update policy->cpu
> to 2..

> But for all other CPUs in this cluster we return early from
> cpufreq_add_dev() and call cpufreq_add_policy_cpu() as policy->cpus
> was fixed by call to ->init() for the first cpu of this cluster..

This was wrong, we need a valid policy->related_cpus field which is always
valid and so we return early here too, but not for the first cpu of cluster.

> And so we never reach the line: policy->cpu = cpu;
>
> For the first cpu of non-boot cluster we need to call update_policy_cpu()
> and not for others..

that's correct, thought I have one more idea.. :)

> But for the boot cluster if we can call ->init() somehow at resume time,
> then things would be fairly similar in both cases..

Not required.. its all working already.. and so Stephen shouldn't need your
patch for Tegra, but rather my patches that fix other cpufreq bugs..

Now coming back to the ideas I have...
Same code will work if hotplug sequence is fixed a bit. Why aren't we doing
exact opposite of suspend in resume?

We are removing CPUs (leaving the boot cpu) in ascending order and then
adding them back in same order.. Why?

Why not remove CPUs in descending order and add in ascending order? Or
remove in ascending order and add in descending order?

That way policy->cpu will be updated with the right cpu and your patch wouldn't
be required..

I am not saying that this can't be hacked/fixed in cpufreq but suspend/resume
may also be fixed and that looks logically more correct to me..
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/