linux-kernel - Re: [PATCH 2/2] cpufreq: serialize calls to __cpufreq

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <5229704B.7010200@linux.vnet.ibm.com>
Date:	Fri, 06 Sep 2013 11:33:55 +0530
From:	"Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com>
To:	Stephen Boyd <sboyd@...eaurora.org>
CC:	"Rafael J. Wysocki" <rjw@...k.pl>,
	Viresh Kumar <viresh.kumar@...aro.org>,
	cpufreq@...r.kernel.org, linux-pm@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] cpufreq: serialize calls to __cpufreq_governor()

On 09/05/2013 06:24 AM, Stephen Boyd wrote:
> On 09/04/13 17:26, Rafael J. Wysocki wrote:
>> On Wednesday, September 04, 2013 04:50:01 PM Stephen Boyd wrote:
>>> On 09/04/13 16:55, Rafael J. Wysocki wrote:
>>>> Well, I'm not sure when Viresh is going to be back.
>>>>
>>>> Srivatsa, can you please resend this patch with a proper changelog?
>>>>
>>> I haven't had a chance to try this out yet, but I was just thinking
>>> about this patch. How is it going to work? If one task opens the file
>>> and another task is taking down the CPU wouldn't we deadlock in the
>>> CPU_DOWN notifier waiting for the kobject to be released? Task 1 will
>>> grab the kobject reference and sleep on the hotplug mutex and task 2
>>> will put the kobject and wait for the completion, but it won't happen.
>>> At least I think that's what would happen.
>> Do you mean the completion in sysfs_deactivate()?  Yes, we can deadlock
>> there.
> 
> I mean the complete in cpufreq_sysfs_release(). I don't think that will
> ever be called because the kobject is held by the task calling store()
> which is waiting on the hotplug lock to be released.
> 
>>
>> Well, I guess the Srivatsa's patch may be salvaged by making it do a "trylock"
>> version of get_online_cpus(), but then I wonder if there's no better way.
> 
> I think the real solution is to remove the kobject first if the CPU
> going down is the last user of that policy. Once the completion is done
> we can stop the governor and clean up state. For the case where there
> are still CPUs using the policy why can't we cancel that CPU's work and
> do nothing else? Even in the case where we have to move the cpufreq
> directory do we need to do a STOP/START/LIMITS sequence? I hope we can
> get away with just moving the directory and canceling that CPUs work then.
> 

Conceptually, I agree that your idea of not allowing any process to take
a new reference to the kobject while we are taking the CPU offline, is a
sound solution.

However, I am reluctant to go down that path because, handling the CPU hotplug
sequence in the suspend/resume path might get even more tricky, if we want
to implement the changes that you propose. Just recently we managed to
rework the cpufreq CPU hotplug handling to retain the sysfs file permissions
around suspend/resume, and doing that was not at all simple. Adding more
quirks and complexity to the kobject handling in that path will only make
things even more challenging, IMHO. That's the reason I'm trying to think
of ways to avoid touching that fragile code, and instead solve this problem
in some other way, without compromising on the robustness of the solution.

So here is my new proposal, as a replacement to this patch[2/2]:

We note that, at CPU_DOWN_PREPARE stage, the CPU is not yet marked offline,
whereas by the time we handle CPU_POST_DEAD, the CPU is removed from the
cpu_online_mask, and also the cpu_hotplug lock is dropped.

So, let us split up __cpu_remove_dev() into 2 parts, say:
__cpu_remove_prepare() - invoked during CPU_DOWN_PREPARE
__cpu_remove_finish()  - invoked during CPU_POST_DEAD

In the former function, we stop the governors, so that policy->governor_enabled
gets set to false, so that patch [1/2] will return -EBUSY to any subsequent
->store() requests. Also, we do everything except the kobject cleanup.

In the latter function, we do the remaining work, particularly the part
where we wait for the kobject refcount to drop to zero and the subsequent
cleanup.

And the ->store() functions will be modified to look like this:

store()
{
	get_online_cpus();

	if (!cpu_online(cpu))
		goto out;

	/* Body of the function*/
out:
	put_online_cpus();
}

That way, if a task tries to write to a cpufreq file during CPU offline,
it will get blocked on get_online_cpus(), and will continue after
CPU_DEAD (since we release the lock here). Then it will notice that the cpu
is offline, and hence will return silently, thus dropping the kobject refcnt.
So, when the cpufreq core comes back at the CPU_POST_DEAD stage to cleanup
the kobject, it won't encounter any problems.

Any thoughts on this approach? I'll try to code it up and post the patch
later today.

Thank you!

Regards,
Srivatsa S. Bhat

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/