lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <51DE4F44.6060506@linux.vnet.ibm.com>
Date:	Thu, 11 Jul 2013 11:53:00 +0530
From:	"Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com>
To:	Lan Tianyu <lantianyu1986@...il.com>
CC:	Toralf Förster <toralf.foerster@....de>,
	"Rafael J. Wysocki" <rjw@...k.pl>,
	Viresh Kumar <viresh.kumar@...aro.org>,
	cpufreq@...r.kernel.org, Linux PM list <linux-pm@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"Jarzmik, Robert" <robert.jarzmik@...el.com>,
	"R, Durgadoss" <durgadoss.r@...el.com>,
	Dirk Brandewie <dirk.brandewie@...il.com>, tianyu.lan@...el.com
Subject: Re: [PATCH] cpufreq: Fix cpufreq regression after suspend/resume

On 07/11/2013 11:10 AM, Lan Tianyu wrote:
> 2013/7/11 Srivatsa S. Bhat <srivatsa.bhat@...ux.vnet.ibm.com>:
>> Hi Toralf,
>>
>> On 07/11/2013 02:20 AM, Toralf Förster wrote:
>>> I tested the patch several times on top of a66b2e5 - the origin issue is
>>> fixed but - erratically another issue now appears : all 4 cores are runs
>>> after wakeup at 2.6 GHz.
>>> The temporary hot fix is to switch between governor performance and
>>> ondemand for all 4 cores.
>>>
>>>
>>
>> Thanks for all your testing efforts. The commit a66b2e5 took a shortcut to
>> achieve its goals but failed subtly in many aspects. Below is a patch (only
>> compile-tested) which tries to achieve the goal (preserve sysfs files) in
>> the proper way, by separating out the cpufreq-core bits from the sysfs bits.
>> You might want to give it a try on current mainline and see if it improves
>> anything.
>>
>> Regards,
>> Srivatsa S. Bhat
>>
>>
>> (Applies on current mainline)
>> ---
>>
>>  drivers/cpufreq/cpufreq.c |  144 ++++++++++++++++++++++++++-------------------
>>  1 file changed, 82 insertions(+), 62 deletions(-)
>>
[...]
>> +/**
>> + * cpufreq_add_dev - add a CPU device
>> + *
>> + * Adds the cpufreq interface for a CPU device.
>> + *
>> + * The Oracle says: try running cpufreq registration/unregistration concurrently
>> + * with with cpu hotplugging and all hell will break loose. Tried to clean this
>> + * mess up, but more thorough testing is needed. - Mathieu
>> + */
>> +static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
>> +{
>> +       return __cpufreq_add_dev(dev, sif, true);
>> +}
>> +
>>  static void update_policy_cpu(struct cpufreq_policy *policy, unsigned int cpu)
>>  {
>>         int j;
>> @@ -1106,7 +1116,7 @@ static void update_policy_cpu(struct cpufreq_policy *policy, unsigned int cpu)
>>   * This routine frees the rwsem before returning.
>>   */
>>  static int __cpufreq_remove_dev(struct device *dev,
>> -               struct subsys_interface *sif)
>> +                               struct subsys_interface *sif, bool remove_sysfs)
>>  {
>>         unsigned int cpu = dev->id, ret, cpus;
>>         unsigned long flags;
>> @@ -1145,9 +1155,9 @@ static int __cpufreq_remove_dev(struct device *dev,
>>                 cpumask_clear_cpu(cpu, data->cpus);
>>         unlock_policy_rwsem_write(cpu);
>>
>> -       if (cpu != data->cpu) {
>> +       if (cpu != data->cpu && remove_sysfs) {
>>                 sysfs_remove_link(&dev->kobj, "cpufreq");
>> -       } else if (cpus > 1) {
>> +       } else if (cpus > 1 && remove_sysfs) {
>>                 /* first sibling now owns the new sysfs dir */
>>                 cpu_dev = get_cpu_device(cpumask_first(data->cpus));
>>                 sysfs_remove_link(&cpu_dev->kobj, "cpufreq");
>> @@ -1184,26 +1194,25 @@ static int __cpufreq_remove_dev(struct device *dev,
>>
>>         /* If cpu is last user of policy, free policy */
>>         if (cpus == 1) {
>> -               lock_policy_rwsem_read(cpu);
>> -               kobj = &data->kobj;
>> -               cmp = &data->kobj_unregister;
>> -               unlock_policy_rwsem_read(cpu);
>> -               kobject_put(kobj);
>> -
>> -               /* we need to make sure that the underlying kobj is actually
>> -                * not referenced anymore by anybody before we proceed with
>> -                * unloading.
>> -                */
>> -               pr_debug("waiting for dropping of refcount\n");
>> -               wait_for_completion(cmp);
>> -               pr_debug("wait complete\n");
>> -
>>                 if (cpufreq_driver->exit)
>>                         cpufreq_driver->exit(data);
>>
>>                 free_cpumask_var(data->related_cpus);
>>                 free_cpumask_var(data->cpus);
>>                 kfree(data);
>> +
>> +               if (remove_sysfs) {
>> +                       lock_policy_rwsem_read(cpu);
>> +                       kobj = &data->kobj;
>> +                       cmp = &data->kobj_unregister;
> 
> This looks no right.  "data" has already been freed but data-> kobj still is
> referenced.
> 

Oops! You are right. Hmm, this looks quite difficult to get right :(
There are multiple challenges here:

1. The sysfs files must not be removed during cpu_down, and not initialized

   during cpu_up. That would help us preserve the file permissions.
2. But we should ensure that we really do the cpufreq-core parts of the cpu
   initialization during cpu_up. If we fail to free some of the data-structures
   during cpu_down, the cpu_up callback will think that a full-init is not
   required and not do its job. That will make cpufreq behave erratically after
   suspend/resume and take us back to square one.

3. A full re-init in the cpu_up callback also involves memory allocations.
   So if we don't release the memory in the cpu_down callback, we'll end up
   in a memory leak.

I tried to address all these in this patch, but you found yet another serious
loop-hole. I guess I'm out of ideas now... if anybody has any thoughts on how
to get this right, then I'm all ears. Else, we'll just revert the original
commit like Rafael suggested and leave it upto userspace to save and restore
the permissions across suspend/resume if it wants ;-(

>> +                       unlock_policy_rwsem_read(cpu);
>> +                       kobject_put(kobj);
>> +
>> +                       pr_debug("waiting for dropping of refcount\n");
>> +                       wait_for_completion(cmp);
>> +                       pr_debug("wait complete\n");
>> +               }
>> +
>>         } else if (cpufreq_driver->target) {
>>                 __cpufreq_governor(data, CPUFREQ_GOV_START);
>>                 __cpufreq_governor(data, CPUFREQ_GOV_LIMITS);
>> @@ -1221,7 +1230,7 @@ static int cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif)
>>         if (cpu_is_offline(cpu))
>>                 return 0;
>>
>> -       retval = __cpufreq_remove_dev(dev, sif);
>> +       retval = __cpufreq_remove_dev(dev, sif, true);
>>         return retval;
>>  }

Regards,
Srivatsa S. Bhat

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ