lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 16 Jul 2013 14:26:52 +0530
From:	"Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com>
To:	Viresh Kumar <viresh.kumar@...aro.org>
CC:	rjw@...k.pl, toralf.foerster@....de, robert.jarzmik@...el.com,
	durgadoss.r@...el.com, tianyu.lan@...el.com,
	lantianyu1986@...il.com, dirk.brandewie@...il.com,
	stern@...land.harvard.edu, linux-pm@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 7/8] cpufreq: Preserve policy structure across suspend/resume

On 07/16/2013 11:45 AM, Viresh Kumar wrote:
> On 15 July 2013 15:35, Srivatsa S. Bhat
> <srivatsa.bhat@...ux.vnet.ibm.com> wrote:
>> Actually even I was wondering about this while writing the patch and
>> I even tested shutdown after multiple suspend/resume cycles, to verify that
>> the refcount is messed up. But surprisingly, things worked just fine.
>>
>> Logically there should've been a refcount mismatch and things should have
>> failed, but everything worked fine during my tests. Apart from suspend/resume
>> and shutdown tests, I even tried mixing a few regular CPU hotplug operations
>> (echo 0/1 to sysfs online files), but nothing stood out.
>>
>> Sorry, I forgot to document this in the patch. Either the patch is wrong
>> or something else is silently fixing this up. Not sure what is the exact
>> situation.
> 
> To understand it I actually applied your patches to get better view of the code.
> (Haven't tested it though).. And found that your code is doing the right thing
> and we shouldn't get a mismatch.. This is the sequence of events I can draw:
> 
> - __cpu_add_dev() for first cpu. sets the refcount to 'x', where x are
> the no. of
> cpus in its clock domain.
> - _cpu_add_dev() for other cpus: doesn't change anything in refcount
> 
> - Suspend:
>  - cpu_remove_dev() for all cpus, due to frozen flag we don't touch the value
> of count
> - Resume:
>  - cpu_add_dev() for all cpus, due to frozen flag we don't touch the
> value of count.
>

Actually this one is tricky (I took a look again). So we have this code in the
beginning of _cpufreq_add_dev():


1008 #ifdef CONFIG_SMP
1009         /* check whether a different CPU already registered this
1010          * CPU because it is in the same boat. */
1011         policy = cpufreq_cpu_get(cpu);
1012         if (unlikely(policy)) {
1013                 cpufreq_cpu_put(policy);
1014                 return 0;
1015         }

The _get() is not controlled by the frozen flag, but it still doesn't take a
refcount because of a subtle reason: per_cpu(cpufreq_cpu_data, cpu) was set to
NULL in __cpufreq_remove_dev() and the memory was saved away in fallback storage.
So, when __cpufreq_cpu_get() executes, it sees:

 204         /* get the CPU */
 205         data = per_cpu(cpufreq_cpu_data, cpu);
 206 
 207         if (!data)
 208                 goto err_out_put_module;

Thus, since data is NULL, cpufreq_cpu_get() won't take a refcount and will return
silently.

Further down in __cpufreq_add_dev(), we restore the original memory, using
the frozen flag:

1037         if (frozen)
1038                 /* Restore the saved policy when doing light-weight init */
1039                 policy = cpufreq_policy_restore(cpu);
1040         else
1041                 policy = cpufreq_policy_alloc();


So that is how we manage to fool cpufreq_cpu_get() into not taking a fresh
refcount while resuming :)
 
> And so things work as expected. That's why your code isn't breaking anything I
> believe.
> 

Thanks a lot for the code inspection and your detailed analysis!

> But can no. of cpus change inbetween suspend/resume? Then count would be
> tricky as we are using the same policy structure.
> 

No, number of CPUs won't change in between suspend/resume. Even if somebody
tried that, that would be an eccentric case and we won't handle that.
Besides, *many more* things will break than just cpufreq, if somebody actually
tries that out!

Regards,
Srivatsa S. Bhat

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ