lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <a16ade5e-2584-af75-ac60-c352edf1d7ec@linaro.org>
Date:   Tue, 13 Mar 2018 20:15:36 +0100
From:   Daniel Lezcano <daniel.lezcano@...aro.org>
To:     Viresh Kumar <viresh.kumar@...aro.org>
Cc:     edubezval@...il.com, kevin.wangtao@...aro.org, leo.yan@...aro.org,
        vincent.guittot@...aro.org, amit.kachhap@...il.com,
        linux-kernel@...r.kernel.org, javi.merino@...nel.org,
        rui.zhang@...el.com, daniel.thompson@...aro.org,
        linux-pm@...r.kernel.org
Subject: Re: [PATCH V2 6/7] thermal/drivers/cpu_cooling: Introduce the cpu
 idle cooling driver

On 26/02/2018 05:30, Viresh Kumar wrote:

[ ... ]

>>>> +		/*
>>>> +		 * The last CPU waking up is in charge of setting the
>>>> +		 * timer. If the CPU is hotplugged, the timer will
>>>> +		 * move to another CPU (which may not belong to the
>>>> +		 * same cluster) but that is not a problem as the
>>>> +		 * timer will be set again by another CPU belonging to
>>>> +		 * the cluster, so this mechanism is self adaptive and
>>>> +		 * does not require any hotplugging dance.
>>>> +		 */
>>>
>>> Well this depends on how CPU hotplug really happens. What happens to
>>> the per-cpu-tasks which are in the middle of something when hotplug
>>> happens?  Does hotplug wait for those per-cpu-tasks to finish ?
> 
> Missed this one ?

To be frank, I don't know. I have been through the hp code and didn't
find the answer. AFAICT, the sched/core.c sched_cpu_dying() disables the
rq and then migrates the tasks. I assume this kthread is not migrated
and stays in sleeping state until the rq is online again. As the wakeup
callback discards offline cpus, the kthread is not wakeup at any moment.

I created a situation where that happens: mitigation (raised with
dhrystone) + cpu hotplugging (offline/online loop) and never hit any issue.

However, I'm wondering if using the 'struct smp_hotplug_thread'
infra-stucture (git show f97f8f06) shouldn't be used.


>>>> +int cpuidle_cooling_register(void)
>>>> +{
>>>> +	struct cpuidle_cooling_device *idle_cdev = NULL;
>>>> +	struct thermal_cooling_device *cdev;
>>>> +	struct cpuidle_cooling_tsk *cct;
>>>> +	struct task_struct *tsk;
>>>> +	struct device_node *np;
>>>> +	cpumask_t *cpumask;
>>>> +	char dev_name[THERMAL_NAME_LENGTH];
>>>> +	int ret = -ENOMEM, cpu;
>>>> +	int index = 0;
>>>> +
>>>> +	for_each_possible_cpu(cpu) {
>>>> +		cpumask = topology_core_cpumask(cpu);
>>>> +
>>>> +		cct = per_cpu_ptr(&cpuidle_cooling_tsk, cpu);
>>>> +
>>>> +		/*
>>>> +		 * This condition makes the first cpu belonging to the
>>>> +		 * cluster to create a cooling device and allocates
>>>> +		 * the structure. Others CPUs belonging to the same
>>>> +		 * cluster will just increment the refcount on the
>>>> +		 * cooling device structure and initialize it.
>>>> +		 */
>>>> +		if (cpu == cpumask_first(cpumask)) {
>>>
>>> Your function still have few assumptions of cpu numbering and it will
>>> break in few cases. What if the CPUs on a big Little system (4x4) are
>>> present in this order: B L L L L B B B  ??
>>>
>>> This configuration can happen if CPUs in DT are marked as: 0-3 LITTLE,
>>> 4-7 big and a big CPU is used by the boot loader to bring up Linux.
>>
>> Ok, how can I sort it out ?
> 
> I would do something like this:
> 
>         cpumask_copy(possible, cpu_possible_mask);
>         
>         while (!cpumask_empty(possible)) {
>                 first = cpumask_first(possible);
>                 cpumask = topology_core_cpumask(first);
>                 cpumask_andnot(possible, possible, cpumask);
>         
>                 allocate_cooling_dev(first); //This is most of this function in your patch.
>         
>                 while (!cpumask_empty(cpumask)) {
>                         temp = cpumask_first(possible);
>                         //rest init "temp"
>                         cpumask_clear_cpu(temp, cpumask);
>                 }
>         
>                 //Everything done, register cooling device for cpumask.
>         }
> 
>>>> +			np = of_cpu_device_node_get(cpu);
>>>> +
>>>> +			idle_cdev = kzalloc(sizeof(*idle_cdev), GFP_KERNEL);
>>>> +			if (!idle_cdev)
>>>> +				goto out_fail;
>>>> +
>>>> +			idle_cdev->idle_cycle = DEFAULT_IDLE_TIME_US;
>>>> +
>>>> +			atomic_set(&idle_cdev->count, 0);
>>>
>>> This should already be 0, isn't it ?
>>
>> Yes.
> 
> I read it as, "I will drop it" :)
> 


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ