Date:	Fri, 7 Dec 2012 09:33:09 +0100
From:	Frederic Weisbecker <fweisbec@...il.com>
To:	Alex Shi <alex.shi@...el.com>
Cc:	npiggin@...nel.dk, mingo@...hat.com, peterz@...radead.org,
	linux-kernel@...r.kernel.org, akpm@...ux-foundation.org,
	Mike Galbraith <efault@....de>
Subject: Re: [PATCH 02/10] sched: fix find_idlest_group mess logical

2012/12/7 Alex Shi <alex.shi@...el.com>:
> On 12/07/2012 08:56 AM, Frederic Weisbecker wrote:
>> 2012/12/3 Alex Shi <alex.shi@...el.com>:
>>> There are 4 situations in the function:
>>> 1, no group allows the task;
>>>         so min_load = ULONG_MAX, this_load = 0, idlest = NULL
>>> 2, only the local group allows the task;
>>>         so min_load = ULONG_MAX, this_load assigned, idlest = NULL
>>> 3, only a non-local group allows the task;
>>>         so min_load assigned, this_load = 0, idlest != NULL
>>> 4, the local group and another group both allow the task;
>>>         so min_load assigned, this_load assigned, idlest != NULL
>>>
>>> The current logic returns NULL in the first 3 scenarios.
>>> It still returns NULL in the 4th situation if the idlest group is
>>> heavier than the local group.
>>>
>>> Actually, I think the groups in situations 2 and 3 are also eligible to
>>> host the task. And in the 4th situation, I agree to bias toward the
>>> local group. Hence this patch.
>>
>> The way I understand the loop that uses this in select_task_rq_fair() is:
>>
>> a) start from the highest domain level in which we are allowed to
>> migrate the task
>> b) from that top level domain, find the idlest group. If the idlest
>> group contains the current CPU, zoom into the child domain and repeat b).
>> If the idlest group doesn't contain the current CPU, pick the idlest CPU
>> from that group.
>> c) In the end, if we found no target idler than the current CPU, then
>> take it.
>>
>> So if you also return a group that contains the current CPU from
>> find_idlest_group(), you don't recursively zoom into the child domain
>> anymore. find_idlest_cpu() will fix that for you but it may come with
>> some cost because now it iterates through every CPU, or maybe half
>> of them.
>
> Not exactly, the old logic won't select a cpu from the groups of
> situations 2 and 3. That is wrong, and may cause the task to stay on
> prev_cpu even when other idler and allowed cpus exist.

For situation 2 I don't understand the issue. If the current CPU belongs
to the idlest group, we want to zoom in our lookup until we find a group
idler than the current CPU's, right? If we eventually don't find one,
then we fall back to the current CPU, don't we?
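
The descent described above can be sketched as a toy model. This is only
an illustration of the three steps as worded in this thread, not the
kernel's select_task_rq_fair(): the toy_* types and functions, the 4-CPU
topology and the plain "strictly less loaded" comparison are all
assumptions made for the example.

```c
#include <limits.h>
#include <stddef.h>

#define TOY_NR_CPUS 4   /* hypothetical machine size */

/* Toy types, not the kernel's struct sched_group / struct sched_domain. */
struct toy_group {
    unsigned int cpus;               /* bitmask of CPUs in this group */
};

struct toy_domain {
    const struct toy_group *groups;
    size_t nr_groups;
    const struct toy_domain *child;  /* next lower level, NULL at the leaf */
};

/* Average load over the CPUs set in mask. */
static unsigned long toy_avg_load(unsigned int mask, const unsigned long *load)
{
    unsigned long sum = 0, n = 0;

    for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
        if (mask & (1u << cpu)) {
            sum += load[cpu];
            n++;
        }
    }
    return n ? sum / n : ULONG_MAX;
}

/* Return the idlest non-local group, or NULL if the local group is at
 * least as idle (NULL means "zoom into the child domain"). */
static const struct toy_group *
toy_find_idlest_group(const struct toy_domain *sd, int this_cpu,
                      const unsigned long *load)
{
    const struct toy_group *idlest = NULL;
    unsigned long min_load = ULONG_MAX, this_load = 0;

    for (size_t i = 0; i < sd->nr_groups; i++) {
        const struct toy_group *g = &sd->groups[i];
        unsigned long avg = toy_avg_load(g->cpus, load);

        if (g->cpus & (1u << this_cpu)) {
            this_load = avg;
        } else if (avg < min_load) {
            min_load = avg;
            idlest = g;
        }
    }
    if (!idlest || this_load <= min_load)
        return NULL;
    return idlest;
}

/* Least loaded CPU of a group, or -1 if the group is empty. */
static int toy_find_idlest_cpu(const struct toy_group *g,
                               const unsigned long *load)
{
    int idlest = -1;
    unsigned long min_load = ULONG_MAX;

    for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
        if ((g->cpus & (1u << cpu)) && load[cpu] < min_load) {
            min_load = load[cpu];
            idlest = cpu;
        }
    }
    return idlest;
}

/* Steps a) to c): descend while the local group wins, otherwise pick a
 * CPU from the idler group; fall back to this_cpu at the bottom. */
int toy_select_cpu(const struct toy_domain *sd, int this_cpu,
                   const unsigned long *load)
{
    while (sd) {
        const struct toy_group *g = toy_find_idlest_group(sd, this_cpu, load);

        if (!g) {
            sd = sd->child;
            continue;
        }
        int cpu = toy_find_idlest_cpu(g, load);
        return cpu >= 0 ? cpu : this_cpu;
    }
    return this_cpu;
}

/* Hypothetical topology seen from CPU 0: {0,1} vs {2,3}, then {0} vs {1}. */
static const struct toy_group toy_leaf_groups[] = { { 0x1 }, { 0x2 } };
static const struct toy_domain toy_leaf = { toy_leaf_groups, 2, NULL };
static const struct toy_group toy_top_groups[] = { { 0x3 }, { 0xc } };
static const struct toy_domain toy_top = { toy_top_groups, 2, &toy_leaf };
```

With per-CPU loads {5, 1, 4, 4} and this_cpu = 0, the top level keeps the
local group, the child level finds CPU 1 idler, and the model returns 1.
With loads {1, 5, 4, 4} nothing idler is found at any level and the model
falls back to CPU 0.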

I just have one doubt to express. What does the final leaf child domain
look like? Is it made of the current CPU only, or can it contain other
siblings? In the first case we are fine. In the second one, if this
domain is made of only one group of several CPUs, we are skipping the
find_idlest_cpu() call for that group and choosing this_cpu by default,
which is probably suboptimal?

Concerning situation 3, if this_cpu is not a CPU allowed by the task,
we may indeed have an issue because find_idlest_group() doesn't seem
to select non-local groups in this case. But your current fix still
breaks the recursive find_idlest_group() descent in the other cases and
may not scale with a big number of CPUs.
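
To make the four situations concrete, here is a small stand-alone model
of the selection as it is described in this thread. It is a sketch under
assumptions, not the kernel function: the model_* names, the precomputed
avg_load field, the example loads and the imbalance percentage passed in
by the caller are all hypothetical, and -1 stands in for a NULL return.

```c
#include <limits.h>
#include <stddef.h>

/* Toy group: a CPU bitmask plus an already computed average load. */
struct model_group {
    unsigned int cpus;
    unsigned long avg_load;
};

/*
 * Returns the index of the chosen non-local group, or -1 (standing in
 * for NULL) when the caller should stay with the local group.
 *   allowed   - bitmask of CPUs the task may run on
 *   imbalance - percentage bias toward the local group (assumed value)
 */
int model_find_idlest_group(const struct model_group *groups, size_t n,
                            int this_cpu, unsigned int allowed,
                            unsigned long imbalance)
{
    int idlest = -1;
    unsigned long min_load = ULONG_MAX, this_load = 0;

    for (size_t i = 0; i < n; i++) {
        /* Skip groups with no CPU allowed for the task. */
        if (!(groups[i].cpus & allowed))
            continue;

        if (groups[i].cpus & (1u << this_cpu)) {
            this_load = groups[i].avg_load;
        } else if (groups[i].avg_load < min_load) {
            min_load = groups[i].avg_load;
            idlest = (int)i;
        }
    }

    /*
     * Situations 1 and 2: idlest was never set, so the result is -1.
     * Situation 3: this_load is still 0, so for any nonzero min_load the
     * comparison below holds and the result is -1 as well.
     * Situation 4: the non-local group is returned only if it is
     * sufficiently lighter than the local one.
     */
    if (idlest < 0 || 100 * this_load < imbalance * min_load)
        return -1;
    return idlest;
}

/* Hypothetical example data: CPUs {0,1} and {2,3}. */
static const struct model_group model_groups[] = {
    { 0x3, 10 },
    { 0xc, 4 },
};
static const struct model_group model_balanced[] = {
    { 0x3, 4 },
    { 0xc, 4 },
};
```

Calling model_find_idlest_group(model_groups, 2, 0, 0xc, 112) models
situation 3: CPU 0 is not in the allowed mask, the only eligible group is
non-local, and the model still returns -1.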