Date:	Fri, 29 May 2015 17:03:57 -0400
From:	Josef Bacik <jbacik@...com>
To:	Peter Zijlstra <peterz@...radead.org>
CC:	<riel@...hat.com>, <mingo@...hat.com>,
	<linux-kernel@...r.kernel.org>, <umgwanakikbuti@...il.com>,
	<morten.rasmussen@....com>, kernel-team <Kernel-team@...com>
Subject: Re: [PATCH RESEND] sched: prefer an idle cpu vs an idle sibling for
 BALANCE_WAKE

On 05/28/2015 07:05 AM, Peter Zijlstra wrote:
>
> So maybe you want something like the below; that cures the thing Morten
> raised, and we continue looking for sd, even after we found affine_sd.
>
> It also avoids the pointless idle_cpu() check Mike raised by making
> select_idle_sibling() return -1 if it doesn't find anything.
>
> Then it continues doing the full balance IFF sd was set, which is keyed
> off of sd->flags.
>
> And note (as Mike already said), BALANCE_WAKE does _NOT_ look for idle
> CPUs, it looks for the least loaded CPU. And it's damn expensive.
>
>
> Rewriting this entire thing is somewhere on the todo list :/
>

Summarizing what I've found so far.

-We turn on SD_BALANCE_WAKE on our domains for our 3.10 boxes, but not 
for our 4.0 boxes (due to some weird configuration issue).
-Running with this patch is better than plain 4.0 but not as good as my 
patch. Running with SD_BALANCE_WAKE set or not set makes no difference 
to the runs.
-I took out the sd = NULL; bit from the affine case like you said on IRC 
and I get results similar to before.
-I'm thoroughly confused as to why my patch did anything, since we 
weren't turning on SD_BALANCE_WAKE on 4.0 in my previous runs (I assume; 
it isn't set now, so I'm pretty sure the problem has always been there). 
So we should have always had sd == NULL, which means we would only 
ever have gotten the task cpu, I guess.

Now I'm looking at the code in select_idle_sibling() and we do this:

for_each_lower_domain(sd) {
         sg = sd->groups;
         do {
                 if (!cpumask_intersects(sched_group_cpus(sg),
                                         tsk_cpus_allowed(p)))
                         goto next;

                 for_each_cpu(i, sched_group_cpus(sg)) {
                         if (i == target || !idle_cpu(i))
                                 goto next;
                 }

                 return cpumask_first_and(sched_group_cpus(sg),
                                 tsk_cpus_allowed(p));
next:
                 sg = sg->next;
         } while (sg != sd->groups);
}

We walk all the scheduling groups of the scheduling domain, and if any 
CPU in a group is not idle, or is the target, then we skip the whole 
scheduling group.  Isn't the scheduling group a group of CPUs?  Why can't 
we pick an idle CPU in a group that has a non-idle CPU or the target CPU? 
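
To make the question concrete, here is a rough user-space model of the 
loop above. It is only a sketch, not kernel code: struct cpu_group, 
pick_fully_idle_group() and pick_any_idle_cpu() are names made up for 
illustration, groups are plain arrays instead of cpumasks, the 
tsk_cpus_allowed() affinity check is left out, and returning -1 when 
nothing qualifies is a choice of this model. The first function follows 
the all-or-nothing group test quoted above; the second is the per-CPU 
alternative I'm asking about.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a sched_group: just a list of CPU ids. */
struct cpu_group {
	const int *cpus;
	size_t nr_cpus;
};

/*
 * Model of the quoted loop: a group qualifies only if every CPU in it
 * is idle and none of them is the target.  Returns the first CPU of
 * the first qualifying group, or -1 if no group qualifies.
 */
int pick_fully_idle_group(const struct cpu_group *groups, size_t nr_groups,
			  const bool *idle, int target)
{
	assert(groups != NULL && idle != NULL);

	for (size_t g = 0; g < nr_groups; g++) {
		bool usable = groups[g].nr_cpus > 0;

		for (size_t k = 0; k < groups[g].nr_cpus; k++) {
			int cpu = groups[g].cpus[k];

			/* One busy CPU (or the target) rejects the whole group. */
			if (cpu == target || !idle[cpu]) {
				usable = false;
				break;
			}
		}
		if (usable)
			return groups[g].cpus[0];
	}
	return -1;
}

/*
 * The alternative being asked about: take the first idle CPU that is
 * not the target, even if its group also holds busy CPUs.
 */
int pick_any_idle_cpu(const struct cpu_group *groups, size_t nr_groups,
		      const bool *idle, int target)
{
	assert(groups != NULL && idle != NULL);

	for (size_t g = 0; g < nr_groups; g++) {
		for (size_t k = 0; k < groups[g].nr_cpus; k++) {
			int cpu = groups[g].cpus[k];

			if (cpu != target && idle[cpu])
				return cpu;
		}
	}
	return -1;
}
```

With two groups {0,1} and {2,3}, target 0, and only CPUs 1 and 2 idle, 
the first function finds nothing (group {0,1} holds the target, group 
{2,3} holds a busy CPU), while the second returns CPU 1.
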
Thanks,

Josef