Message-ID: <f3f8e33f-ffba-8724-86b6-f5a9689bb9f3@redhat.com>
Date:   Tue, 25 Apr 2017 11:33:51 -0300
From:   Lauro Venancio <lvenanci@...hat.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     lwang@...hat.com, riel@...hat.com, Mike Galbraith <efault@....de>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...nel.org>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 4/4] sched/topology: the group balance cpu must be a cpu
 where the group is installed

On 04/25/2017 09:17 AM, Peter Zijlstra wrote:
> On Mon, Apr 24, 2017 at 12:11:59PM -0300, Lauro Venancio wrote:
>> On 04/24/2017 10:03 AM, Peter Zijlstra wrote:
>>> On Thu, Apr 20, 2017 at 04:51:43PM -0300, Lauro Ramos Venancio wrote:
>>>
>>>> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
>>>> index e77c93a..694e799 100644
>>>> --- a/kernel/sched/topology.c
>>>> +++ b/kernel/sched/topology.c
>>>> @@ -505,7 +507,11 @@ static void build_group_mask(struct sched_domain *sd, struct sched_group *sg)
>>>>  
>>>>  	for_each_cpu(i, sg_span) {
>>>>  		sibling = *per_cpu_ptr(sdd->sd, i);
>>>> -		if (!cpumask_test_cpu(i, sched_domain_span(sibling)))
>>>> +		if (!cpumask_equal(sg_span, sched_group_cpus(sibling->groups)))
>>>>  			continue;
> Hmm _this_ is what requires us to move the thing to a whole separate
> iteration. Because when we build the groups, the domains are already
> constructed, so that was right.
>
> So the moving crud around wasn't the primary fix, this is.
>
> With the fact that sched_group_cpus(sibling->groups) ==
> sched_domain_span(sibling->child) (if the child exists) established in the
> previous patches, could we not simplify this like the below?
We can. We just need to handle the case where there is no child,
otherwise we will end up with empty masks.
We have to replicate the behavior of build_group_from_child_sched_domain():

if (sd->child)
    cpumask_copy(sg_span, sched_domain_span(sd->child));
else
    cpumask_copy(sg_span, sched_domain_span(sd));


So we need something like:

if (sibling->child)
    gsd = sibling->child;
else
    gsd = sibling;

if (!cpumask_equal(sg_span, sched_domain_span(gsd)))
    continue;
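
Folded into the loop in build_group_mask(), that would look roughly like
this (just a sketch, untested; 'gsd' is only a local name here for the
domain whose span defines the group):

for_each_cpu(i, sg_span) {
    struct sched_domain *sibling, *gsd;

    sibling = *per_cpu_ptr(sdd->sd, i);

    /*
     * Fall back to the sibling itself when it has no child,
     * mirroring build_group_from_child_sched_domain().
     */
    gsd = sibling->child ? sibling->child : sibling;

    if (!cpumask_equal(sg_span, sched_domain_span(gsd)))
        continue;

    cpumask_set_cpu(i, sched_group_mask(sg));
}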


>
> ---
> Subject: sched/topology: Fix overlapping sched_group_mask
> From: Peter Zijlstra <peterz@...radead.org>
> Date: Tue Apr 25 14:00:49 CEST 2017
>
> The point of sched_group_mask is to select those CPUs from
> sched_group_cpus that can actually arrive at this balance domain.
>
> The current code gets it wrong, as can be readily demonstrated with a
> topology like:
>
>   node   0   1   2   3
>     0:  10  20  30  20
>     1:  20  10  20  30
>     2:  30  20  10  20
>     3:  20  30  20  10
>
> Where (for example) domain 1 on CPU1 ends up with a mask that includes
> CPU0:
>
>   [] CPU1 attaching sched-domain:
>   []  domain 0: span 0-2 level NUMA
>   []   groups: 1 (mask: 1), 2, 0
>   []   domain 1: span 0-3 level NUMA
>   []    groups: 0-2 (mask: 0-2) (cpu_capacity: 3072), 0,2-3 (cpu_capacity: 3072)
>
> This causes group_balance_cpu() to compute the wrong CPU and
> consequently should_we_balance() will terminate early resulting in
> missed load-balance opportunities.
>
> The fixed topology looks like:
>
>   [] CPU1 attaching sched-domain:
>   []  domain 0: span 0-2 level NUMA
>   []   groups: 1 (mask: 1), 2, 0
>   []   domain 1: span 0-3 level NUMA
>   []    groups: 0-2 (mask: 1) (cpu_capacity: 3072), 0,2-3 (cpu_capacity: 3072)
>
> Debugged-by: Lauro Ramos Venancio <lvenanci@...hat.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
> ---
>  kernel/sched/topology.c |   11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
>
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -495,6 +495,9 @@ enum s_alloc {
>  /*
>   * Build an iteration mask that can exclude certain CPUs from the upwards
>   * domain traversal.
> + *
> + * Only CPUs that can arrive at this group should be considered to continue
> + * balancing.
>   */
>  static void build_group_mask(struct sched_domain *sd, struct sched_group *sg)
>  {
> @@ -505,7 +508,13 @@ static void build_group_mask(struct sche
>  
>  	for_each_cpu(i, sg_span) {
>  		sibling = *per_cpu_ptr(sdd->sd, i);
> -		if (!cpumask_test_cpu(i, sched_domain_span(sibling)))
> +
> +		/* overlap should have children; except for FORCE_SD_OVERLAP */
> +		if (WARN_ON_ONCE(!sibling->child))
> +			continue;
> +
> +		/* If we would not end up here, we can't continue from here */
> +		if (!cpumask_equal(sg_span, sched_domain_span(sibling->child)))
>  			continue;
>  
>  		cpumask_set_cpu(i, sched_group_mask(sg));
>
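
For reference, the balance cpu is derived from the mask; if I read the
current code right, it is essentially:

int group_balance_cpu(struct sched_group *sg)
{
    return cpumask_first_and(sched_group_cpus(sg), sched_group_mask(sg));
}

so a mask that wrongly includes CPU0 can make should_we_balance() treat
CPU0 as the balance cpu for CPU1's group, and the other CPUs bail out
early.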
