Date:   Wed, 25 Mar 2020 17:53:56 +0000
From:   Valentin Schneider <valentin.schneider@....com>
To:     linux-kernel@...r.kernel.org
Cc:     peterz@...radead.org, mingo@...nel.org, vincent.guittot@...aro.org,
        dietmar.eggemann@....com, morten.rasmussen@....com,
        mgorman@...hsingularity.net
Subject: Re: [PATCH] sched/topology: Fix overlapping sched_group build


On Tue, Mar 24 2020, Valentin Schneider wrote:
>  kernel/sched/topology.c | 23 ++++++++++++++++++++---
>  1 file changed, 20 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index 8344757bba6e..7033b27e5162 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -866,7 +866,7 @@ build_balance_mask(struct sched_domain *sd, struct sched_group *sg, struct cpuma
>                       continue;
>
>               /* If we would not end up here, we can't continue from here */
> -		if (!cpumask_equal(sg_span, sched_domain_span(sibling->child)))
> +		if (!cpumask_subset(sg_span, sched_domain_span(sibling->child)))

So this is one source of issues; what I've done here is a bit stupid
since we include CPUs that *cannot* end up there. What I should've done
is something like:

  cpumask_and(tmp, sched_domain_span(sibling->child), sched_domain_span(sd));
  if (!cpumask_equal(sg_span, tmp))
      ...
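
To make the difference between the three checks concrete, here is a minimal userspace sketch. It is not kernel code: `mask_t`, `cpu_range()` and the `check_*()` helpers are made-up stand-ins for `struct cpumask`, `cpumask_equal()`, `cpumask_subset()` and `cpumask_and()`, and the spans used with it below are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for struct cpumask: bit n set means CPU n is in the mask. */
typedef uint32_t mask_t;

/* Build a mask covering CPUs lo..hi inclusive (hypothetical helper). */
static mask_t cpu_range(unsigned lo, unsigned hi)
{
	assert(lo <= hi && hi < 32);
	mask_t upto_hi = (hi == 31) ? UINT32_MAX : ((UINT32_C(1) << (hi + 1)) - 1);
	mask_t below_lo = (UINT32_C(1) << lo) - 1;
	return upto_hi & ~below_lo;
}

/* Original check: the group span must match the sibling's child span. */
static bool check_equal(mask_t sg_span, mask_t child_span)
{
	return sg_span == child_span;
}

/* Check from the patch: the group span only has to fit inside the child span. */
static bool check_subset(mask_t sg_span, mask_t child_span)
{
	return (sg_span & ~child_span) == 0;
}

/* Suggested check: clip the child span to the domain span, then compare. */
static bool check_clipped(mask_t sg_span, mask_t child_span, mask_t sd_span)
{
	mask_t tmp = child_span & sd_span;
	return sg_span == tmp;
}
```

With a group span of 4-5, a child span of 4-7 and an assumed domain span of 0-5, the equality check rejects the sibling while the subset and clipped checks both accept it. With a group span of just CPU 4 and a child span of 4-5, the subset check still accepts the sibling but the clipped check rejects it, which is the extra strictness the intersection is meant to bring back.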

But even with that I just uncover more horrors: it breaks the
overlapping sched_group_capacity handling (see 1676330ecfa8
("sched/topology: Fix overlapping sched_group_capacity")).

For instance, here I would have

  CPU0-domain2-group4: span=4-5
  CPU4-domain2-group4: span=4-7 mask=4-5

Both groups are at the same topology level and have the same first CPU,
so they point to the same sched_group_capacity structure - but they
don't have the same span. They would without my "fix", but then the
group spans are back to being wrong. I'm starting to think this is
doomed, at least in the current state of things :/
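
The aliasing described above can be modelled in a few lines of userspace C. This is only a toy: `toy_group`, `toy_capacity`, `attach()` and `update_capacity()` are invented names, the per-level array indexed by first CPU is a simplification of what the message describes rather than the kernel's actual selection logic, and the "1024 per CPU" capacity is an assumption made purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 8

typedef uint32_t mask_t;

/* Toy stand-ins for sched_group_capacity and sched_group. */
struct toy_capacity { unsigned long capacity; };
struct toy_group { mask_t span; struct toy_capacity *sgc; };

/* One capacity slot per CPU for a single topology level. */
static struct toy_capacity level_sgc[NR_CPUS];

/* Index of the lowest set bit, i.e. the first CPU in the span. */
static unsigned first_cpu(mask_t m)
{
	unsigned cpu = 0;

	assert(m != 0);
	while (!(m & 1u)) {
		m >>= 1;
		cpu++;
	}
	return cpu;
}

/* Number of CPUs in the span. */
static unsigned weight(mask_t m)
{
	unsigned n = 0;

	for (; m; m >>= 1)
		n += m & 1u;
	return n;
}

/* Attach a group to the capacity slot keyed by its first CPU. */
static void attach(struct toy_group *g, mask_t span)
{
	unsigned cpu = first_cpu(span);

	assert(cpu < NR_CPUS);
	g->span = span;
	g->sgc = &level_sgc[cpu];
}

/* Assumed toy rule: capacity is 1024 per CPU in the group's own span. */
static void update_capacity(struct toy_group *g)
{
	g->sgc->capacity = 1024UL * weight(g->span);
}
```

Attaching one group with span 4-5 and another with span 4-7 gives both the same slot, so whichever group updates last overwrites the capacity the other one reads, even though their spans differ.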
