Message-ID: <20200702154514.GA1072702@google.com>
Date:   Thu, 2 Jul 2020 16:45:14 +0100
From:   Quentin Perret <qperret@...gle.com>
To:     Valentin Schneider <valentin.schneider@....com>
Cc:     linux-kernel@...r.kernel.org, mingo@...nel.org,
        peterz@...radead.org, vincent.guittot@...aro.org,
        dietmar.eggemann@....com, morten.rasmussen@....com
Subject: Re: [PATCH v3 2/7] sched/topology: Define and assign sched_domain
 flag metadata

On Thursday 02 Jul 2020 at 15:31:07 (+0100), Valentin Schneider wrote:
> There is an "interesting" quirk of asym_cpu_capacity_level() in that it does
> something slightly different than what it says on the tin: it detects
> the lowest topology level where *the biggest* CPU capacity is visible by
> all CPUs. That works just fine on big.LITTLE, but there are questionable
> DynamIQ topologies that could hit some issues.
> 
> Consider:
> 
> DIE [                   ]
> MC  [             ][    ] <- sd_asym_cpucapacity
>      0   1   2   3  4  5
>      L   L   B   B  B  B
> 
> asym_cpu_capacity_level() would pick MC as the asymmetric topology level,
> and you can argue either way: it should be DIE, because that's where CPUs 4
> and 5 can see a LITTLE, or it should be MC, at least for CPUs 0-3 because
> there they see all CPU capacities.

Right, I am not looking forward to these topologies...
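
To make the two readings above concrete, here is a toy userspace model of the 6-CPU example. It is only a sketch: the span masks, the capacity values (446 for a LITTLE is a made-up number) and both toy_* helpers are hypothetical and are not the kernel's asym_cpu_capacity_level() implementation. It also assumes the system is known to be asymmetric.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS   6
#define NR_LEVELS 2	/* 0 = MC, 1 = DIE */

/* Hypothetical topology: span[level][cpu] is a bitmask of the CPUs that
 * share a domain with 'cpu' at that level. */
static const unsigned int span[NR_LEVELS][NR_CPUS] = {
	{ 0x0f, 0x0f, 0x0f, 0x0f, 0x30, 0x30 },	/* MC: 0-3 and 4-5 */
	{ 0x3f, 0x3f, 0x3f, 0x3f, 0x3f, 0x3f },	/* DIE: all CPUs   */
};

/* Made-up capacities: CPUs 0-1 are LITTLE, CPUs 2-5 are big. */
static const unsigned int capacity[NR_CPUS] = {
	446, 446, 1024, 1024, 1024, 1024
};

static unsigned int max_capacity(void)
{
	unsigned int max = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (capacity[cpu] > max)
			max = capacity[cpu];
	return max;
}

/* Does any CPU in 'mask' have capacity 'cap'? */
static int span_has_capacity(unsigned int mask, unsigned int cap)
{
	assert((size_t)NR_CPUS <= 8 * sizeof(mask));

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if ((mask & (1u << cpu)) && capacity[cpu] == cap)
			return 1;
	return 0;
}

/* Reading 1: lowest level where every CPU can see a max-capacity CPU. */
static int toy_level_max_visible(void)
{
	unsigned int max = max_capacity();

	for (int lvl = 0; lvl < NR_LEVELS; lvl++) {
		int ok = 1;

		for (int cpu = 0; cpu < NR_CPUS; cpu++)
			if (!span_has_capacity(span[lvl][cpu], max))
				ok = 0;
		if (ok)
			return lvl;
	}
	return -1;
}

/* Reading 2: lowest level where every CPU can see every capacity value. */
static int toy_level_all_visible(void)
{
	for (int lvl = 0; lvl < NR_LEVELS; lvl++) {
		int ok = 1;

		for (int cpu = 0; cpu < NR_CPUS; cpu++)
			for (int other = 0; other < NR_CPUS; other++)
				if (!span_has_capacity(span[lvl][cpu],
						       capacity[other]))
					ok = 0;
		if (ok)
			return lvl;
	}
	return -1;
}
```

In this model the first reading picks MC and the second picks DIE, because CPUs 4 and 5 have no LITTLE in their MC span.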

> I have a plan on how to fix that, but I haven't been made aware of any
> "real" topology that would seriously break there. The moment one does, this
> will surface up to the top of my todo-list.
> 
> In the meantime, we can make it match the SDF_SHARED_PARENT semantics, and
> this actually fixes an issue with solo big CPU clusters (which I
> anecdotally found out while first writing this series, and forgot to
> include):
> 
> --->8
> From: Valentin Schneider <valentin.schneider@....com>
> Date: Wed, 16 Oct 2019 18:12:12 +0100
> Subject: [PATCH 1/1] sched/topology: Propagate SD_ASYM_CPUCAPACITY upwards
> 
> We currently set this flag *only* on domains whose topology level exactly
> matches the level where we detect asymmetry (as returned by
> asym_cpu_capacity_level()). This is rather problematic.
> 
> Say there are two clusters in the system, one with a lone big CPU and the
> other with a mix of big and LITTLE CPUs:
> 
> DIE [                ]
> MC  [             ][ ]
>      0   1   2   3  4
>      L   L   B   B  B
> 
> asym_cpu_capacity_level() will figure out that the MC level is the one
> where all CPUs can see a CPU of max capacity, and we will thus set
> SD_ASYM_CPUCAPACITY at MC level for all CPUs.
> 
> That lone big CPU will degenerate its MC domain, since it would be alone in
> there, and will end up with just a DIE domain. Since the flag was only set
> at MC, this CPU ends up not seeing any SD with the flag set, which is
> broken.

+1

> Rather than clearing dflags at every topology level, clear it before
> entering the topology level loop. This will properly propagate upwards
> flags that are set starting from a certain level.
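
The effect of moving the dflags initialisation can be sketched with a toy userspace model of the 5-CPU example from the changelog. Everything here is an assumption made for illustration: the span masks, the SD_ASYM stand-in value, the TL_ASYM index and the "single-CPU domain is dropped" rule, which is a simplification of the kernel's real degeneration logic.

```c
#include <assert.h>

#define NR_CPUS   5
#define NR_LEVELS 2	/* 0 = MC, 1 = DIE */
#define TL_ASYM   0	/* level where asymmetry was detected (MC here) */
#define SD_ASYM   0x1	/* stand-in bit, not the kernel's flag value */

/* Hypothetical topology: CPUs 0-3 share an MC domain, CPU 4 is alone. */
static const unsigned int span[NR_LEVELS][NR_CPUS] = {
	{ 0x0f, 0x0f, 0x0f, 0x0f, 0x10 },	/* MC  */
	{ 0x1f, 0x1f, 0x1f, 0x1f, 0x1f },	/* DIE */
};

/* True if exactly one bit is set in 'mask'. */
static int single_cpu(unsigned int mask)
{
	return mask && !(mask & (mask - 1));
}

/*
 * Returns 1 if 'cpu' ends up with at least one surviving domain carrying
 * SD_ASYM. With propagate == 0, dflags is cleared at every level (the
 * behaviour before the patch); with propagate == 1 it is cleared once
 * per CPU, before the level loop (the behaviour after the patch).
 */
static int cpu_sees_asym_flag(int cpu, int propagate)
{
	int dflags = 0;
	int seen = 0;

	assert(cpu >= 0 && cpu < NR_CPUS);

	for (int lvl = 0; lvl < NR_LEVELS; lvl++) {
		if (!propagate)
			dflags = 0;

		if (lvl == TL_ASYM)
			dflags |= SD_ASYM;

		/* Simplification: a single-CPU domain degenerates. */
		if (single_cpu(span[lvl][cpu]))
			continue;

		if (dflags & SD_ASYM)
			seen = 1;
	}
	return seen;
}
```

In this model cpu_sees_asym_flag(4, 0) is 0, since CPU 4 only keeps its DIE domain and the flag was dropped there, while cpu_sees_asym_flag(4, 1) is 1.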

I'm feeling a bit nervous about that asymmetry -- in your example
select_idle_capacity() on, say, CPU3 will see fewer CPUs than on CPU4.
So, you might get fun side-effects where all tasks migrated to CPUs 0-3
will be 'stuck' there while CPU 4 stays mostly idle.

I have a few ideas to avoid that (e.g. looking at the rd span in
select_idle_capacity() instead of sd_asym_cpucapacity) but all this is
theoretical, so I'm happy to wait for a real platform to be released
before we worry too much about it.
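
A rough illustration of that asymmetry, as a toy userspace model rather than kernel code: assuming the per-CPU sd_asym_cpucapacity pointer ends up at the lowest surviving domain that carries the flag, and reusing the hypothetical 5-CPU topology with the flag propagated upwards from MC, the span each CPU would search looks like this. The masks, the TL_ASYM index and toy_asym_search_span() are all made up for the example.

```c
#include <assert.h>

#define NR_CPUS   5
#define NR_LEVELS 2	/* 0 = MC, 1 = DIE */
#define TL_ASYM   0	/* flag is set from MC upwards after the patch */

/* Hypothetical topology: CPUs 0-3 share an MC domain, CPU 4 is alone. */
static const unsigned int span[NR_LEVELS][NR_CPUS] = {
	{ 0x0f, 0x0f, 0x0f, 0x0f, 0x10 },	/* MC  */
	{ 0x1f, 0x1f, 0x1f, 0x1f, 0x1f },	/* DIE */
};

/* True if exactly one bit is set in 'mask'. */
static int single_cpu(unsigned int mask)
{
	return mask && !(mask & (mask - 1));
}

/*
 * Span of the lowest surviving flagged domain for 'cpu', i.e. the set of
 * CPUs a wakeup on 'cpu' would consider in this model. Single-CPU domains
 * are treated as degenerate and skipped. Returns 0 if none is found.
 */
static unsigned int toy_asym_search_span(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);

	for (int lvl = TL_ASYM; lvl < NR_LEVELS; lvl++) {
		if (single_cpu(span[lvl][cpu]))
			continue;
		return span[lvl][cpu];
	}
	return 0;
}
```

Here CPU 3 gets the mask 0x0f (CPUs 0-3) while CPU 4 gets 0x1f (CPUs 0-4), so a wakeup on CPUs 0-3 never considers CPU 4 in this model.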

In the meantime:

Reviewed-by: Quentin Perret <qperret@...gle.com>
> Signed-off-by: Valentin Schneider <valentin.schneider@....com>
> ---
>  kernel/sched/topology.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index b5667a273bf6..549268249645 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -1965,11 +1965,10 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
>         /* Set up domains for CPUs specified by the cpu_map: */
>         for_each_cpu(i, cpu_map) {
>                 struct sched_domain_topology_level *tl;
> +		int dflags = 0;
> 
>                 sd = NULL;
>                 for_each_sd_topology(tl) {
> -			int dflags = 0;
> -
>                         if (tl == tl_asym) {
>                                 dflags |= SD_ASYM_CPUCAPACITY;
>                                 has_asym = true;
> --
> 2.27.0

Thanks,
Quentin
