lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 02 Jul 2020 15:31:07 +0100
From:   Valentin Schneider <valentin.schneider@....com>
To:     Quentin Perret <qperret@...gle.com>
Cc:     linux-kernel@...r.kernel.org, mingo@...nel.org,
        peterz@...radead.org, vincent.guittot@...aro.org,
        dietmar.eggemann@....com, morten.rasmussen@....com
Subject: Re: [PATCH v3 2/7] sched/topology: Define and assign sched_domain flag metadata


Hi Quentin,

Thanks for having a look!

On 02/07/20 13:15, Quentin Perret wrote:
> Hey Valentin,
>
> On Wednesday 01 Jul 2020 at 20:06:50 (+0100), Valentin Schneider wrote:
>> +/*
>> + * Domain members have different CPU capacities
>> + *
>> + * SHARED_PARENT: Set from the topmost domain down to the first domain where
>> + * asymmetry is detected.
>> + */
>> +SD_FLAG(SD_ASYM_CPUCAPACITY,    5, SDF_SHARED_PARENT)
>
> Probably not a huge deal, but I don't think this is _exactly_ how
> SD_ASYM_CPUCAPACITY was defined originally, nor how the topology
> detection code deals with it (IIRC).
>
> That is, IIRC Morten defined SD_ASYM_CPUCAPACITY as the _lowest_ domain
> at which all CPU capacities are visible. On all real systems I can think
> of, this domain also happens to be the topmost domain, so that might not
> be a huge deal and we can probably change that definition to the one you
> suggest. But we should perhaps make the matching changes to
> asym_cpu_capacity_level()/build_sched_domains() if we're going down that
> path?
>

You're spot on, I was trying to sweep some of it under the carpet...

There an "interesting" quirk of asym_cpu_capacity_level() in that it does
something slightly different than what it says on the tin: it detects
the lowest topology level where *the biggest* CPU capacity is visible by
all CPUs. That works just fine on big.LITTLE, but there are questionable
DynamIQ topologies that could hit some issues.

Consider:

DIE [                   ]
MC  [             ][    ] <- sd_asym_cpucapacity
     0   1   2   3  4  5
     L   L   B   B  B  B

asym_cpu_capacity_level() would pick MC as the asymmetric topology level,
and you can argue either way: it should be DIE, because that's where CPUs 4
and 5 can see a LITTLE, or it should be MC, at least for CPUs 0-3 because
there they see all CPU capacities.

I have a plan on how to fix that, but I haven't been made aware of any
"real" topology that would seriously break there. The moment one does, this
will surface up to the top of my todo-list.

In the meantime, we can make it match the SDF_SHARED_PARENT semantics, and
this actually fixes an issue with solo big CPU clusters (which I
anecdotally found out while first writing this series, and forgot to
include):

--->8
From: Valentin Schneider <valentin.schneider@....com>
Date: Wed, 16 Oct 2019 18:12:12 +0100
Subject: [PATCH 1/1] sched/topology: Propagate SD_ASYM_CPUCAPACITY upwards

We currently set this flag *only* on domains whose topology level exactly
match the level where we detect asymmetry (as returned by
asym_cpu_capacity_level()). This is rather problematic.

Say there are two clusters in the system, one with a lone big CPU and the
other with a mix of big and LITTLE CPUs:

DIE [                ]
MC  [             ][ ]
     0   1   2   3  4
     L   L   B   B  B

asym_cpu_capacity_level() will figure out that the MC level is the one
where all CPUs can see a CPU of max capacity, and we will thus set
SD_ASYM_CPUCAPACITY at MC level for all CPUs.

That lone big CPU will degenerate its MC domain, since it would be alone in
there, and will end up with just a DIE domain. Since the flag was only set
at MC, this CPU ends up not seeing any SD with the flag set, which is
broken.

Rather than clearing dflags at every topology level, clear it before
entering the topology level loop. This will properly propagate upwards
flags that are set starting from a certain level.

Signed-off-by: Valentin Schneider <valentin.schneider@....com>
---
 kernel/sched/topology.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index b5667a273bf6..549268249645 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1965,11 +1965,10 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
        /* Set up domains for CPUs specified by the cpu_map: */
        for_each_cpu(i, cpu_map) {
                struct sched_domain_topology_level *tl;
+		int dflags = 0;

                sd = NULL;
                for_each_sd_topology(tl) {
-			int dflags = 0;
-
                        if (tl == tl_asym) {
                                dflags |= SD_ASYM_CPUCAPACITY;
                                has_asym = true;
--
2.27.0

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ