linux-kernel - Re: [PATCH] sched/topology: Assert non-NUMA topology masks don't (partially) overlap


Message-ID: <20200116104428.GP2827@hirez.programming.kicks-ass.net>
Date:   Thu, 16 Jan 2020 11:44:28 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Valentin Schneider <valentin.schneider@....com>
Cc:     linux-kernel@...r.kernel.org, sudeep.holla@....com,
        prime.zeng@...ilicon.com, dietmar.eggemann@....com,
        morten.rasmussen@....com, mingo@...nel.org
Subject: Re: [PATCH] sched/topology: Assert non-NUMA topology masks don't
 (partially) overlap

On Wed, Jan 15, 2020 at 04:09:15PM +0000, Valentin Schneider wrote:
> A "less intrusive" alternative is to assert the sd->groups list doesn't get
> re-written, which is a symptom of such bogus topologies. I've briefly
> tested this, you can have a look at it here:
> 
>   http://www.linux-arm.org/git?p=linux-vs.git;a=commit;h=e0ead72137332cbd3d69c9055ab29e6ffae5b37b

Something like that might still make sense. You can never be too careful,
right ;-)

>  kernel/sched/topology.c | 39 +++++++++++++++++++++++++++++++++++++++
>  1 file changed, 39 insertions(+)
> 
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index 6ec1e595b1d4..dfb64c08a407 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -1879,6 +1879,42 @@ static struct sched_domain *build_sched_domain(struct sched_domain_topology_leve
>  	return sd;
>  }
>  
> +/*
> + * Ensure topology masks are sane, i.e. there are no conflicts (overlaps) for
> + * any two given CPUs at this (non-NUMA) topology level.
> + */
> +static bool topology_span_sane(struct sched_domain_topology_level *tl,
> +			      const struct cpumask *cpu_map, int cpu)
> +{
> +	int i;
> +
> +	/* NUMA levels are allowed to overlap */
> +	if (tl->flags & SDTL_OVERLAP)
> +		return true;
> +
> +	/*
> +	 * Non-NUMA levels cannot partially overlap - they must be either
> +	 * completely equal or completely disjoint. Otherwise we can end up
> +	 * breaking the sched_group lists - i.e. a later get_group() pass
> +	 * breaks the linking done for an earlier span.
> +	 */
> +	for_each_cpu(i, cpu_map) {
> +		if (i == cpu)
> +			continue;
> +		/*
> +		 * We should 'and' all those masks with 'cpu_map' to exactly
> +		 * match the topology we're about to build, but that can only
> +		 * remove CPUs, which only lessens our ability to detect
> +		 * overlaps
> +		 */
> +		if (!cpumask_equal(tl->mask(cpu), tl->mask(i)) &&
> +		    cpumask_intersects(tl->mask(cpu), tl->mask(i)))
> +			return false;
> +	}
> +
> +	return true;
> +}
> +
>  /*
>   * Find the sched_domain_topology_level where all CPU capacities are visible
>   * for all CPUs.
> @@ -1975,6 +2011,9 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
>  				has_asym = true;
>  			}
>  
> +			if (WARN_ON(!topology_span_sane(tl, cpu_map, i)))
> +				goto error;
> +
>  			sd = build_sched_domain(tl, cpu_map, attr, sd, dflags, i);
>  
>  			if (tl == sched_domain_topology)

This is O(nr_cpus), but then, that function already is, so I don't see a
problem with this.

I'll take it, thanks!
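
For illustration, the "equal or disjoint" rule that topology_span_sane()
enforces can be modelled in userspace with plain bitmasks. This is only a
sketch, not kernel code: span_sane(), topology_sane(), the uint64_t masks and
the two example topologies are hypothetical names and data made up here, and
the model assumes at most 64 CPUs and ignores cpu_map and the SDTL_OVERLAP
(NUMA) exemption.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model: mask[i] is the span of CPU i at one non-NUMA
 * topology level, with bit n set when CPU n belongs to that span.
 */
static bool span_sane(const uint64_t *mask, int nr_cpus, int cpu)
{
	for (int i = 0; i < nr_cpus; i++) {
		if (i == cpu)
			continue;
		/* Partial overlap: the spans differ but still share a CPU. */
		if (mask[cpu] != mask[i] && (mask[cpu] & mask[i]))
			return false;
	}
	return true;
}

/* Run the per-CPU check for every CPU, as the build loop would. */
static bool topology_sane(const uint64_t *mask, int nr_cpus)
{
	for (int cpu = 0; cpu < nr_cpus; cpu++) {
		if (!span_sane(mask, nr_cpus, cpu))
			return false;
	}
	return true;
}

/* CPUs 0-1 and CPUs 2-3 form two disjoint spans. */
static const uint64_t good_masks[4] = { 0x3, 0x3, 0xc, 0xc };

/* CPU 1 claims CPUs 0-2 while CPU 0 claims only CPUs 0-1. */
static const uint64_t bad_masks[4] = { 0x3, 0x7, 0xc, 0xc };
```

With these example masks, topology_sane(good_masks, 4) returns true, while
topology_sane(bad_masks, 4) returns false: CPU 1's span {0,1,2} intersects
CPU 0's span {0,1} without being equal to it.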