linux-kernel - Re: [PATCH] cpu-topology: warn if NUMA configurations conflicts with lower layer

Message-ID: <1a8f7963-97e9-62cc-12d2-39f816dfaf67@arm.com>
Date:   Sat, 11 Jan 2020 20:56:28 +0000
From:   Valentin Schneider <valentin.schneider@....com>
To:     "Zengtao (B)" <prime.zeng@...ilicon.com>,
        Morten Rasmussen <morten.rasmussen@....com>
Cc:     Sudeep Holla <sudeep.holla@....com>,
        Linuxarm <linuxarm@...wei.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "Rafael J. Wysocki" <rafael@...nel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] cpu-topology: warn if NUMA configurations conflicts with
 lower layer

On 09/01/2020 12:58, Zengtao (B) wrote:
>> IIUC, the problem is that virt can set up a broken topology in some
>> cases where MPIDR doesn't line up correctly with the defined NUMA
>> nodes.
>>
>> We could argue that it is a qemu/virt problem, but it would be nice if
>> we could at least detect it. The proposed patch isn't really the right
>> solution as it warns on some valid topologies as Sudeep already pointed
>> out.
>>
>> It sounds more like we need a mask subset check in the sched_domain
>> building code, if there isn't already one?
> 
> Currently no, it's a bit complex to do the check in the sched_domain building code,
> I need to take a think of that.
> Suggestion welcomed.
> 

Checking the sched_domain spans themselves for partial overlaps should look
something like the following (completely untested):

---8<---
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 6ec1e595b1d4..96128d12ec23 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1879,6 +1879,43 @@ static struct sched_domain *build_sched_domain(struct sched_domain_topology_leve
 	return sd;
 }
 
+/* Ensure topology masks are sane; non-NUMA spans shouldn't overlap */
+static int validate_topology_spans(const struct cpumask *cpu_map)
+{
+	struct sched_domain_topology_level *tl;
+	int i, j;
+
+	for_each_sd_topology(tl) {
+		/* NUMA levels are allowed to overlap */
+		if (tl->flags & SDTL_OVERLAP)
+			break;
+
+		/*
+		 * Non-NUMA levels cannot partially overlap - they must be
+		 * either equal or wholly disjoint. Otherwise we can end up
+		 * breaking the sched_group lists - i.e. a later get_group()
+		 * pass breaks the linking done for an earlier span.
+		 */
+		for_each_cpu(i, cpu_map) {
+			for_each_cpu(j, cpu_map) {
+				if (i == j)
+					continue;
+				/*
+				 * We should 'and' all those masks with 'cpu_map'
+				 * to exactly match the topology we're about to
+				 * build, but that can only remove CPUs, which
+				 * only lessens our ability to detect overlaps
+				 */
+				if (!cpumask_equal(tl->mask(i), tl->mask(j)) &&
+				    cpumask_intersects(tl->mask(i), tl->mask(j)))
+					return -1;
+			}
+		}
+	}
+
+	return 0;
+}
+
 /*
  * Find the sched_domain_topology_level where all CPU capacities are visible
  * for all CPUs.
@@ -1953,7 +1990,8 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
 	struct sched_domain_topology_level *tl_asym;
 	bool has_asym = false;
 
-	if (WARN_ON(cpumask_empty(cpu_map)))
+	if (WARN_ON(cpumask_empty(cpu_map)) ||
+	    WARN_ON(validate_topology_spans(cpu_map)))
 		goto error;
 
 	alloc_state = __visit_domain_allocation_hell(&d, cpu_map);
--->8---
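
To make the "equal or wholly disjoint" rule concrete, here is a small userspace
sketch of the same pairwise check. It is only an illustration under stated
assumptions: plain 64-bit masks stand in for cpumasks, and the names span_t,
spans_partially_overlap() and validate_spans() are hypothetical, not kernel API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: bit n set means CPU n is in the span. */
typedef uint64_t span_t;

/* Two spans conflict when they share a CPU without being identical. */
static int spans_partially_overlap(span_t a, span_t b)
{
	return a != b && (a & b) != 0;
}

/*
 * span[i] is the span of CPU i at a single non-NUMA topology level.
 * Returns 0 if every pair of spans is equal or disjoint, -1 otherwise.
 */
static int validate_spans(const span_t *span, size_t nr_cpus)
{
	size_t i, j;

	/* The relation is symmetric, so checking each unordered pair once is enough. */
	for (i = 0; i < nr_cpus; i++)
		for (j = i + 1; j < nr_cpus; j++)
			if (spans_partially_overlap(span[i], span[j]))
				return -1;

	return 0;
}
```

With four CPUs whose spans are { 0x3, 0x3, 0xc, 0xc } the check passes; if
CPU1's span becomes 0x7 it overlaps CPU0's span without being equal to it, and
the check fails.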

Alternatively, the assertion on the sched_group linking I suggested earlier
in the thread should suffice, since it should trigger whenever we have
overlapping non-NUMA sched domains.

Since you have a setup where you can reproduce the issue, could you please
give either (ideally both!) a try? Thanks.

> Thanks 
> Zengtao 
> 
>>
>> Morten