Date:   Tue, 25 Apr 2017 14:17:16 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Lauro Venancio <lvenanci@...hat.com>
Cc:     lwang@...hat.com, riel@...hat.com, Mike Galbraith <efault@....de>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...nel.org>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 4/4] sched/topology: the group balance cpu must be a cpu
 where the group is installed

On Mon, Apr 24, 2017 at 12:11:59PM -0300, Lauro Venancio wrote:
> On 04/24/2017 10:03 AM, Peter Zijlstra wrote:
> > On Thu, Apr 20, 2017 at 04:51:43PM -0300, Lauro Ramos Venancio wrote:
> >
> >> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> >> index e77c93a..694e799 100644
> >> --- a/kernel/sched/topology.c
> >> +++ b/kernel/sched/topology.c
> >> @@ -505,7 +507,11 @@ static void build_group_mask(struct sched_domain *sd, struct sched_group *sg)
> >>  
> >>  	for_each_cpu(i, sg_span) {
> >>  		sibling = *per_cpu_ptr(sdd->sd, i);

> >> -		if (!cpumask_test_cpu(i, sched_domain_span(sibling)))

> >> +		if (!cpumask_equal(sg_span, sched_group_cpus(sibling->groups)))
> >>  			continue;

Hmm, _this_ is what requires us to move the mask building into a whole
separate iteration: it looks at sibling->groups, which don't exist yet
while the groups themselves are being built. The old check against
sched_domain_span(sibling) was fine in that place, because the domains
are already fully constructed by then.

So moving the code around wasn't the primary fix; this check is.

Given that the previous patches established sched_group_cpus(sd->groups) ==
sched_domain_span(sd->child) (when the child exists), could we not
simplify this like the below?
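
(Aside, before the patch: roughly why that equality holds. When
build_overlap_sched_groups() creates a group, it copies the group span
straight from the sibling's child domain span; the snippet below is
paraphrased from memory, not quoted verbatim from the tree:)

	sg_span = sched_group_cpus(sg);
	if (sibling->child)
		/* the group's span is exactly the child domain's span */
		cpumask_copy(sg_span, sched_domain_span(sibling->child));
	else
		cpumask_set_cpu(i, sg_span);

Since a domain's own first group is built this way on its own CPU,
sched_group_cpus(sd->groups) and sched_domain_span(sd->child) end up
being the same mask whenever the child exists.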

---
Subject: sched/topology: Fix overlapping sched_group_mask
From: Peter Zijlstra <peterz@...radead.org>
Date: Tue Apr 25 14:00:49 CEST 2017

The point of sched_group_mask is to select those CPUs from
sched_group_cpus that can actually arrive at this balance domain.

The current code gets it wrong, as can be readily demonstrated with a
topology like:

  node   0   1   2   3
    0:  10  20  30  20
    1:  20  10  20  30
    2:  30  20  10  20
    3:  20  30  20  10

Where (for example) domain 1 on CPU1 ends up with a mask that includes
CPU0:

  [] CPU1 attaching sched-domain:
  []  domain 0: span 0-2 level NUMA
  []   groups: 1 (mask: 1), 2, 0
  []   domain 1: span 0-3 level NUMA
  []    groups: 0-2 (mask: 0-2) (cpu_capacity: 3072), 0,2-3 (cpu_capacity: 3072)

This causes group_balance_cpu() to compute the wrong CPU, and
consequently should_we_balance() will terminate early, resulting in
missed load-balance opportunities.

The fixed topology looks like:

  [] CPU1 attaching sched-domain:
  []  domain 0: span 0-2 level NUMA
  []   groups: 1 (mask: 1), 2, 0
  []   domain 1: span 0-3 level NUMA
  []    groups: 0-2 (mask: 1) (cpu_capacity: 3072), 0,2-3 (cpu_capacity: 3072)

Debugged-by: Lauro Ramos Venancio <lvenanci@...hat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
---
 kernel/sched/topology.c |   11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -495,6 +495,9 @@ enum s_alloc {
 /*
  * Build an iteration mask that can exclude certain CPUs from the upwards
  * domain traversal.
+ *
+ * Only CPUs that can arrive at this group should be considered to continue
+ * balancing.
  */
 static void build_group_mask(struct sched_domain *sd, struct sched_group *sg)
 {
@@ -505,7 +508,13 @@ static void build_group_mask(struct sche
 
 	for_each_cpu(i, sg_span) {
 		sibling = *per_cpu_ptr(sdd->sd, i);
-		if (!cpumask_test_cpu(i, sched_domain_span(sibling)))
+
+		/* overlap should have children; except for FORCE_SD_OVERLAP */
+		if (WARN_ON_ONCE(!sibling->child))
+			continue;
+
+		/* If we would not end up here, we can't continue from here */
+		if (!cpumask_equal(sg_span, sched_domain_span(sibling->child)))
 			continue;
 
 		cpumask_set_cpu(i, sched_group_mask(sg));
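
For context on the failure mode described in the changelog: the balance
CPU is the first CPU that is both in the group and in its mask, and
should_we_balance() only lets that CPU (or the first idle CPU the mask
allows) actually run the balance pass for this domain. Roughly, again
paraphrased from memory rather than quoted verbatim:

	int group_balance_cpu(struct sched_group *sg)
	{
		/* first CPU that is both in the group and in its balance mask */
		return cpumask_first_and(sched_group_cpus(sg), sched_group_mask(sg));
	}

	static int should_we_balance(struct lb_env *env)
	{
		struct sched_group *sg = env->sd->groups;
		int cpu, balance_cpu = -1;

		/* newly idle balancing is always allowed */
		if (env->idle == CPU_NEWLY_IDLE)
			return 1;

		/* prefer the first idle CPU that the group mask allows */
		for_each_cpu_and(cpu, sched_group_cpus(sg), env->cpus) {
			if (!cpumask_test_cpu(cpu, sched_group_mask(sg)) || !idle_cpu(cpu))
				continue;

			balance_cpu = cpu;
			break;
		}

		if (balance_cpu == -1)
			balance_cpu = group_balance_cpu(sg);

		/* only the chosen CPU gets to balance this domain */
		return balance_cpu == env->dst_cpu;
	}

With the broken mask from the example above, group_balance_cpu() picks
CPU0 for CPU1's domain 1 group, so CPU1 never passes the dst_cpu check
there and the balance opportunity is lost.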
