Date:	Fri, 15 Jul 2011 10:37:35 +0200
From:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
To:	Anton Blanchard <anton@...ba.org>
Cc:	mahesh@...ux.vnet.ibm.com, linux-kernel@...r.kernel.org,
	linuxppc-dev@...ts.ozlabs.org, mingo@...e.hu,
	benh@...nel.crashing.org, torvalds@...ux-foundation.org
Subject: Re: [regression] 3.0-rc boot failure -- bisected to cd4ea6ae3982

On Fri, 2011-07-15 at 10:45 +1000, Anton Blanchard wrote:
> Hi,
> 
> > Urgh.. so those spans are generated by sched_domain_node_span(), and
> > it looks like that simply picks the 15 nearest nodes to the one we've
> > got without consideration for overlap with previously generated spans.
> 
> I do wonder if we need this extra level at all on ppc64. From memory
> SGI added it for their massive setups, but our largest setup is 32 nodes
> and breaking that down into 16 node chunks seems overkill.
> 
> I just realised we were setting NEWIDLE on our node definition and that
> was causing large amounts of rebalance work even with
> SD_NODES_PER_DOMAIN=16.
> 
> After removing it and bumping SD_NODES_PER_DOMAIN to 32, things look
> pretty good.
> 
> Perhaps we should allow an arch to override SD_NODES_PER_DOMAIN so this
> extra level is only used by SGI boxes.

We can certainly remove the whole topology layer that causes this
problem for 3.0 and try to fix up for 3.1 again.

But I was rather hoping to introduce more of those layers in the near
future: one layer per node_distance() value, so that the load-balancing
is aware of the interconnects.
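
As a rough illustration of the "layer per node_distance() value" idea, here
is a minimal user-space sketch. It is not kernel code: the 4-node distance
table, NR_NODES, unique_distances() and level_span() are all hypothetical
names and values made up for the example. It collects the distinct distance
values and, for each one, builds the set of nodes a given node can reach
within that distance.

```c
#include <stdio.h>

#define NR_NODES 4	/* hypothetical machine size */

/* Hypothetical distance table (10 = local node), not taken from real hardware. */
static const int node_dist[NR_NODES][NR_NODES] = {
	{ 10, 20, 20, 30 },
	{ 20, 10, 30, 20 },
	{ 20, 30, 10, 20 },
	{ 30, 20, 20, 10 },
};

/*
 * Collect the distinct distance values in ascending order.
 * @out must have room for NR_NODES * NR_NODES entries.
 * Returns the number of distinct values, i.e. the number of layers.
 */
static int unique_distances(int *out)
{
	int count = 0, i, j, k, m;

	for (i = 0; i < NR_NODES; i++) {
		for (j = 0; j < NR_NODES; j++) {
			int d = node_dist[i][j];

			/* Find the sorted position for d. */
			for (k = 0; k < count && out[k] < d; k++)
				;
			if (k < count && out[k] == d)
				continue;	/* already recorded */

			/* Shift larger values up and insert d. */
			for (m = count; m > k; m--)
				out[m] = out[m - 1];
			out[k] = d;
			count++;
		}
	}
	return count;
}

/* Bitmask of all nodes within @max_dist of @node: the span of one layer. */
static unsigned int level_span(int node, int max_dist)
{
	unsigned int span = 0;
	int n;

	for (n = 0; n < NR_NODES; n++)
		if (node_dist[node][n] <= max_dist)
			span |= 1u << n;
	return span;
}
```

With this table there are three layers (10, 20, 30). At distance 20, node 0
spans nodes {0,1,2} while node 1 spans {0,1,3}: the two spans share nodes
without being identical.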

Now for that I ran into the exact same problem, and at the time didn't
come up with a solution, but I think I now see a way out.

Something like the below ought to avoid the problem.. makes SGI sad
though :-)

---
 kernel/sched.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index 8fb4245..877b9f1 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -7203,7 +7203,7 @@ static struct sched_domain_topology_level default_topology[] = {
 #endif
 	{ sd_init_CPU, cpu_cpu_mask, },
 #ifdef CONFIG_NUMA
-	{ sd_init_NODE, cpu_node_mask, },
+//	{ sd_init_NODE, cpu_node_mask, },
 	{ sd_init_ALLNODES, cpu_allnodes_mask, },
 #endif
 	{ NULL, },
