Date:	Thu, 24 Jan 2013 08:47:03 +0100
From:	Mike Galbraith <bitbucket@...ine.de>
To:	Michael Wang <wangyun@...ux.vnet.ibm.com>
Cc:	linux-kernel@...r.kernel.org, mingo@...hat.com,
	peterz@...radead.org, mingo@...nel.org, a.p.zijlstra@...llo.nl
Subject: Re: [RFC PATCH 0/2] sched: simplify the select_task_rq_fair()

On Thu, 2013-01-24 at 15:15 +0800, Michael Wang wrote: 
> On 01/24/2013 02:51 PM, Mike Galbraith wrote:
> > On Thu, 2013-01-24 at 14:01 +0800, Michael Wang wrote:
> > 
> >> I've enabled WAKE flag on my box like you did, but still can't see
> >> regression, and I've just tested on a power server with 64 cpu, also
> >> failed to reproduce the issue (not compared with virgin yet, but can't
> >> see collapse).
> > 
> > I'm not surprised.  I'm seeing enough inconsistent crap to come to the
> > conclusion that stock scheduler knobs flat can't be used on a largish
> > box, they're just too preempt-happy, leading to weird crap.
> > 
> > My 2 missing nodes came back, and the very same kernel that highly
> > repeatably collapsed with 2 nodes does not with 4 nodes, and 2 nodes
> > does not collapse with only preemption knob tweaking, and that's
> > bullshit.  Virgin shows instability in the mid-range, make a tiny tweak
> > that should have little if any effect there, and that instability
> > vanishes entirely.  Test runs are not consistent enough boot to boot etc
> > etc.  Either stock knobs suck on NUMA boxen, or this box is possessed.
> 
> Mike, I wonder whether changing back to the old way makes the collapse
> go away not because there is a logical error in the new balance path,
> but just because it changes the cost of select_task_rq(). Whether the
> cost is more or less, it accidentally achieves the same effect as your
> knob tweak, and that's why the old way looks better than the new.

That's what I'm saying, it's a useless crap side-effect of a
preempt-happy kernel.  Results with these knobs are just not stable.
Results go wildly unstable with 2 nodes vs 4 in this box, but can be
stabilized in all cases with preemption knob adjustment... or phase of
moon might make them appear stable... or not.

-Mike
