Message-ID: <1401188155.5134.125.camel@marge.simpson.net>
Date:	Tue, 27 May 2014 12:55:55 +0200
From:	Mike Galbraith <umgwanakikbuti@...il.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Libo Chen <libo.chen@...wei.com>, tglx@...utronix.de,
	mingo@...e.hu, LKML <linux-kernel@...r.kernel.org>,
	Greg KH <gregkh@...uxfoundation.org>,
	Li Zefan <lizefan@...wei.com>
Subject: Re: balance storm

On Tue, 2014-05-27 at 12:43 +0200, Peter Zijlstra wrote: 
> On Tue, May 27, 2014 at 12:05:33PM +0200, Mike Galbraith wrote:
> > On Tue, 2014-05-27 at 11:48 +0200, Peter Zijlstra wrote:
> > 
> > > So I suppose this is due to the select_idle_sibling() nonsense again,
> > > where we assume L3 is a fair compromise between cheap enough and
> > > effective enough.
> > 
> > Nodz.
> > 
> > > Of course, Intel keeps growing the cpu count covered by L3 to ridiculous
> > > sizes, 8 cores isn't anywhere near their top silly, which shifts the
> > > balance, and there's always going to be pathological cases (like the
> > > proposed workload) where it's just always going to suck eggs.
> > 
> > Test is as pathological as it gets.  15 core + SMT wouldn't be pretty.
> 
> So one thing we could maybe do is measure the cost of
> select_idle_sibling(), just like we do for idle_balance() and compare
> this against the tasks avg runtime.
> 
> We can go all crazy and do reduced searches; like test every n-th cpu in
> the mask, or make it statistical and do a full search every n wakeups.
> 
> Not sure what's a good approach. But L3 spanning more and more CPUs is
> not something that's going to get cured anytime soon I'm afraid.
> 
> Not to mention bloody SMT which makes the whole mess worse.
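
[Editor's sketch] Two of the ideas quoted above, gating the search on its
measured cost relative to the task's average runtime and testing only every
n-th CPU in the mask, can be illustrated in plain userspace C.  This is not
kernel code and not anything proposed in the thread: the function names, the
bool-array "mask", and the 1/8 cost ratio are all assumptions made up for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical gate: only search when the average cost of a search is
 * small compared with the task's average runtime.  The 1/8 ratio is an
 * arbitrary illustrative threshold, not a tuned value.
 */
static bool search_worthwhile(uint64_t avg_search_cost_ns,
                              uint64_t avg_task_runtime_ns)
{
    return avg_search_cost_ns < avg_task_runtime_ns / 8;
}

/*
 * Hypothetical reduced search: starting at `start`, test every
 * `stride`-th CPU (wrapping around) instead of the whole mask.
 * idle[i] stands in for an "is CPU i idle?" check.
 * Returns the first idle CPU found, or -1 if none of the sampled
 * CPUs is idle.
 */
static int find_idle_strided(const bool *idle, size_t ncpus,
                             size_t start, size_t stride)
{
    assert(idle != NULL && stride > 0);

    if (ncpus == 0)
        return -1;

    for (size_t i = 0; i < ncpus; i += stride) {
        size_t cpu = (start + i) % ncpus;

        if (idle[cpu])
            return (int)cpu;
    }
    return -1;
}
```

With a stride of n the scan touches roughly 1/n of the CPUs, so it gets
cheaper as L3 spans more of them, at the price of missing idle CPUs that
fall between samples.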

I think we should keep it dirt simple and above all dirt cheap.  A per-task
cap on migrations per unit time should fit that bill: it limits the damage
potential while also limiting the good, but that's tough.  I don't see
any way to make it perfect, so I'll settle for good enough.
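
[Editor's sketch] The message does not say how such a cap would be
implemented.  One cheap way to express "at most N migrations per task per
time window" is a fixed-window counter, sketched below in userspace C.  The
struct, the function name, the 1 ms window and the limit of 2 are all
hypothetical values chosen for illustration, not taken from the thread or
from the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative values only: at most 2 migrations per 1 ms window. */
#define MIGRATION_WINDOW_NS      1000000ULL
#define MIGRATION_MAX_PER_WINDOW 2U

/* Hypothetical per-task state: start of the current window and the
 * number of migrations already granted inside it. */
struct migration_cap {
    uint64_t window_start_ns;
    unsigned int count;
};

/*
 * Returns true and charges one migration if the task is still under
 * its cap for the current window; returns false otherwise.  The cost
 * is one subtraction, two compares and at most two stores.
 */
static bool migration_allowed(struct migration_cap *cap, uint64_t now_ns)
{
    assert(cap != NULL);

    /* Open a fresh window once the old one has expired. */
    if (now_ns - cap->window_start_ns >= MIGRATION_WINDOW_NS) {
        cap->window_start_ns = now_ns;
        cap->count = 0;
    }

    if (cap->count >= MIGRATION_MAX_PER_WINDOW)
        return false;

    cap->count++;
    return true;
}
```

A fixed window like this bounds how often a pathological task can bounce
around, and it equally bounds useful migrations, which is the trade-off
described above.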

-Mike

