linux-kernel - Re: balance storm

Message-ID: <20140527094802.GN30445@twins.programming.kicks-ass.net>
Date:	Tue, 27 May 2014 11:48:02 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Libo Chen <libo.chen@...wei.com>
Cc:	Mike Galbraith <umgwanakikbuti@...il.com>, tglx@...utronix.de,
	mingo@...e.hu, LKML <linux-kernel@...r.kernel.org>,
	Greg KH <gregkh@...uxfoundation.org>,
	Li Zefan <lizefan@...wei.com>
Subject: Re: balance storm

On Mon, May 26, 2014 at 07:49:10PM +0800, Libo Chen wrote:
> On 2014/5/26 15:56, Mike Galbraith wrote:
> > On Mon, 2014-05-26 at 11:04 +0800, Libo Chen wrote: 
> >> hi,
> >>     my box has 16 cpus (E5-2658, 8 cores, 2 threads per core). I ran a test on
> >> 3.4.24 stable: start 50 copies of the same process, each one this simple program:
> >>
> >> 	#include <unistd.h>
> >>
> >> 	int main(void)
> >> 	{
> >> 		for (;;) {
> >> 			unsigned int i = 0;
> >>
> >> 			while (i < 100)
> >> 				i++;
> >>
> >> 			usleep(100);
> >> 		}
> >>
> >> 		return 0;
> >> 	}
> >>
> >> the result: a process uses 15% cpu time, and perf shows 700,000 ("70w") migrations in 5 seconds.
> > 
> > My 8 socket 64 core DL980 running 256 copies (3.14-rt5) munches ~4%/copy
> > per top, and does roughly 1 sh*tload migrations, nano-work loop or not.
> > Turn SD_SHARE_PKG_RESOURCES off at MC (not a noop here), and consumption
> > drops to ~2%/copy, and migrations ('course) mostly go away.

So: 

1) what kind of weird ass workload is that? Why are you waking up so
often to do no work?

2) turning SD_SHARE_PKG_RESOURCES on/off is a horrid hack whichever way
around you turn it.

So I suppose this is due to the select_idle_sibling() nonsense again,
where we assume L3 is a fair compromise between cheap enough and
effective enough.
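
To make that trade-off concrete, here is a toy model of the wakeup-placement
idea. It is not the kernel's select_idle_sibling(); the name pick_wake_cpu()
and the flat idle[] / llc_id[] arrays are made up for illustration. It only
shows the shape of the heuristic: keep the target if it is idle, otherwise
take any idle cpu that shares the target's last-level cache.

```c
#include <stdbool.h>

/*
 * Toy model, not kernel code: choose a CPU for a waking task.
 * idle[i]   - whether CPU i is currently idle (hypothetical input)
 * llc_id[i] - which last-level cache CPU i sits under (hypothetical input)
 */
int pick_wake_cpu(int target, const bool *idle, const int *llc_id, int ncpus)
{
	int cpu;

	/* Cheapest case: the target itself is idle, no migration needed. */
	if (idle[target])
		return target;

	/* Otherwise scan every CPU that shares the target's cache. */
	for (cpu = 0; cpu < ncpus; cpu++) {
		if (llc_id[cpu] == llc_id[target] && idle[cpu])
			return cpu;	/* placing the task here is a migration */
	}

	/* Nothing idle under this cache: stay put. */
	return target;
}
```

In this model, the more cpus sit under one cache, the longer the scan gets and
the more often a busy target has an idle sibling to hand the task to.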

Of course, Intel keeps growing the cpu count covered by L3 to ridiculous
sizes; 8 cores is nowhere near their top silly, which shifts the
balance. And there are always going to be pathological cases (like the
proposed workload) where it's just always going to suck eggs.

Also, when running 50 such things on a 16 cpu machine, you get roughly 3
per cpu. Since their runtime is stupid low, I would expect a wakeup to
pretty much always hit an idle cpu, which in turn should inhibit the migration.
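
The back-of-envelope numbers can be written down as below. This is only a
sketch: the helper names are made up, and the per-iteration run time is a
guess (the 100-iteration loop was not measured), with wakeup and scheduling
overhead ignored entirely.

```c
/* Average number of tasks per CPU if they spread evenly. */
double tasks_per_cpu(int ntasks, int ncpus)
{
	return (double)ntasks / ncpus;
}

/*
 * Rough fraction of time one CPU is busy, assuming each task runs for
 * run_us and then sleeps for sleep_us. run_us is an assumed value, not
 * a measurement, and wakeup/scheduling overhead is ignored.
 */
double cpu_busy_fraction(double per_cpu, double run_us, double sleep_us)
{
	return per_cpu * run_us / (run_us + sleep_us);
}
```

With 50 tasks on 16 cpus, tasks_per_cpu(50, 16) is 3.125. Assuming a 1 us
burst between 100 us sleeps, cpu_busy_fraction(3.125, 1.0, 100.0) comes out
at about 0.03, so under that assumption a cpu is idle the vast majority of
the time.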

Then again, maybe the timer slack is causing you grief, resulting in all
3 being woken at the same time, instead of having them staggered.
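
One way to check the timer slack theory would be to shrink the slack in the
test program before it enters its loop and see whether the migration count
changes. A minimal sketch using prctl(PR_SET_TIMERSLACK), which is Linux
specific; the helper names are made up, and whether this changes anything for
the workload above is exactly what the experiment would have to show.

```c
#include <sys/prctl.h>

/*
 * Set the calling thread's timer slack in nanoseconds.
 * Returns 0 on success, -1 on error. Per prctl(2), a value of 0
 * resets the slack to the thread's default.
 */
int set_timer_slack_ns(unsigned long ns)
{
	return prctl(PR_SET_TIMERSLACK, ns, 0, 0, 0);
}

/* Return the calling thread's current timer slack in nanoseconds. */
int get_timer_slack_ns(void)
{
	return prctl(PR_GET_TIMERSLACK, 0, 0, 0, 0);
}
```

Calling set_timer_slack_ns(1) at the top of main() asks for the smallest
slack for an ordinary (non-realtime) thread, so the usleep(100) expiries are
less likely to be coalesced.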

In any case, I'm not sure what the 'regression' report is against, as
there's only a single kernel version mentioned: 3.4, and that's almost a
dinosaur.
