linux-kernel - Re: balance storm


Message-ID: <53848B81.4090709@huawei.com>
Date:	Tue, 27 May 2014 20:56:33 +0800
From:	Libo Chen <libo.chen@...wei.com>
To:	Mike Galbraith <umgwanakikbuti@...il.com>,
	Peter Zijlstra <peterz@...radead.org>
CC:	<tglx@...utronix.de>, <mingo@...e.hu>,
	LKML <linux-kernel@...r.kernel.org>,
	Greg KH <gregkh@...uxfoundation.org>,
	"Li Zefan" <lizefan@...wei.com>
Subject: Re: balance storm

On 2014/5/27 18:55, Mike Galbraith wrote:
> On Tue, 2014-05-27 at 12:43 +0200, Peter Zijlstra wrote: 
>> On Tue, May 27, 2014 at 12:05:33PM +0200, Mike Galbraith wrote:
>>> On Tue, 2014-05-27 at 11:48 +0200, Peter Zijlstra wrote:
>>>
>>>> So I suppose this is due to the select_idle_sibling() nonsense again,
>>>> where we assume L3 is a fair compromise between cheap enough and
>>>> effective enough.
>>>
>>> Nodz.
>>>
>>>> Of course, Intel keeps growing the cpu count covered by L3 to ridiculous
>>>> sizes, 8 cores is nowhere near their top silly, which shifts the
>>>> balance, and there's always going to be pathological cases (like the
>>>> proposed workload) where it's just always going to suck eggs.
>>>
>>> Test is as pathological as it gets.  15 core + SMT wouldn't be pretty.
>>
>> So one thing we could maybe do is measure the cost of
>> select_idle_sibling(), just like we do for idle_balance() and compare
>> this against the tasks avg runtime.
>>
>> We can go all crazy and do reduced searches; like test every n-th cpu in
>> the mask, or make it statistical and do a full search every n wakeups.
>>
>> Not sure what's a good approach. But L3 spanning more and more CPUs is
>> not something that's going to get cured anytime soon I'm afraid.
>>
>> Not to mention bloody SMT which makes the whole mess worse.
> 
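
To make the two ideas above concrete, here is a rough user-space sketch,
not kernel code. It assumes a plain bool array in place of the real cpumask,
and every name in it (search_stats, should_full_search, find_idle_strided,
full_every) is hypothetical. It tracks a running average of the scan cost,
only does the full scan when that cost is below the task's average runtime
or on every n-th wakeup, and otherwise probes every n-th CPU.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping; none of these names exist in the kernel. */
struct search_stats {
	uint64_t avg_search_cost_ns;	/* running average cost of a full scan */
	unsigned int wakeups;		/* throttled wakeups since last full scan */
};

/* Running average with weight 1/8: avg += (sample - avg) / 8. */
static void update_avg(uint64_t *avg, uint64_t sample)
{
	int64_t diff = (int64_t)sample - (int64_t)*avg;

	*avg = (uint64_t)((int64_t)*avg + diff / 8);
}

/*
 * Do the full scan when it is cheaper than the task's average runtime.
 * Otherwise allow one full scan every 'full_every' wakeups.
 */
static bool should_full_search(struct search_stats *st,
			       uint64_t task_avg_runtime_ns,
			       unsigned int full_every)
{
	if (st->avg_search_cost_ns < task_avg_runtime_ns)
		return true;

	if (++st->wakeups >= full_every) {
		st->wakeups = 0;
		return true;
	}
	return false;
}

/*
 * Reduced search: probe every 'stride'-th CPU starting at 'start',
 * wrapping around. Returns an idle CPU or -1 if none was found.
 * stride == 1 degenerates to the full scan.
 */
static int find_idle_strided(const bool *idle, int nr_cpus,
			     int start, int stride)
{
	for (int i = 0; i < nr_cpus; i += stride) {
		int cpu = (start + i) % nr_cpus;

		if (idle[cpu])
			return cpu;
	}
	return -1;
}
```

The caller would time each full scan and feed the result to update_avg(), so
a task with a very short runtime ends up on the strided path most of the
time. Whether the extra bookkeeping is cheap enough is exactly the open
question.
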
> I think we should keep it dirt simple and above all dirt cheap.  The per
> task migration cap per unit time should meet that bill, limit the damage
> potential, while also limiting the good, but that's tough.  I don't see

Agreed.

> any way to make it perfect, so I'll settle for good enough.
> 
> -Mike
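
For reference, a minimal sketch of what a per-task migration cap per unit
time could look like, written as plain user-space C. It is only one possible
reading of the idea: a fixed-window counter, with hypothetical names
(migration_limiter, migration_allowed) and with the window length and cap
passed in as parameters rather than taken from any real tunable.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-task state; not an existing kernel structure. */
struct migration_limiter {
	uint64_t window_start_ns;	/* start of the current window */
	unsigned int count;		/* migrations granted in this window */
};

/*
 * Allow at most 'max_per_window' migrations per 'window_ns'.
 * Returns true and charges the budget if the migration may proceed,
 * false if the task should stay where it is.
 */
static bool migration_allowed(struct migration_limiter *ml, uint64_t now_ns,
			      uint64_t window_ns, unsigned int max_per_window)
{
	/* Start a fresh window once the old one has expired. */
	if (now_ns - ml->window_start_ns >= window_ns) {
		ml->window_start_ns = now_ns;
		ml->count = 0;
	}

	if (ml->count >= max_per_window)
		return false;

	ml->count++;
	return true;
}
```

The check is one subtraction and two compares per wakeup, so it stays cheap.
It also blocks useful migrations once the budget is spent, which is the
"limiting the good" trade-off mentioned above.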

