linux-kernel - Re: [PATCH 0/18] sched: simplified fork, enable load average into LB and power awareness scheduling

Message-ID: <50CA8746.5050604@intel.com>
Date:	Fri, 14 Dec 2012 09:56:22 +0800
From:	Alex Shi <alex.shi@...el.com>
To:	Borislav Petkov <bp@...en8.de>,
	Arjan van de Ven <arjan@...ux.intel.com>,
	Alex Shi <lkml.alex@...il.com>, rob@...dley.net,
	mingo@...hat.com, peterz@...radead.org, gregkh@...uxfoundation.org,
	andre.przywara@....com, rjw@...k.pl, paul.gortmaker@...driver.com,
	akpm@...ux-foundation.org, paulmck@...ux.vnet.ibm.com,
	linux-kernel@...r.kernel.org, pjt@...gle.com,
	vincent.guittot@...aro.org,
	Preeti U Murthy <preeti@...ux.vnet.ibm.com>
Subject: Re: [PATCH 0/18] sched: simplified fork, enable load average into
 LB and power awareness scheduling

On 12/13/2012 07:35 PM, Borislav Petkov wrote:
> On Thu, Dec 13, 2012 at 11:07:43AM +0800, Alex Shi wrote:
>>>> now, on the other hand, if you have two threads of a process that
>>>> share a bunch of data structures, and you'd spread these over 2
>>>> sockets, you end up bouncing data between the two sockets a lot,
>>>> running inefficient --> bad for power.
>>>
>>> Yeah, that should be addressed by the NUMA patches people are
>>> working on right now.
>>
>> Yes, as to balance/powersaving policy, we can tight pack tasks
>> firstly, then NUMA balancing will make memory follow us.
>>
>> BTW, NUMA balancing is more related with page in memory. not LLC.
> 
> Sure, let's look at the worst and best cases:
> 
> * worst case: you have memory shared by multiple threads on one node
> *and* working set doesn't fit in LLC.
> 
> Here, if you pack threads tightly only on one node, you still suffer the
> working set kicking out parts of itself out of LLC.
> 
> If you spread threads around, you still cannot avoid the LLC thrashing
> because the LLC of the node containing the shared memory needs to cache
> all those transactions. *In* *addition*, you get the cross-node traffic
> because the shared pages are on the first node.
> 
> Major suckage.
> 
> Does it matter? I don't know. It can be decided on a case-by-case basis.
> If people care about singlethread perf, they would likely want to spread
> around and buy in the cross-node traffic.
> 
> If they care for power, then maybe they don't want to turn on the second
> socket yet.
> 
> * the optimal case is where memory follows threads and gets spread
> around such that LLC doesn't get thrashed and cross-node traffic gets
> avoided.
> 
> Now, you can think of all those other scenarios in between :-/

You are right. Thanks for the explanation! :)

Actually, what I meant to say is that NUMA balancing targets pages in
different nodes' memory, but of course it may also improve LLC performance.
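
To make the cases above concrete, here is a rough toy model of them. It
is only a sketch under loud assumptions: the function name `classify`,
its parameters, and the even split of the working set across nodes are
all hypothetical and exist only for illustration. Nothing here is
scheduler code or part of the patch set.

```python
def classify(working_set_mb, llc_mb_per_node, spread, memory_follows, nodes=2):
    """Toy model of the packing/spreading cases discussed in this thread.

    Returns (llc_thrash, cross_node_traffic). All names and the cost
    model are illustrative assumptions, not kernel behaviour.
    """
    if not spread:
        # Threads packed on one node: no cross-node traffic, but the
        # whole working set competes for that node's LLC.
        return (working_set_mb > llc_mb_per_node, False)

    if not memory_follows:
        # Threads spread, shared pages stay on the first node: the home
        # node's LLC still sees the whole working set, and remote
        # threads add cross-node traffic on top.
        return (working_set_mb > llc_mb_per_node, True)

    # Threads spread and memory follows them (assumed to split evenly):
    # each node's LLC only has to hold its share.
    per_node_mb = working_set_mb / nodes
    return (per_node_mb > llc_mb_per_node, False)


if __name__ == "__main__":
    for spread, follows in [(False, False), (True, False), (True, True)]:
        print(spread, follows, classify(40, 20, spread, follows))
```

With a 40 MB working set and 20 MB of LLC per node, the packed case
thrashes the LLC only, the spread case without memory migration
thrashes and adds cross-node traffic, and the spread case with memory
following the threads avoids both in this model.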
> 
> Thanks.
> 
