linux-kernel - Re: [PATCH 0/18] sched: simplified fork, enable load average into LB and power awareness scheduling


Message-ID: <20121213113549.GB31485@liondog.tnic>
Date:	Thu, 13 Dec 2012 12:35:49 +0100
From:	Borislav Petkov <bp@...en8.de>
To:	Alex Shi <alex.shi@...el.com>
Cc:	Arjan van de Ven <arjan@...ux.intel.com>,
	Alex Shi <lkml.alex@...il.com>, rob@...dley.net,
	mingo@...hat.com, peterz@...radead.org, gregkh@...uxfoundation.org,
	andre.przywara@....com, rjw@...k.pl, paul.gortmaker@...driver.com,
	akpm@...ux-foundation.org, paulmck@...ux.vnet.ibm.com,
	linux-kernel@...r.kernel.org, pjt@...gle.com,
	vincent.guittot@...aro.org,
	Preeti U Murthy <preeti@...ux.vnet.ibm.com>
Subject: Re: [PATCH 0/18] sched: simplified fork, enable load average into LB
 and power awareness scheduling

On Thu, Dec 13, 2012 at 11:07:43AM +0800, Alex Shi wrote:
> >> now, on the other hand, if you have two threads of a process that
> >> share a bunch of data structures, and you'd spread these over 2
> >> sockets, you end up bouncing data between the two sockets a lot,
> >> running inefficient --> bad for power.
> >
> > Yeah, that should be addressed by the NUMA patches people are
> > working on right now.
>
> Yes, as to balance/powersaving policy, we can tight pack tasks
> firstly, then NUMA balancing will make memory follow us.
>
> BTW, NUMA balancing is more related with page in memory. not LLC.

Sure, let's look at the worst and best cases:

* worst case: you have memory shared by multiple threads on one node
*and* the working set doesn't fit in the LLC.

Here, if you pack threads tightly on only one node, you still suffer
from the working set evicting parts of itself from the LLC.

If you spread threads around, you still cannot avoid the LLC thrashing
because the LLC of the node containing the shared memory needs to cache
all those transactions. *In* *addition*, you get the cross-node traffic
because the shared pages are on the first node.

Major suckage.

Does it matter? I don't know. It can be decided on a case-by-case basis.
If people care about single-thread perf, they would likely want to spread
around and accept the cross-node traffic.

If they care for power, then maybe they don't want to turn on the second
socket yet.

* best case: memory follows threads and gets spread around such that
the LLC doesn't get thrashed and cross-node traffic gets avoided.

Now, you can think of all those other scenarios in between :-/
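
To make the cases above a bit more concrete, here is a deliberately crude
sketch in Python. Everything in it is made up for illustration: the
`Placement` type, the `classify` helper, the 16 MB / 8 MB numbers, and
especially the assumption that memory which follows the threads splits
evenly across the nodes in use. It is a toy model of the reasoning, not
how the scheduler or the NUMA balancing code decides anything.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    nodes_used: int        # how many nodes the threads run on
    memory_follows: bool   # True if pages get spread along with the threads


def classify(working_set_mb, llc_mb, placement):
    """Return (llc_thrash, cross_node) for a hypothetical placement."""
    # Assumption: pages stay on a single node unless memory follows threads.
    nodes_with_memory = placement.nodes_used if placement.memory_follows else 1
    # Assumption: each node holding pages caches an equal share of the set.
    per_node_share = working_set_mb / nodes_with_memory
    llc_thrash = per_node_share > llc_mb
    # Threads on nodes without local pages have to cross the interconnect.
    cross_node = placement.nodes_used > nodes_with_memory
    return llc_thrash, cross_node


scenarios = {
    "packed on one node": Placement(nodes_used=1, memory_follows=False),
    "spread, memory stays put": Placement(nodes_used=2, memory_follows=False),
    "spread, memory follows": Placement(nodes_used=2, memory_follows=True),
}

for name, placement in scenarios.items():
    thrash, cross = classify(working_set_mb=16, llc_mb=8, placement=placement)
    print(f"{name}: llc_thrash={thrash} cross_node={cross}")
```

With those made-up numbers, packing thrashes the LLC without cross-node
traffic, spreading with the memory left behind gets both problems, and
spreading with memory following gets neither. Real shared data will not
partition this neatly, which is where the in-between scenarios come from.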

Thanks.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.