Message-ID: <20160512113359.GO3192@twins.programming.kicks-ass.net>
Date: Thu, 12 May 2016 13:33:59 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Michael Neuling <mikey@...ling.org>
Cc: Matt Fleming <matt@...eblueprint.co.uk>, mingo@...nel.org,
linux-kernel@...r.kernel.org, clm@...com, mgalbraith@...e.de,
tglx@...utronix.de, fweisbec@...il.com, srikar@...ux.vnet.ibm.com,
anton@...ba.org, oliver <oohall@...il.com>,
"Shreyas B. Prabhu" <shreyas@...ux.vnet.ibm.com>
Subject: Re: [RFC][PATCH 4/7] sched: Replace sd_busy/nr_busy_cpus with
sched_domain_shared
On Thu, May 12, 2016 at 09:07:52PM +1000, Michael Neuling wrote:
> On Thu, 2016-05-12 at 07:07 +0200, Peter Zijlstra wrote:
> > But as per the above, Power7 and Power8 have explicit logic to share the
> > per-core L3 with the other cores.
> >
> > How effective is that? From some of the slides/documents I've looked at
> > the L3s are connected with a high-speed fabric. Suggesting that the
> > cross-core sharing should be fairly efficient.
>
> I'm not sure. I thought it was mostly private but if another core was
> sleeping or not experiencing much cache pressure, another core could use it
> for some things. But I'm fuzzy on the exact properties, sorry.
Right; I'm going by bits and pieces found on the tubes, so I'm just
guessing ;-)
But it sounds like these L3s are nowhere close to what Intel does with
their L3, where each core has an L3 slice, and slices are connected on a
ring to form a unified/shared cache across all cores.
http://www.realworldtech.com/sandy-bridge/8/
> > In which case it would make sense to treat/model the combined L3 as a
> > single large LLC covering all cores.
>
> Are you thinking it would be much cheaper to migrate a task to another core
> inside this chip, than to off chip?
Basically; and if so, whether it's cheap enough to shoot a task to an idle
core to avoid queueing. Assuming there still is some cache residency on
the old core, the inter-core fill should be much cheaper than fetching
it off package (either remote cache or DRAM).
Or at least; so goes my reasoning based on my Google results.
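To make the trade-off above concrete, here is a rough sketch of the comparison, not actual scheduler code. Everything in it is an assumption made up for illustration: the struct refill_cost type, the refill_ns() and should_migrate() helpers, and whatever per-line latencies you feed in. It only encodes the argument that a migration pays off when refilling the task's cache footprint on the target core costs less than the time it would otherwise spend queued.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-cache-line refill costs in nanoseconds (illustrative only). */
struct refill_cost {
	uint64_t same_llc_ns;    /* line pulled from another core's L3 on the same chip */
	uint64_t off_package_ns; /* line pulled from a remote cache or DRAM */
};

/* Estimated cost of refilling 'resident_lines' cache lines after a migration. */
static uint64_t refill_ns(uint64_t resident_lines, uint64_t per_line_ns)
{
	return resident_lines * per_line_ns;
}

/*
 * Migrate to an idle core only if the expected wait on the current
 * runqueue exceeds the estimated cost of refilling the cache on the
 * target core.
 */
static bool should_migrate(uint64_t queue_wait_ns, uint64_t resident_lines,
			   const struct refill_cost *c, bool target_shares_llc)
{
	uint64_t per_line = target_shares_llc ? c->same_llc_ns
					      : c->off_package_ns;

	return refill_ns(resident_lines, per_line) < queue_wait_ns;
}
```

With made-up costs of 30ns per line inside the chip and 120ns off package, a task with 1000 resident lines facing a 100us queue wait would be moved to an idle core that shares the LLC, but not to one off package. Whether the real Power7/Power8 numbers look anything like that is exactly the open question here.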