Message-ID: <20160705094447.GA20478@leverpostej>
Date: Tue, 5 Jul 2016 10:44:48 +0100
From: Mark Rutland <mark.rutland@....com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: linux-kernel@...r.kernel.org,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Ingo Molnar <mingo@...hat.com>,
Will Deacon <will.deacon@....com>
Subject: Re: [PATCH] perf: fix pmu::filter_match for SW-led groups

On Tue, Jul 05, 2016 at 10:35:26AM +0200, Peter Zijlstra wrote:
> On Mon, Jul 04, 2016 at 07:05:35PM +0100, Mark Rutland wrote:
> > On Sat, Jul 02, 2016 at 06:40:25PM +0200, Peter Zijlstra wrote:
> > > One of the ways I was looking at getting that done is a virtual runtime
> > > scheduler (just like cfs). The tricky point is merging two virtual
> > > runtime trees. But I think that should be doable if we sort the trees on
> > > lag.
> > >
> > > In any case, the relevance to your question is that once we have a tree,
> > > we can play games with order; that is, if we first order on PMU-id and
> > > only second on lag, we get whole subtree clusters specific for a PMU.
> >
> > Hmm... I'm not sure how that helps in this case. Wouldn't we still need
> > to walk the sibling list to get the HW PMU-id in the case of a SW group
> > leader?
>
> Since there is a hardware event in the group, it must be part of the
> hardware pmu list/tree and would thus end up classified (and sorted) by
> that (hardware) PMU-id.
>
> > For the heterogeneous case we'd need a different sort order per-cpu
> > (well, per microarchitecture), which sounds like we're going to have to
> > fully sort the events every time they move between CPUs. :/
>
> Confused, I thought that for the HG case you had multiple events, one
> for each PMU. If we classify these events differently we'd simply use a
> different subtree depending on which CPU the task lands.

My bad; I assumed that for both PMUs we'd start at the root, and thus
would need to re-sort in order to get the current CPU's PMU ordered
first, much like currently with rotation.

I guess I'm having difficulty figuring out the structure of that tree.
If we can easily/cheaply find the relevant sub-tree then the above isn't
an issue.

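To make the "order on PMU-id first, lag second" idea concrete, here is a
minimal sketch. It is an illustration only: it uses a sorted array where
the proposal talks about a tree, and struct ev, ev_sort() and
ev_cluster() are made-up names, not existing kernel code. The point is
that with a composite key, all events for one PMU form a contiguous
cluster that can be located with a logarithmic search, with no re-sort
needed when the task lands on a different CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an event; NOT the kernel's struct perf_event. */
struct ev {
	int pmu_id;	/* primary sort key */
	long lag;	/* secondary sort key */
};

/* Composite comparison: order on PMU-id first, and only then on lag. */
int ev_cmp(const void *a, const void *b)
{
	const struct ev *x = a, *y = b;

	if (x->pmu_id != y->pmu_id)
		return (x->pmu_id > y->pmu_id) - (x->pmu_id < y->pmu_id);
	return (x->lag > y->lag) - (x->lag < y->lag);
}

void ev_sort(struct ev *evs, size_t n)
{
	qsort(evs, n, sizeof(*evs), ev_cmp);
}

/*
 * Binary search over the sorted array.
 * inclusive == 0: index of the first event with pmu_id >= pmu.
 * inclusive != 0: index of the first event with pmu_id >  pmu.
 * Returns n if no such event exists.
 */
size_t ev_bound(const struct ev *evs, size_t n, int pmu, int inclusive)
{
	size_t lo = 0, hi = n;

	assert(evs != NULL || n == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		int before = inclusive ? evs[mid].pmu_id <= pmu
				       : evs[mid].pmu_id < pmu;

		if (before)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/* Half-open range [*start, *end) holding every event for one PMU. */
void ev_cluster(const struct ev *evs, size_t n, int pmu,
		size_t *start, size_t *end)
{
	*start = ev_bound(evs, n, pmu, 0);
	*end = ev_bound(evs, n, pmu, 1);
}
```

Within the returned range the events are already ordered by lag, so a
scheduler for that PMU would only ever walk its own cluster. A real
tree would find the same boundary by descending from the root while
comparing on PMU-id alone.
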
> Currently we've munged the two PMUs together, because, well, that's the
> only way.

Yeah. Splitting them by any means would be great. In the past I'd looked
at changing task_struct::perf_event_ctxp into something that could
handle an arbitrary number of contexts, such that we could avoid
sharing, but ran away after considering the locking/rcu implications.

Thanks,
Mark.