linux-kernel - Re: [PATCH] perf: fix pmu::filter


Message-ID: <20160704180534.GD9048@leverpostej>
Date:	Mon, 4 Jul 2016 19:05:35 +0100
From:	Mark Rutland <mark.rutland@....com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	linux-kernel@...r.kernel.org,
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
	Arnaldo Carvalho de Melo <acme@...nel.org>,
	Ingo Molnar <mingo@...hat.com>,
	Will Deacon <will.deacon@....com>
Subject: Re: [PATCH] perf: fix pmu::filter_match for SW-led groups

On Sat, Jul 02, 2016 at 06:40:25PM +0200, Peter Zijlstra wrote:
> On Tue, Jun 14, 2016 at 04:10:41PM +0100, Mark Rutland wrote:
> > However, pmu::filter_match is only called for the leader of each event
> > group. When the leader is a SW event, we do not filter the groups, and
> > may fail at pmu::add time, and when this happens we'll give up on
> > scheduling any event groups later in the list until they are rotated
> > ahead of the failing group.
> 
> Ha! indeed.
> 
> > I've tried to find a better way of handling this (without needing to walk the
> > siblings list), but so far I'm at a loss. At least it's "only" O(n) in the size
> > of the sibling list we were going to walk anyway.
> > 
> > I suspect that at a more fundamental level, I need to stop sharing a
> > perf_hw_context between HW PMUs (i.e. replace task_struct::perf_event_ctxp with
> > something that can handle multiple HW PMUs). From previous attempts I'm not
> > sure if that's going to be possible.
> > 
> > Any ideas appreciated!
> 
> So I think I have half-cooked ideas.
> 
> One of the problems I've been wanting to solve for a long time is that
> the per-cpu flexible list has priority over the per-task flexible list.
> 
> I would like them to rotate together.

Makes sense.

> One of the ways I was looking at getting that done is a virtual runtime
> scheduler (just like cfs). The tricky point is merging two virtual
> runtime trees. But I think that should be doable if we sort the trees on
> lag.
> 
> In any case, the relevance to your question is that once we have a tree,
> we can play games with order; that is, if we first order on PMU-id and
> only second on lag, we get whole subtree clusters specific for a PMU.

Hmm... I'm not sure how that helps in this case. Wouldn't we still need
to walk the sibling list to get the HW PMU-id in the case of a SW group
leader?

For the heterogeneous case we'd need a different sort order per-cpu
(well, per microarchitecture), which sounds like we're going to have to
fully sort the events every time they move between CPUs. :/
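
To make the ordering under discussion concrete, here is a minimal userspace
sketch of "PMU-id first, lag second". It is not kernel code: `struct group`,
`sort_groups` and `first_for_pmu` are hypothetical names, a sorted array stands
in for the tree, and ascending lag is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a schedulable event group (not struct perf_event). */
struct group {
	int pmu_id;	/* assumed: integer id of the group's HW PMU */
	long lag;	/* assumed: virtual-runtime lag, smaller sorts first */
};

/* Order on PMU id first, and only within the same PMU on lag. */
int group_cmp(const void *a, const void *b)
{
	const struct group *ga = a, *gb = b;

	if (ga->pmu_id != gb->pmu_id)
		return (ga->pmu_id > gb->pmu_id) - (ga->pmu_id < gb->pmu_id);
	return (ga->lag > gb->lag) - (ga->lag < gb->lag);
}

void sort_groups(struct group *g, size_t n)
{
	qsort(g, n, sizeof(*g), group_cmp);
}

/*
 * Binary-search the sorted array for the start of a PMU's cluster.
 * Returns n if that PMU has no groups.
 */
size_t first_for_pmu(const struct group *g, size_t n, int pmu_id)
{
	size_t lo = 0, hi = n;

	assert(g != NULL || n == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (g[mid].pmu_id < pmu_id)
			lo = mid + 1;
		else
			hi = mid;
	}
	return (lo < n && g[lo].pmu_id == pmu_id) ? lo : n;
}
```

Note the sketch assumes each group already knows its HW PMU id, which is
exactly the piece a SW-led group is missing today.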

> Lots of details missing in that picture, but I think something along
> those lines might get us what we want.

Perhaps! Hopefully I'm just missing those details above. :)

I also had another thought about solving the SW-led group case: if the
leader had a reference to the group's HW PMU (of which there should only
be one), we can filter on that alone, and can also use that in
group_sched_in rather than the ctx->pmu, avoiding the issue that
ctx->pmu is not the same as the group's HW PMU.
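
Roughly what I have in mind, as a simplified userspace sketch rather than a
patch. Everything here is hypothetical: the `group_hw_pmu` field, the helper
names, and the `supported_cpus` bitmask, which stands in for whatever a real
pmu::filter_match callback would check.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for struct pmu. */
struct pmu {
	bool is_sw;			/* software PMU: can run on any CPU */
	unsigned long supported_cpus;	/* assumed: bit n set => usable on CPU n */
};

/* Simplified stand-in for struct perf_event. */
struct event {
	const struct pmu *pmu;
	const struct pmu *group_hw_pmu;	/* hypothetical; meaningful on leaders */
};

void event_init(struct event *ev, const struct pmu *pmu)
{
	ev->pmu = pmu;
	/* A HW leader is its own group's HW PMU; a SW leader has none yet. */
	ev->group_hw_pmu = pmu->is_sw ? NULL : pmu;
}

/* Record the group's single HW PMU on the leader; reject a second one. */
int group_add_sibling(struct event *leader, const struct event *sibling)
{
	if (sibling->pmu->is_sw)
		return 0;
	if (leader->group_hw_pmu && leader->group_hw_pmu != sibling->pmu)
		return -1;
	leader->group_hw_pmu = sibling->pmu;
	return 0;
}

/*
 * Filter on the leader alone, with no sibling walk.
 * cpu must be less than the number of bits in unsigned long.
 */
bool group_filter_match(const struct event *leader, int cpu)
{
	const struct pmu *pmu = leader->group_hw_pmu;

	if (!pmu)
		return true;	/* pure SW group */
	return (pmu->supported_cpus >> cpu) & 1UL;
}
```

With that, a SW-led group containing an event for a PMU covering only CPUs 2-3
would be filtered out on CPU 0 before we ever get to pmu::add.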

I'll have a play with that approach in the meantime.

Thanks,
Mark.