linux-kernel - RE: [PATCH 1/1] perf/core: find auxiliary events in running pmus list

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <37D7C6CF3E00A74B8858931C1DB2F0770589E368@SHSMSX103.ccr.corp.intel.com>
Date:	Sun, 28 Feb 2016 16:31:49 +0000
From:	"Liang, Kan" <kan.liang@...el.com>
To:	Peter Zijlstra <peterz@...radead.org>
CC:	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"ak@...ux.intel.com" <ak@...ux.intel.com>,
	"eranian@...gle.com" <eranian@...gle.com>,
	"vincent.weaver@...ne.edu" <vincent.weaver@...ne.edu>,
	"tglx@...utronix.de" <tglx@...utronix.de>,
	"mingo@...nel.org" <mingo@...nel.org>,
	"acme@...hat.com" <acme@...hat.com>,
	"jolsa@...hat.com" <jolsa@...hat.com>,
	"ying.huang@...ux.intel.com" <ying.huang@...ux.intel.com>,
	"alexander.shishkin@...ux.intel.com" 
	<alexander.shishkin@...ux.intel.com>
Subject: RE: [PATCH 1/1] perf/core: find auxiliary events in running pmus
 list



> -----Original Message-----
> From: Peter Zijlstra [mailto:peterz@...radead.org]
> Sent: Friday, February 26, 2016 5:37 AM
> To: Liang, Kan
> Cc: linux-kernel@...r.kernel.org; ak@...ux.intel.com;
> eranian@...gle.com; vincent.weaver@...ne.edu; tglx@...utronix.de;
> mingo@...nel.org; acme@...hat.com; jolsa@...hat.com;
> ying.huang@...ux.intel.com; alexander.shishkin@...ux.intel.com
> Subject: Re: [PATCH 1/1] perf/core: find auxiliary events in running pmus
> list
> 
> > index 9> +static void account_running_pmu(struct perf_event *event)
> > +{
> > +	struct running_pmu *pmu;
> > +
> > +	mutex_lock(&running_pmus_lock);
> > +
> > +	list_for_each_entry(pmu, &running_pmus, entry) {
> > +		if (pmu->pmu == event->pmu) {
> > +			pmu->nr_event++;
> > +			goto out;
> > +		}
> > +	}
> > +
> > +	pmu = kzalloc(sizeof(struct running_pmu), GFP_KERNEL);
> > +	if (pmu != NULL) {
> > +		pmu->nr_event++;
> > +		pmu->pmu = event->pmu;
> > +		list_add_rcu(&pmu->entry, &running_pmus);
> > +	}
> 
> That kzalloc() doesn't make any sense, why not simply add a member to
> struct pmu?
> 
> > +out:
> > +	mutex_unlock(&running_pmus_lock);
> > +}
> 
> In any case, can't we replace the whole perf_event_aux muck with a data
> structure for finding interested events?
> 
> Because not only is iterating all PMUs silly, we then further iterate all
> events in whatever contexts we find from them. Even if none of the
> events in these contexts is interested in the side-band event.
> 
> We have:
> 	mmap,comm,task,mmap_data,mmap2,comm_exec,context_switc
> h
> 
> 
> Which I think we can reduce like:
> 
> 	{mmap,mmap_data,mmap2} -> mmap
> 	{comm,comm_exec} -> comm
> 
> 	mmap,comm,task,context_switch
> 
> This would allow us to keep 4 per-cpu lists of events like:
> 
> 
> struct pmu_event_list {
> 	raw_spinlock_t		lock;
> 	struct list_head	list;
> };
> 
> enum event_sb_channel {
> 	sb_mmap = 0,
> 	sb_comm,
> 	sb_task,
> 	sb_switch,
> 	sb_nr,
> }
> 
> static DEFINE_PER_CPU(struct pmu_event_list, pmu_sb_events[sb_nr]);
> 
> static void attach_sb_event(struct perf_event *event, enum
> event_sb_channel sb) {
> 	struct pmu_event_list *pel = per_cpu_ptr(&pmu_sb_events[sb],
> event->cpu);
> 
> 	raw_spin_lock(&pel->lock);
> 	list_add_rcu(&event->sb_list[sb], &pel->list);
> 	raw_spin_unlock(&pel->lock);
> }
> 
> static void account_pmu_sb_event(struct perf_event *event) {
> 	if (event->parent)
> 		return;
> 
> 	if (event->attach & ATTACH_TASK)
> 		return;
> 
> 	if (event->attr.mmap || event->attr.mmap_data || event-
> >attr.mmap2)
> 		attach_sb_event(event, sb_mmap);
> 
> 	if (event->attr.comm || event->attr.comm_exec)
> 		attach_sb_event(event, sb_comm);
> 
> 	if (event->attr.task)
> 		attach_sb_event(event, sb_task);
> 
> 	if (event->attr.context_switch)
> 		attach_sb_event(event, sb_switch);
> }
> 
> /* matching unaccount */
> 
> 
> static void perf_event_sb_iterate(enum event_sb_channel sb,
> 				  perf_event_sb_output_f output, void
> *data) {
> 	struct pmu_event_list *pel =
> __this_cpu_ptr(&pmu_sb_events[sb]);
> 	struct perf_event *event;
> 
> 	list_for_each_entry_rcu(event, &pel->list, sb_list[sb]) {
> 		if (event->state < PERF_EVENT_STATE_INACTIVE)
> 			continue;
> 		if (!event_filter_match(event))
> 			continue;
> 		output(event, data);
> 	}
> }
> 
> static void perf_event_sb_mask(unsigned int sb_mask,
> perf_event_sb_output_f output, void *data) {
> 	int sb;
> 
> 	for (sb = 0; sb < sb_nr; sb++) {
> 		if (!(sb_mask & (1 << sb)))
> 			continue;
> 		perf_event_sb_iterate(sb, output, data);
> 	}
> }
> 
> Note: the mask is needed because a task event (as per perf_event_task)
> needs to go out to sb_comm, sb_mmap and sb_task, see
> perf_event_task_match().
> 
> And then you can replace the whole global part of perf_event_aux (which I
> would rename to perf_event_sb), with this.
> 
> You still have to do something like:
> 
> 	for_each_task_context_nr(ctxn) {
> 		ctx = rcu_dereference(current->perf_event_ctxp[ctxn]);
> 		if (!ctx)
> 			continue;
> 		perf_event_sb_ctx(ctx, output, data);
> 	}
>

I'm not sure why we need something like for_each_task_context_nr(ctxn) in
perf_event_aux/perf_event_sb.
Isn't perf_event_sb_mask enough?
It looks perf_event_sb_mask will go through all possible interested events
(includes both per cpu and per task events). That's what we want to do in
perf_event_aux, right?
 
Thanks,
Kan

> To get at the per task events, because I don't think we want to go update
> more global state on context switch, nor am I entirely sure its worth it to
> keep per sb ctx->event_list[]s.
> 
>