Message-ID: <20101015141816.GA2822@redhat.com>
Date: Fri, 15 Oct 2010 10:18:16 -0400
From: Jason Baron <jbaron@...hat.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Ingo Molnar <mingo@...e.hu>,
Frederic Weisbecker <fweisbec@...il.com>,
linux-kernel@...r.kernel.org, David Miller <davem@...emloft.net>,
Mike Galbraith <efault@....de>,
Steven Rostedt <rostedt@...dmis.org>,
"H. Peter Anvin" <hpa@...or.com>
Subject: Re: [RFC][PATCH 7/7] perf: Optimize sw events
On Fri, Oct 15, 2010 at 11:14:03AM +0200, Peter Zijlstra wrote:
> > Index: linux-2.6/include/linux/perf_event.h
> > ===================================================================
> > --- linux-2.6.orig/include/linux/perf_event.h
> > +++ linux-2.6/include/linux/perf_event.h
> > @@ -1013,15 +1013,17 @@ static inline void perf_fetch_caller_reg
> > static inline void
> > perf_sw_event(u32 event_id, u64 nr, int nmi, struct pt_regs *regs, u64 addr)
> > {
> > - if (atomic_read(&perf_swevent_enabled[event_id])) {
> > - struct pt_regs hot_regs;
> > + struct pt_regs hot_regs;
> >
> > - if (!regs) {
> > - perf_fetch_caller_regs(&hot_regs);
> > - regs = &hot_regs;
> > - }
> > - __perf_sw_event(event_id, nr, nmi, regs, addr);
> > + JUMP_LABEL(&perf_swevent_enabled[event_id], have_event);
> > + return;
> > +
> > +have_event:
> > + if (!regs) {
> > + perf_fetch_caller_regs(&hot_regs);
> > + regs = &hot_regs;
> > }
> > + __perf_sw_event(event_id, nr, nmi, regs, addr);
> > }
> >
>
> OK, so it appears I only compile tested this bit without jump_label
> support, with that bit added back this horribly fails to compile like:
>
> In file included from /usr/src/linux-2.6/arch/x86/mm/fault.c:13:
> /usr/src/linux-2.6/include/linux/perf_event.h: In function ‘perf_sw_event.clone.0’:
> /usr/src/linux-2.6/include/linux/perf_event.h:1018: warning: asm operand 0 probably doesn’t match constraints
> /usr/src/linux-2.6/include/linux/perf_event.h:1018: error: impossible constraint in ‘asm’
>
>
> The relevant snippet from the .i file reads:
>
> static inline void
> perf_sw_event(u32 event_id, u64 nr, int nmi, struct pt_regs *regs, u64 addr)
> {
> struct pt_regs hot_regs;
>
> do { asm goto("1:" ".byte 0xe9 \n\t .long 0\n\t" ".pushsection __jump_table, \"a\" \n\t" " " ".quad" " " "1b, %l[" "have_event" "], %c0 \n\t" ".popsection \n\t" : : "i" (&perf_swevent_enabled[event_id]) : : have_event); } while (0);
> return;
>
> have_event:
> if (!regs) {
> perf_fetch_caller_regs(&hot_regs);
> regs = &hot_regs;
> }
> __perf_sw_event(event_id, nr, nmi, regs, addr);
> }
>
>
> Anybody got any clue as to why this goes splat?
>
A couple of issues here. The key value, "&perf_swevent_enabled[event_id]",
is determined at run-time, but the key is handed to the asm through an
"i" (immediate) constraint, so we can't associate a jump label with a
key that is run-time dependent. The second issue is the array indexing:
the jump label code would need to be smart enough to associate the
correct index with each instance of 'perf_sw_event'. Unfortunately it's
not that smart :(
So, two possible suggestions:

1)
I see that PERF_COUNT_SW_MAX is 9. So, we could explicitly split out the
9 cases into 9 separate inline functions, where the keys would be, for
example:

&perf_swevent_enabled[PERF_COUNT_SW_CPU_CLOCK]
&perf_swevent_enabled[PERF_COUNT_SW_TASK_CLOCK]

This probably wouldn't look too ugly if there was a macro that took each
software id and spat out the appropriate inline. I see there aren't
that many 'perf_sw_event()' calls to update, either.
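
A rough user-space sketch of what such a macro might look like follows.
It is only an illustration: DEFINE_PERF_SW_EVENT, sw_event_slow_path and
sw_event_hits are hypothetical names, only two ids are shown, and a plain
'if' stands in for JUMP_LABEL, which needs kernel support. The point is
that each generated inline sees its array index as a compile-time
constant.

```c
#include <stddef.h>
#include <stdint.h>

/* User-space stand-ins for the kernel definitions (only two ids shown). */
enum {
	PERF_COUNT_SW_CPU_CLOCK,
	PERF_COUNT_SW_TASK_CLOCK,
	PERF_COUNT_SW_MAX
};

struct pt_regs { unsigned long ip; };

static int perf_swevent_enabled[PERF_COUNT_SW_MAX];	/* atomic_t in the kernel */
static uint64_t sw_event_hits[PERF_COUNT_SW_MAX];	/* hypothetical, for checking */

/* Hypothetical stand-in for the __perf_sw_event() slow path. */
static void sw_event_slow_path(uint32_t event_id, uint64_t nr, int nmi,
			       struct pt_regs *regs, uint64_t addr)
{
	(void)nmi;
	(void)regs;
	(void)addr;
	sw_event_hits[event_id] += nr;
}

/*
 * Emit one inline per software event id. Because 'id' is a constant in
 * each expansion, &perf_swevent_enabled[id] is a link-time constant and
 * could serve as a jump label key in the kernel version.
 */
#define DEFINE_PERF_SW_EVENT(name, id)					\
static inline void perf_sw_event_##name(uint64_t nr, int nmi,		\
					struct pt_regs *regs,		\
					uint64_t addr)			\
{									\
	struct pt_regs hot_regs = { 0 };				\
									\
	/* Kernel: JUMP_LABEL(&perf_swevent_enabled[id], have_event) */	\
	if (!perf_swevent_enabled[id])					\
		return;							\
	/* Kernel: perf_fetch_caller_regs(&hot_regs) when !regs */	\
	if (!regs)							\
		regs = &hot_regs;					\
	sw_event_slow_path(id, nr, nmi, regs, addr);			\
}

DEFINE_PERF_SW_EVENT(cpu_clock, PERF_COUNT_SW_CPU_CLOCK)
DEFINE_PERF_SW_EVENT(task_clock, PERF_COUNT_SW_TASK_CLOCK)
```

Callers would then use perf_sw_event_cpu_clock(nr, nmi, regs, addr)
instead of passing the id as an argument.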

2)
You could define a single global variable that is set if any of the
entries of the perf_swevent_enabled[] array is set. This potentially
wouldn't be as efficient as #1, as it would create false positives: the
jump would be taken for every software event whenever any one of them
is enabled.
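
A rough user-space sketch of the single-global approach follows, under
the same caveats as any illustration: perf_swevent_any_enabled,
swevent_enable, swevent_disable, sw_event_slow_path and the two counters
are hypothetical names, and a plain 'if' stands in for JUMP_LABEL. The
global is kept as a count of enabled events, and the per-event check
moves behind it, which is where the false positives show up.

```c
#include <stddef.h>
#include <stdint.h>

/* User-space stand-ins for the kernel definitions (only two ids shown). */
enum {
	PERF_COUNT_SW_CPU_CLOCK,
	PERF_COUNT_SW_TASK_CLOCK,
	PERF_COUNT_SW_MAX
};

struct pt_regs { unsigned long ip; };

static int perf_swevent_enabled[PERF_COUNT_SW_MAX];
static int perf_swevent_any_enabled;	/* hypothetical single global key */
static uint64_t sw_event_hits[PERF_COUNT_SW_MAX];	/* for checking only */
static uint64_t sw_event_false_positives;		/* for checking only */

/* Keep the global in step with the per-event counts. */
static inline void swevent_enable(uint32_t id)
{
	perf_swevent_enabled[id]++;
	perf_swevent_any_enabled++;
}

static inline void swevent_disable(uint32_t id)
{
	perf_swevent_enabled[id]--;
	perf_swevent_any_enabled--;
}

/* Hypothetical stand-in for the __perf_sw_event() slow path. */
static void sw_event_slow_path(uint32_t event_id, uint64_t nr)
{
	sw_event_hits[event_id] += nr;
}

static inline void perf_sw_event(uint32_t event_id, uint64_t nr, int nmi,
				 struct pt_regs *regs, uint64_t addr)
{
	(void)nmi;
	(void)regs;
	(void)addr;

	/* Kernel: JUMP_LABEL(&perf_swevent_any_enabled, have_event) */
	if (!perf_swevent_any_enabled)
		return;

	/* Some event is enabled, but not necessarily this one. */
	if (!perf_swevent_enabled[event_id]) {
		sw_event_false_positives++;
		return;
	}
	sw_event_slow_path(event_id, nr);
}
```

The key here is a plain global with a constant address, so the run-time
index never reaches the asm; the cost is the extra array check on every
call once any event is enabled.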
thanks,
-Jason