Message-ID: <20100528143311.GB9710@elte.hu>
Date: Fri, 28 May 2010 16:33:11 +0200
From: Ingo Molnar <mingo@...e.hu>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Borislav Petkov <bp@...en8.de>,
Frederic Weisbecker <fweisbec@...il.com>,
Steven Rostedt <rostedt@...dmis.org>,
Arnaldo Carvalho de Melo <acme@...hat.com>,
Lin Ming <ming.m.lin@...el.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/2] perf: Add persistent events

* Peter Zijlstra <peterz@...radead.org> wrote:

> On Tue, 2010-05-25 at 09:32 +0200, Borislav Petkov wrote:
> > From: Peter Zijlstra <peterz@...radead.org>
> > Date: Sun, May 23, 2010 at 09:23:21PM +0200
> >
> > > Either we add some notifier thing, or we simply add an explicit call in
> > > the init sequence after the perf_event subsystem is running. I would
> > > suggest we start with some explicit call, and take it from there.
> >
> > Ok, this couldn't be more straightforward. So I looked at the init
> > sequence we do when booting wrt perf/ftrace initialization:
> >
> > start_kernel
> >   ...
> >   |-> sched_init
> >   |-> perf_event_init
> >   ...
> >   |-> ftrace_init
> >   rest_init
> >     kernel_init
> >       |-> do_pre_smp_initcalls
> >       |...
> >       |-> smp_init
> >       |-> do_basic_setup
> >         |-> do_initcalls
> >
> > and one of the convenient places after both perf is initialized and
> > ftrace has enumerated the tracepoints is do_initcalls() (It cannot be an
> > early_initcall since at that time we're not running SMP yet and we want
> > the MCE event per cpu.)
> >
> > So I added a core_initcall that registers the mce perf event. This makes
> > it more or less a persistent event without any changes to the perf_event
> > subsystem. I guess this should work - at least it builds here, will give
> > it a run later.
> >
> > As a further enhancement, the init-function should read out all the
> > logged mce events which survived the warm reboot and those which happen
> > between mce init and the actual event registration so that perf can
> > postprocess those too at a more convenient time.
>
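
The ordering constraint quoted above (an early_initcall runs before SMP is up, anything in do_initcalls() runs after) can be illustrated with a small userspace model. This is only a sketch under stated assumptions: smp_running, online_cpus(), mce_register_events() and boot_with_mce_at() are hypothetical names invented for the model, NR_CPUS is fixed at 4, and none of it is kernel code or the code from the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4                 /* assumed CPU count for the model */

/* Hypothetical model state, not kernel symbols. */
static bool smp_running;          /* set once the modelled smp_init() has run */
static int  mce_events_created;   /* one event per CPU that was online */

enum level { LEVEL_EARLY, LEVEL_CORE };

/* Before smp_init() only the boot CPU is online. */
static int online_cpus(void)
{
	return smp_running ? NR_CPUS : 1;
}

/* Stand-in for the initcall that registers the per-CPU MCE event. */
static void mce_register_events(void)
{
	mce_events_created = online_cpus();
}

/*
 * Model of kernel_init(): early initcalls, then smp_init(), then
 * do_initcalls(). Returns how many per-CPU events were created.
 */
static int boot_with_mce_at(enum level lvl)
{
	assert(lvl == LEVEL_EARLY || lvl == LEVEL_CORE);

	smp_running = false;
	mce_events_created = 0;

	if (lvl == LEVEL_EARLY)
		mce_register_events();    /* do_pre_smp_initcalls() */

	smp_running = true;               /* smp_init() */

	if (lvl == LEVEL_CORE)
		mce_register_events();    /* do_basic_setup() -> do_initcalls() */

	return mce_events_created;
}
```

In this model, boot_with_mce_at(LEVEL_EARLY) covers only the boot CPU, while boot_with_mce_at(LEVEL_CORE) covers all NR_CPUS, which is the reason given for using a core_initcall.
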
> Right, so that looks good. Now the interesting part is twofold:
>
> 1) expose these perf_events to userspace, since they're now created
> in kernel, there is no user-space access point to them. One
> way would be to extend the perf syscall to allow attaching to an
> existing instance (but that would limit us to a single instance per
> 'attr'), or create some /debug or /sys iteration of all such events.

Yeah.

> 2) get these things a buffer, perf_events as created don't actually
> have an output buffer, normally that is created at mmap() time, but
> since you cannot mmap() a kernel side event, it doesn't get to have
> a buffer. This could be done by extracting perf_mmap_data_alloc()
> into a sensible interface.

#2 could be a new syscall: sys_create_ring_buffer or so?
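
For reference, the userspace path that #2 contrasts with looks roughly like the sketch below: the event is opened first and only the later mmap() call gives it an output buffer. This assumes a Linux system with <linux/perf_event.h>; perf_mmap_len(), perf_event_open_fd() and open_and_map() are hypothetical helper names, the event type and sample period are arbitrary choices, and the sketch does not implement the proposed syscall.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/perf_event.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* mmap() length: one metadata page plus a power-of-two number of data pages. */
static size_t perf_mmap_len(unsigned int data_pages, long page_size)
{
	assert(data_pages != 0 && (data_pages & (data_pages - 1)) == 0);
	return (size_t)(1 + data_pages) * (size_t)page_size;
}

/* glibc has no wrapper for perf_event_open(), so go through syscall(). */
static int perf_event_open_fd(struct perf_event_attr *attr, pid_t pid, int cpu)
{
	return (int)syscall(__NR_perf_event_open, attr, pid, cpu, -1, 0UL);
}

/*
 * Open a sampling software event on the calling thread and map its buffer.
 * Returns the mapping or MAP_FAILED; *fd_out receives the event fd or -1.
 */
static void *open_and_map(unsigned int data_pages, int *fd_out)
{
	struct perf_event_attr attr;
	long page_size = sysconf(_SC_PAGESIZE);
	void *buf;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_SOFTWARE;
	attr.config = PERF_COUNT_SW_CPU_CLOCK;
	attr.sample_period = 1000000;     /* arbitrary period for the sketch */
	attr.sample_type = PERF_SAMPLE_IP;
	attr.disabled = 1;

	*fd_out = perf_event_open_fd(&attr, 0, -1);
	if (*fd_out < 0)
		return MAP_FAILED;        /* e.g. refused by perf_event_paranoid */

	/* Until this mmap() the event exists but has no output buffer. */
	buf = mmap(NULL, perf_mmap_len(data_pages, page_size),
		   PROT_READ | PROT_WRITE, MAP_SHARED, *fd_out, 0);
	if (buf == MAP_FAILED) {
		close(*fd_out);
		*fd_out = -1;
	}
	return buf;
}
```

An event created inside the kernel has no fd to pass to mmap(), which is why it needs some other way to get a buffer.
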

Ingo