linux-kernel - Re: [PATCH] perf/sdt: Directly record cached SDT events

Message-Id: <20160503092539.7abc46b56ec0cf216a0c23e1@kernel.org>
Date:	Tue, 3 May 2016 09:25:39 +0900
From:	Masami Hiramatsu <mhiramat@...nel.org>
To:	Brendan Gregg <brendan.d.gregg@...il.com>
Cc:	Hemant Kumar <hemant@...ux.vnet.ibm.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Masami Hiramatsu <mhiramat@...nel.org>,
	Arnaldo Carvalho de Melo <acme@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Namhyung Kim <namhyung@...nel.org>,
	Ingo Molnar <mingo@...hat.com>, ananth@...ux.vnet.ibm.com
Subject: Re: [PATCH] perf/sdt: Directly record cached SDT events

On Mon, 2 May 2016 11:19:34 -0700
Brendan Gregg <brendan.d.gregg@...il.com> wrote:

> On Fri, Apr 29, 2016 at 6:40 AM, Hemant Kumar <hemant@...ux.vnet.ibm.com> wrote:
> > This patch adds support for directly recording SDT events which are
> > present in the probe cache. This patch is based on current SDT
> > enablement patchset (v5) by Masami :
> > https://lkml.org/lkml/2016/4/27/828
> > and it implements two points in the TODO list mentioned in the
> > cover note :
> > "- (perf record) Support SDT event recording directly"
> > "- (perf record) Try to unregister SDT events after record."
> >
> > Without this patch, we could probe into SDT events using
> > "perf probe" and "perf record". With this patch, we can probe
> > the SDT events directly using "perf record".
> >
> > For example :
> >
> >  # perf list sdt       // List the SDT events
> > ...
> >   sdt_mysql:update__row__done                        [SDT event]
> >   sdt_mysql:update__row__start                       [SDT event]
> >   sdt_mysql:update__start                            [SDT event]
> >   sdt_python:function__entry                         [SDT event]
> >   sdt_python:function__return                        [SDT event]
> >   sdt_test:marker1                                   [SDT event]
> >   sdt_test:marker2                                   [SDT event]
> > ...
> >
> >  # perf record -e %sdt_test:marker1 -e %sdt_test:marker2 -a
> 
> Why do we need the '%'? Can't the "sdt_" prefix be sufficient? ie:
> 
> # perf record -e sdt_test:marker1 -e sdt_test:marker2 -a

On the perf-record side, "sdt_test:marker1" is just a normal
tracepoint event name (the same form is used for probe events in
ftrace/perf tools). For example, if I add a probe event with perf probe,
it is shown the same way as other tracepoint events. This means that, in
principle, I can define "sdt_test:marker1" on a different address.

----
$ sudo ./perf probe -a "sdt_test:marker1=vmalloc"
Added new event:
  sdt_test:marker1     (on vmalloc)

You can now use it in all perf tools, such as:

	perf record -e sdt_test:marker1 -aR sleep 1
----

So you can shoot yourself in the foot easily :)

One possible solution is to reserve the "sdt_" prefix for SDT; then
we can avoid using "%" for that.

However, what I intended was a more generic solution that includes the
probe cache, so that once a user defines a probe, they can freely reuse
the cached probe, even after rebooting the machine. Of course, we can
search for such events automatically if a user gives a non-existing
event name.
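
To illustrate the automatic search, here is a rough Python sketch (not
perf code). The dictionaries `live_events` and `probe_cache` and the
function `resolve_event` are hypothetical stand-ins for the registered
events and the probe cache; the point is only the lookup order.

```python
def resolve_event(name, live_events, probe_cache):
    """Return (source, definition) for an event name.

    An already-registered event wins; the probe cache is searched only
    when no such event exists. Both mappings are hypothetical stand-ins.
    """
    if name in live_events:
        return ("live", live_events[name])
    if name in probe_cache:
        return ("cache", probe_cache[name])
    raise KeyError(f"event {name!r} not found in live events or probe cache")


# Placeholder definitions, for illustration only.
live = {"probe:vfs_read": "registered probe definition"}
cache = {"sdt_test:marker1": "cached SDT definition"}

print(resolve_event("probe:vfs_read", live, cache))
print(resolve_event("sdt_test:marker1", live, cache))
```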

> I find it a bit weird to define it using %sdt_, but then use it using
> sdt_. I'd also be inclined to use it for probe creation, ie:
> 
> # perf probe -x /lib/libc-2.17.so  sdt_libc:lll_lock_wait_private
> 
> That way, the user only learns one way to specify the probe, with the
> sdt_ prefix. It's fine if % existed too, but optional.

OK, if we see the "sdt_" prefix in the first place, we can treat it as
if "%" were specified :)
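
A minimal Python sketch of that rule, assuming the "sdt_" group prefix
is reserved for SDT events. `parse_event_spec` is a hypothetical helper,
not an existing perf function.

```python
def parse_event_spec(spec):
    """Split an -e argument into (event_name, use_probe_cache).

    A leading "%" explicitly requests a cached event. Under the
    assumption that "sdt_" is reserved, a group name starting with
    "sdt_" is treated as if "%" had been given.
    """
    if spec.startswith("%"):
        return spec[1:], True
    group = spec.split(":", 1)[0]
    return spec, group.startswith("sdt_")


for spec in ("%sdt_test:marker1", "sdt_test:marker1", "sched:sched_switch"):
    print(spec, "->", parse_event_spec(spec))
```

With this rule, "%sdt_test:marker1" and "sdt_test:marker1" resolve to
the same cached event, while other tracepoint names are left alone.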

> > ^C[ perf record: Woken up 1 times to write data ]
> > [ perf record: Captured and wrote 2.087 MB perf.data (22 samples) ]
> >
> >  # perf script
> >         test_sdt 29230 [002] 405550.548017: sdt_test:marker1: (400534)
> >         test_sdt 29230 [002] 405550.548064: sdt_test:marker2: (40053f)
> >         test_sdt 29231 [002] 405550.962806: sdt_test:marker1: (400534)
> >         test_sdt 29231 [002] 405550.962841: sdt_test:marker2: (40053f)
> >         test_sdt 29232 [001] 405551.379327: sdt_test:marker1: (400534)
> >  ...
> >
> > After invoking "perf record", behind the scenes, it checks whether the
> > event specified is an SDT event using the flag '%'. After that, it
> > does a lookup of the probe cache to find out the SDT event. If it's not
> > present, it throws an error. Otherwise, it goes on and writes the event
> > into the uprobe_events file and sets up the probe event, trace events,
> > etc and starts recording. It also maintains a list of the event names
> > that were written to uprobe_events file. After finishing the record
> > session, it removes the events from the uprobe_events file using the
> > maintained name list.
> 
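
The flow described above can be sketched roughly as follows. This is a
Python illustration, not the patch itself: `probe_cache` is a
hypothetical dict standing in for the probe cache, and `uprobe_events`
is a plain list standing in for the uprobe_events file.

```python
from contextlib import contextmanager


@contextmanager
def sdt_record_session(specs, probe_cache, uprobe_events):
    """Register '%'-flagged SDT events for one record session.

    Every name written to uprobe_events is remembered in `added`, and
    exactly those names are removed again when the session ends.
    """
    added = []
    try:
        for spec in specs:
            if not spec.startswith("%"):
                continue  # not flagged as SDT; handled as a normal event
            name = spec[1:]
            if name not in probe_cache:
                raise LookupError(f"SDT event {name!r} not in probe cache")
            uprobe_events.append(name)  # stand-in for writing the file
            added.append(name)
        yield added
    finally:
        for name in added:
            uprobe_events.remove(name)


cache = {"sdt_test:marker1": "cached definition",
         "sdt_test:marker2": "cached definition"}
registered = []
with sdt_record_session(["%sdt_test:marker1", "%sdt_test:marker2"],
                        cache, registered):
    print(registered)  # ['sdt_test:marker1', 'sdt_test:marker2']
print(registered)      # []
```

Because the cleanup sits in the `finally` clause, events that were
already written are also removed when a later lookup fails.
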
> Does this support semaphore SDT probes (is-enabled)? Those need the
> semaphore incremented when enabled, then decremented when disabled.

No, they are not supported yet. Semaphores and SDT parameters will be
supported later.

Thank you!

-- 
Masami Hiramatsu <mhiramat@...nel.org>