Date:	Thu, 23 Jul 2015 22:13:22 +0900
From:	Masami Hiramatsu <masami.hiramatsu.pt@...achi.com>
To:	Hemant Kumar <hemant@...ux.vnet.ibm.com>,
	Arnaldo Carvalho de Melo <acme@...nel.org>
CC:	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	linux-kernel@...r.kernel.org,
	Adrian Hunter <adrian.hunter@...el.com>,
	Ingo Molnar <mingo@...hat.com>,
	Paul Mackerras <paulus@...ba.org>,
	Jiri Olsa <jolsa@...nel.org>,
	Namhyung Kim <namhyung@...nel.org>,
	Borislav Petkov <bp@...e.de>
Subject: Re: Re: [RFC PATCH perf/core v2 00/16] perf-probe --cache and
 SDT support

On 2015/07/22 23:12, Hemant Kumar wrote:
> Hi Masami,
> 
> Apologies for the delayed response.
> 
> On 07/17/2015 08:51 AM, Masami Hiramatsu wrote:
>> Hi Hemant,
>>
>> On 2015/07/16 12:13, Hemant Kumar wrote:
>>> Hi Masami,
>>>
>>> On 07/15/2015 02:43 PM, Masami Hiramatsu wrote:
>>>> Hi,
>>>>
>>>> Here is the 2nd version of the patchset for probe-cache and
>>>> initial SDT support which are going to be perf-cache finally.
>>> Thanks for adding the SDT support.
>>>
>>>> The perf-probe is useful for debugging, but it strongly depends
>>>> on the debuginfo. Without debuginfo, it is just a frontend of
>>>> ftrace's dynamic events. This can usually happen in server
>>>> farms or on cloud systems, since no one wants to distribute
>>>> big debuginfo packages.
>>>>
>>>> To solve this issue, I had tried to make pre-analyzed probes
>>>> ( https://lkml.org/lkml/2014/10/31/207 ) but it has a problem
>>>> that we can't ensure the probed binary is the same as what we analyzed.
>>>> Arnaldo gave me an idea to reuse the build-id cache for that purpose
>>>> and this series is the first prototype of that.
>>>>
>>>> At the same time, Hemant has started to support SDT probes which
>>>> also use the cache file of SDT info. So I decided to merge this
>>>> into the same build-id cache.
>>>> In this version, SDT support is still very limited, it works
>>>> as a part of probe-cache.
>>>>
>>>> In this version, perf probe supports a --cache option which means
>>>> that perf probe manipulates probe caches, for example,
>>>>
>>>>     # perf probe --cache --add "probe-desc"
>>>>
>>>> does not only add probe events but also adds "probe-desc" and
>>>> its result to the cache. (Note that the cached entry is always
>>>> referred to even without --cache)
>>>> The --list and --del commands also support --cache. Note that
>>>> both only manipulate caches, not real events.
>>>>
>>>> To use SDT, we have to scan the target binary at first by using
>>>> perf-buildid-cache, e.g.
>>>>
>>>>     # perf buildid-cache --add /lib/libc-2.17.so
>>>>
>>>> And perf probe --cache --list shows what SDTs are scanned.
>>>>
>>>>     # perf probe --cache --list
>>>>     /usr/lib/libc-2.17.so (a6fb821bdf53660eb2c29f778757aef294d3d392):
>>>>     libc:setjmp=setjmp
>>>>     libc:longjmp=longjmp
>>>>     libc:longjmp_target=longjmp_target
>>>>     libc:memory_heap_new=memory_heap_new
>>>>     libc:memory_sbrk_less=memory_sbrk_less
>>>>     libc:memory_arena_reuse_free_list=memory_arena_reuse_free_list
>>>>     libc:memory_arena_reuse=memory_arena_reuse
>>>>     ...
>>>>
>>>> To use the SDT events, perf probe -x BIN %SDTEVENT allows you to
>>>> add a probe on SDTEVENT@....
>>>>
>>>>     # perf probe -x /lib/libc-2.17.so %memory_heap_new
>>>>
>>>> If you define a cached probe with an event name, you can also reuse
>>>> it in the same way as SDT events.
>>>>
>>>>     # perf probe -x ./perf --cache -n 'myevent=dso__load $params'
>>>>
>>>> (Note that "-n" option only updates caches)
>>>> To use the above "myevent", you just have to add "%myevent".
>>>>
>>>>     # perf probe -x ./perf %myevent
>>>>
>>>>
>>>> TODOs:
>>>>    - Show available cached/SDT events by perf-list
>>>>    - Allow perf-record to use cached/SDT events directly
>>> As I was already working on SDT events' recording
>>> https://lkml.org/lkml/2014/11/2/73,
>>> I can re-spin the patches on top of your patchset and make the
>>> required changes to implement the above TODOs.
>> Sounds great! :)
>> Note that you'll need to re-implement almost from scratch, since
>> SDT is now implemented on buildid-cache. Maybe I have to work
>> on the buildid-cache some more to filter out binaries which are gone
>> or are a different version from the currently running one (e.g. old vmlinux).
>> It could help you to get the available SDTs when showing them via perf-list.
> 
> Sure. That would be great.
> 
>>> What would you suggest?
>> Now I'm thinking that we should avoid using the %event syntax for perf-list
>> and perf-record to avoid confusion. For example, suppose that we have a
>> "libfoo:bar" SDT event. When we have just scanned the libfoo binary and
>> use it via perf-record, we'll run perf record -e "%libfoo:bar".
>> However, after we set the probe via perf-probe, we have to run
>> perf record -e "libfoo:bar". That difference does not look good.
>> So, I think in both cases it should accept the -e "libfoo:bar" syntax.
> 
> Although I agree with having "perf record" as a higher-level tool that does
> not need to distinguish between its events, that way we end up looking
> into kprobe_events, uprobe_events, kernel tracepoints and then the entire
> cache for any event lookup (and the event may or may not be an SDT event,
> or even a valid event). Right?

Yeah, right.
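
As a rough illustration of the lookup cost described above, here is a
minimal Python sketch. The source names mirror the ones mentioned in this
thread, but the sample data and the `resolve_event` helper are hypothetical
and are not perf code; it only shows that a plain "provider:event" name has
to be checked against every source in turn, with the cache last.

```python
from typing import Dict, Optional, Set

# Order taken from the discussion: already-defined events first, cache last.
LOOKUP_ORDER = ("kprobe_events", "uprobe_events", "tracepoints", "probe_cache")


def resolve_event(name: str, sources: Dict[str, Set[str]]) -> Optional[str]:
    """Return the first source that defines `name`, or None if none does."""
    for source in LOOKUP_ORDER:
        if name in sources.get(source, set()):
            return source
    return None


# Made-up contents for each source, for illustration only.
sources = {
    "kprobe_events": {"probe:vfs_read"},
    "uprobe_events": {"probe_perf:dso__load"},
    "tracepoints": {"sched:sched_switch"},
    "probe_cache": {"libc:memory_heap_new", "probes:myevent"},
}

print(resolve_event("libc:memory_heap_new", sources))
print(resolve_event("nosuch:event", sources))
```

An invalid name is the worst case: every source, including the whole cache,
is consulted before the lookup can fail.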

> 
> The idea behind '%' was to identify the SDT events and take a different path:
> look the event up in the cache, put a probe, record and then delete the probe.
> Or do you want "perf record" to record any event this way (not just an SDT
> event)?

I see, but I think that is not good for the following reasons:

- when we record an event with "-e %provider:event", it will be shown as
  "provider:event"
- if perf-list shows the SDT (cached) events as "%provider:event", that
  will not match the recorded result.
- it is somewhat fragile to temporarily add the SDT event and remove it
  after recording, because the event is not hidden from ftrace users (this
  means that removing the event will fail with -EBUSY if someone is using it
  via ftrace)
- if we set SDT events with perf-probe, they will be shown under the
  "provider:event" name because "%" will be rejected by ftrace. In that case,
  what should perf-list show for those events: both %provider:event and
  provider:event?

Thus I proposed the "%" only as a "special recall mark" for looking
up the event from the cache by perf-probe.

So I'd like to suggest the following behavior:

1) perf-list shows the cached-with-name and SDT events as Tracepoint events
  even if they are not yet probed.

# perf list

List of pre-defined events (to be used in -e):
...
  libc:memory_heap_new                             [Tracepoint event]
...
  probes:myevent                                   [Tracepoint event]
...

2) perf-record -e with a not-yet-probed event should try to set up the given
 probe by using perf-probe. It may remove that probe after recording, but
 should ignore the failure if removal fails with -EBUSY. (Anyway, there is
 no difference for users.)
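
The behavior in 2) can be sketched roughly as follows. This is only a model
of the control flow under stated assumptions: `FakeTracer` is a hypothetical
stand-in for the ftrace dynamic-event interface, `record` is a hypothetical
helper, and the actual recording is replaced by a placeholder string.

```python
import errno


class FakeTracer:
    """Hypothetical stand-in for the ftrace dynamic-event interface."""

    def __init__(self, busy=()):
        self.events = set()
        self.busy = set(busy)  # events held open by another ftrace user

    def add(self, name):
        self.events.add(name)

    def remove(self, name):
        if name in self.busy:
            raise OSError(errno.EBUSY, "event in use", name)
        self.events.discard(name)


def record(name, tracer, cache):
    """Record `name`, setting up a probe first if it exists only in the cache."""
    created = False
    if name not in tracer.events:
        if name not in cache:
            raise LookupError(f"unknown event: {name}")
        tracer.add(name)  # corresponds to setting the probe via perf-probe
        created = True
    try:
        result = f"recorded {name}"  # placeholder for the real recording
    finally:
        if created:
            try:
                tracer.remove(name)
            except OSError as err:
                if err.errno != errno.EBUSY:
                    raise
                # Someone else uses the event via ftrace; leave it in place.
    return result


cache = {"libc:memory_heap_new"}
print(record("libc:memory_heap_new", FakeTracer(), cache))
```

Either way the caller sees the same result; the only difference is whether
the probe is still defined afterwards.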

This rule will resolve the mismatch between the event name in the recorded
data and in the listed events. However, as we discussed, there are other clashes.

A) clash among binaries: Since binary builders can freely choose the
provider name, it is possible to clash with other binaries' SDTs.

B) clash among different versions: Of course, different versions of a binary
can co-exist on the system. Those usually have the same SDTs and the same
basename, just different build-ids.

These issues are not solved by using "%" because they happen among SDTs.
So we need to find another way to distinguish the SDTs.
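
To make clashes A) and B) concrete, here is a small Python sketch. The cache
layout, the paths and the build-ids are all made up for illustration and do
not reflect the real buildid-cache format; `find_clashes` is a hypothetical
helper that reports every event name owned by more than one cached binary.

```python
from collections import defaultdict

# Hypothetical cache contents: (binary path, build-id) -> SDT event names.
cache = {
    ("/usr/lib/libfoo.so", "aaaa1111"): ["libfoo:bar"],
    # Clash A: an unrelated binary reuses the "libfoo" provider name.
    ("/opt/app/bin/app", "bbbb2222"): ["libfoo:bar", "app:start"],
    # Clash B: two versions with the same basename but different build-ids.
    ("/usr/lib/libc-2.17.so", "cccc3333"): ["libc:setjmp"],
    ("/opt/old/lib/libc-2.17.so", "dddd4444"): ["libc:setjmp"],
}


def find_clashes(cache):
    """Map each event name to its owners, keeping only ambiguous names."""
    owners = defaultdict(list)
    for (path, build_id), events in cache.items():
        for event in events:
            owners[event].append((path, build_id))
    return {event: who for event, who in owners.items() if len(who) > 1}


for event, who in sorted(find_clashes(cache).items()):
    print(event, "->", who)
```

In both cases the "provider:event" name alone is ambiguous, with or without
a "%" prefix; only the path or the build-id tells the owners apart.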

Thank you,

> 
> Please correct me if I missed something.
> 
>> In this series I've introduced the %event syntax only to recall cached event
>> settings explicitly, because perf-probe is a lower-layer tool to set up
>> new events. IMO, perf-list and perf-record should be higher-level tools which
>> handle abstract events.
>>
>> Thanks!
>>
>>
> 


-- 
Masami HIRAMATSU
Linux Technology Research Center, System Productivity Research Dept.
Center for Technology Innovation - Systems Engineering
Hitachi, Ltd., Research & Development Group
E-mail: masami.hiramatsu.pt@...achi.com