linux-kernel - Re: [PATCH] perf record: Allow passing perf's own pid to '--filter'

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <03223997-411B-4C5E-B03D-080F40EE76D6@163.com>
Date:	Mon, 6 Jul 2015 23:00:10 +0800
From:	pi3orama <pi3orama@....com>
To:	Arnaldo Carvalho de Melo <acme@...nel.org>
Cc:	Wang Nan <wangnan0@...wei.com>,
	"a.p.zijlstra@...llo.nl" <a.p.zijlstra@...llo.nl>,
	"mingo@...hat.com" <mingo@...hat.com>,
	"jolsa@...nel.org" <jolsa@...nel.org>,
	"peterz@...radead.org" <peterz@...radead.org>,
	"namhyung@...nel.org" <namhyung@...nel.org>,
	"kan.liang@...el.com" <kan.liang@...el.com>,
	"adrian.hunter@...el.com" <adrian.hunter@...el.com>,
	"ak@...ux.intel.com" <ak@...ux.intel.com>,
	"cody@...ux.vnet.ibm.com" <cody@...ux.vnet.ibm.com>,
	"jacob.w.shin@...il.com" <jacob.w.shin@...il.com>,
	"standby24x7@...il.com" <standby24x7@...il.com>,
	"lizefan@...wei.com" <lizefan@...wei.com>,
	"yunlong.song@...wei.com" <yunlong.song@...wei.com>,
	"rostedt@...dmis.org" <rostedt@...dmis.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] perf record: Allow passing perf's own pid to '--filter'



发自我的 iPhone

> 在 2015年7月6日，下午9:56，Arnaldo Carvalho de Melo <acme@...nel.org> 写道：
> 
> Em Mon, Jul 06, 2015 at 04:17:31AM +0000, Wang Nan escreveu:
>> This patch allows passing perf's own PID to '--filter' by using
>> '@...FPID'. This should be useful when system-widely capturing
>> tracepoints events.
> 
> Steven, does filters have any special meaning for @?
> 
>> Before this patch, when doing something like:
>> 
>> # perf record -a -e syscalls:sys_enter_write <cmd>
>> 
>> One could easily get result like this:
>> 
>> # /tmp/perf report --stdio
>> ...
>> # Overhead  Command  Shared Object       Symbol
>> # ........  .......  ..................  ....................
>> #
>>     99.99%  perf     libpthread-2.18.so  [.] __write_nocancel
>>     0.01%   ls       libc-2.18.so        [.] write
>>     0.01%   sshd     libc-2.18.so        [.] write
>>     ...
>> 
>> Where most events are generated by perf itself.
>> 
>> A shell trick can be done to filter perf itself out:
>> 
>> # cat << EOF > ./tmp
>>> #!/bin/sh
>>> exec perf record -e ... --filter="common_pid != \$\$" -a sleep 10
>>> EOF
>> # chmod a+x ./tmp
>> # ./tmp
>> 
>> However, doing so is user unfriendly.
>> 
>> This patch introduces '@...FPID' placeholder to '--filter' options. Now
>> user is allowed to the above work with:
>> 
>>  # perf record -e ... --filter="common_pid != @PERFPID' sleep 10
>> 
>> Signed-off-by: Wang Nan <wangnan0@...wei.com>
>> ---
>> tools/perf/Documentation/perf-record.txt |  1 +
> 
> <SNIP>
> 
>> +++ b/tools/perf/util/parse-events.c
>> @@ -1175,6 +1175,101 @@ int parse_events_option(const struct option *opt, const char *str,
>>    return ret;
>> }
>> 
>> +#ifndef PAGE_SIZE
>> +# define PAGE_SIZE 4096
>> +#endif
> 
> You can use 'page_size', its available and filled via:
> 
>   page_size = sysconf(_SC_PAGE_SIZE);
> 
> early in perf's main() routine.
> 
>> +static int
>> +postproc_filter_append_token(const char *key, char *new_filter,
>> +                 ssize_t *pspace)
>> +{
>> +    if (strcmp(key, "PERFPID") == 0) {
>> +        char pid_buf[32];
>> +        pid_t self_pid = getpid();
>> +
>> +        snprintf(pid_buf, 32, "%d", self_pid);
> 
>        snprintf(pid_buf, sizeof(pid_buf), "%d", self_pid);
> 
>> +        strncat(new_filter, pid_buf, *pspace);
>> +        *pspace -= strlen(pid_buf);
>> +        if (*pspace < 0)
>> +            return -1;
>> +        return 0;
>> +    }
>> +
>> +    return -1;
> 
> but then, please take a look at my perf/core branch, by coincidence I
> worked on having multiple filters set on a evsel in 'perf trace', where
> at a minimum, the tools' pid is added to the filter for all tracepoints
> used, i.e.:
> 
>     commom_pid != getpid()
> 
> is always present, to avoid a feedback loop, neverending tracing of
> syscalls generated by the tracer itself.
> 
> Then, if you use --filter-pids PID-1,PID-2,PID3, it will create an
> expression with that first part, for things like gnome-terminal, xorg,
> etc.
> 
> Now we need to keep that in place and if the user uses -e to specify
> which syscalls it wants (or wants filtered out), we need to again
> concatenate with that commom_pid list, so that we call the filter ioctl
> just once, else the kernel returns EEXIST.
> 
> Because I needed to append, etc, there are new functions there for
> go on creating such expressions, please use them, its all in my
> perf/core branch, the latest patches.
> 
> I.e. having something in the filter expression that gets transformed
> into the tools' pid, I have no problem with that, just curious about
> what would be the best character to signal that a substitution needs to
> be performed, if it is really '@...', as my first selection would be
> '$VAR',

$ has special meaning for shell. Using $ in cmdline require users use escaping or '' quoted string. Therefore I believe @ should be better. What do you think?

> but then I haven't looked deeply at ftrace's filter stuff to see
> if it has provision for substitution in the kernel, etc.
> 
> Andi also did this at some point, forgot why that wasn't applied at the
> time :-\
> 
> For reference:
> 
> https://git.kernel.org/cgit/linux/kernel/git/acme/linux.git/commit/?h=perf/core&id=f47805a2af3ba83881ca52434bbbc6e9886b72fd
> 

OK. I'll rebase this patch on it.

Thank you.

> - Arnaldo
> 
>> +}
>> +
>> +static void postproc_filter(struct perf_evsel *evsel)
>> +{
>> +    char *at = NULL, *sep = NULL, *old_filter, *new_filter;
>> +    ssize_t space;
>> +
>> +    if (!evsel->filter)
>> +        return;
>> +
>> +    old_filter = evsel->filter;
>> +    at = strchr(old_filter, '@');
>> +    if (!at)
>> +        return;
>> +
>> +    /*
>> +     * See perf_event_set_filter(). Length of a valid filter is
>> +     * limited by PAGE_SIZE.
>> +     */
>> +    new_filter = malloc(PAGE_SIZE);
>> +    if (!new_filter) {
>> +        fprintf(stderr, "No enough memory for post proc filter '%s'\n",
>> +            old_filter);
>> +        return;
>> +    }
>> +    *new_filter = '\0';
>> +    space = PAGE_SIZE - 1;
>> +
>> +    while (1) {
>> +        if (at)
>> +            *at = '\0';
>> +        strncat(new_filter, old_filter, space);
>> +        space -= strlen(old_filter);
>> +        if (space < 0)
>> +            goto errout;
>> +        if (!at)
>> +            break;
>> +        *at = '@';
>> +
>> +        sep = strchr(at + 1, ' ');
>> +        if (sep)
>> +            *sep = '\0';
>> +
>> +        if (postproc_filter_append_token(at + 1, new_filter, &space))
>> +            goto errout;
>> +
>> +        if (!sep)
>> +            break;
>> +        *sep = ' ';
>> +
>> +        old_filter = sep;
>> +        at = strchr(old_filter, '@');
>> +    }
>> +
>> +    free(evsel->filter);
>> +    /*
>> +     * It is safe to use new_filter directly. However, try to
>> +     * release some memory by strdup() a smaller string and free
>> +     * new_filter, which takes a full page.
>> +     */
>> +    evsel->filter = strdup(new_filter);
>> +    if (!evsel->filter)
>> +        evsel->filter = new_filter;
>> +    else
>> +        free(new_filter);
>> +    return;
>> +errout:
>> +    if (at)
>> +        *at = '@';
>> +    if (sep)
>> +        *sep = ' ';
>> +    fprintf(stderr, "Can't post proc filter '%s'\n", evsel->filter);
>> +    free(new_filter);
>> +}
>> +
>> int parse_filter(const struct option *opt, const char *str,
>>         int unset __maybe_unused)
>> {
>> @@ -1196,6 +1291,7 @@ int parse_filter(const struct option *opt, const char *str,
>>        return -1;
>>    }
>> 
>> +    postproc_filter(last);
>>    return 0;
>> }
>> 
>> -- 
>> 1.8.3.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/