linux-kernel - Re: [PATCH 05/27] perf record, bpf: Parse and create probe points for BPF programs

Date:	Thu, 17 Sep 2015 10:04:33 +0800
From:	"Wangnan (F)" <wangnan0@...wei.com>
To:	Arnaldo Carvalho de Melo <acme@...hat.com>
CC:	<ast@...mgrid.com>, <masami.hiramatsu.pt@...achi.com>,
	<namhyung@...nel.org>, <a.p.zijlstra@...llo.nl>,
	<brendan.d.gregg@...il.com>, <daniel@...earbox.net>,
	<dsahern@...il.com>, <hekuang@...wei.com>, <jolsa@...nel.org>,
	<lizefan@...wei.com>, <paulus@...ba.org>, <xiakaixu@...wei.com>,
	<pi3orama@....com>, <linux-kernel@...r.kernel.org>,
	<acme@...nel.org>
Subject: Re: [PATCH 05/27] perf record, bpf: Parse and create probe points
 for BPF programs



On 2015/9/17 5:43, Arnaldo Carvalho de Melo wrote:
> On Sun, Sep 06, 2015 at 07:13:21AM +0000, Wang Nan wrote:
>> This patch introduces bpf__{un,}probe() functions to enable callers to
>> create kprobe points based on section names of BPF programs. It parses
> Ok, so now I see that when we do:
>
> 	perf record --event bpf_prog.o usleep 1
>
> We are potentially inserting multiple events, one for each eBPF section
> found in bpf_prog.o, right?

Yes.
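
To make the "one event per section" point concrete, here is a minimal
sketch of how one section title could be turned into one probe
definition. It assumes the title has the form "[GROUP:]EVENT=TARGET";
the real parsing is done by parse_perf_probe_command(), which accepts
much more than this. 'struct probe_def', copy_field(),
parse_section_title() and NAME_MAX_LEN are hypothetical names used only
for illustration. Only the group/event checks loosely mirror
config_bpf_program() in the patch.

```c
#include <assert.h>
#include <string.h>

#define DEFAULT_GROUP "perf_bpf_probe"	/* same value as PERF_BPF_PROBE_GROUP */
#define NAME_MAX_LEN 64			/* hypothetical field size */

/* Hypothetical, much simplified stand-in for 'struct perf_probe_event'. */
struct probe_def {
	char group[NAME_MAX_LEN];
	char event[NAME_MAX_LEN];
	char target[NAME_MAX_LEN];
};

/* Copy len bytes of src into a NAME_MAX_LEN buffer; -1 if empty or too long. */
static int copy_field(char *dst, const char *src, size_t len)
{
	if (len == 0 || len >= NAME_MAX_LEN)
		return -1;
	memcpy(dst, src, len);
	dst[len] = '\0';
	return 0;
}

/*
 * Split a title of the assumed form "[GROUP:]EVENT=TARGET".
 * Returns 0 on success and -1 if the event name or target is missing,
 * a field does not fit, or an explicit group is not the default one.
 */
int parse_section_title(const char *title, struct probe_def *def)
{
	const char *eq, *colon;

	assert(title && def);

	eq = strchr(title, '=');
	if (!eq)
		return -1;	/* event name is missing */

	/* Only look for a group separator in front of the '='. */
	colon = memchr(title, ':', (size_t)(eq - title));
	if (colon) {
		if (copy_field(def->group, title, (size_t)(colon - title)) < 0)
			return -1;
		if (strcmp(def->group, DEFAULT_GROUP) != 0)
			return -1;	/* explicit group must be the default one */
		title = colon + 1;
	} else {
		strcpy(def->group, DEFAULT_GROUP);
	}

	if (copy_field(def->event, title, (size_t)(eq - title)) < 0)
		return -1;
	return copy_field(def->target, eq + 1, strlen(eq + 1));
}
```

With this sketch, an object holding two sections titled
"write_entry=sys_write" and "read_entry=sys_read" would yield two
definitions, both in group "perf_bpf_probe", and therefore two events.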

> I.e. multiple evsels, so, when parsing, we should create the kprobes and
> from there then create evsels and insert in the normal list that would
> eventually get spliced to the evlist being processed, I think.
>
> So, and reading that comment on 4/27, we need to allow that to happen at
> parse_events() time, i.e. avoid adding a dummy evsel that would then,
> later, be "expanded" into potentially multiple non-dummy evsels.
>
> I put the first 3 patches, with some adjustments, in my perf/ebpf
> branch, will try to do the above, i.e. do away with that dummy, allow
> parsing the bpf_prog.o sections at parse_events() time.

Then I suggest a small interface modification in libbpf to return
'struct bpf_object *'. After loading, we can iterate over each program
in that object, create the kprobe points, and then make the evsels. We
need to call convert_perf_probe_events() multiple times, once for each
object. Since Namhyung updated the probing interface, I think this
becomes possible.
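
A rough sketch of the flow suggested above, reading "for each object" as
one conversion call per object. Everything here is a self-contained
model, not libbpf or perf code: 'struct fake_object', 'struct
fake_program', convert_fn, record_batch(), probe_each_object() and
MAX_EVENTS_PER_OBJECT are hypothetical names, and the callback only
stands in for the place where convert_perf_probe_events() would be
called.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_EVENTS_PER_OBJECT 128	/* hypothetical cap, in the spirit of MAX_PROBES */

/* Hypothetical stand-ins for 'struct bpf_program' and 'struct bpf_object'. */
struct fake_program {
	const char *title;		/* section name of the program */
};

struct fake_object {
	const struct fake_program *progs;
	size_t nr_progs;
};

/* Stand-in for the per-object conversion step. */
typedef int (*convert_fn)(const char **titles, size_t nr, void *ctx);

/* Example callback: only counts how often and with how many events it ran. */
struct batch_stats {
	size_t calls;
	size_t events;
};

int record_batch(const char **titles, size_t nr, void *ctx)
{
	struct batch_stats *st = ctx;

	(void)titles;
	st->calls++;
	st->events += nr;
	return 0;
}

/*
 * Walk every object, gather the titles of its programs, and hand them
 * to 'convert' in one call per object. Objects without programs are
 * skipped. Returns 0 on success or a negative error code.
 */
int probe_each_object(const struct fake_object *objs, size_t nr_objs,
		      convert_fn convert, void *ctx)
{
	size_t i, j;

	assert(convert != NULL);

	for (i = 0; i < nr_objs; i++) {
		const char *titles[MAX_EVENTS_PER_OBJECT];
		size_t nr = 0;
		int err;

		for (j = 0; j < objs[i].nr_progs; j++) {
			/* Check the bound before writing into the array. */
			if (nr >= MAX_EVENTS_PER_OBJECT)
				return -ERANGE;
			titles[nr++] = objs[i].progs[j].title;
		}

		if (!nr)
			continue;

		err = convert(titles, nr, ctx);
		if (err < 0)
			return err;
	}
	return 0;
}
```

In the real code the evsels for an object would be created right after
its conversion succeeds, so they can be added to the list at
parse_events() time instead of expanding a dummy evsel later.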

I'm glad to see you will be working on this problem. If you have other
urgent business to do, I think I can also start this work from Sept. 21
and will have 3 days before my vacation.

Thank you.

> - Arnaldo
>
>> the section names of each eBPF program and creates corresponding 'struct
>> perf_probe_event' structures. The parse_perf_probe_command() function is
>> used to do the main parsing work.
>>
>> The parsing results are stored in an array to satisfy
>> {convert,apply}_perf_probe_events(), which accept an array of
>> 'struct perf_probe_event' and do all the work in one call.
>>
>> Define PERF_BPF_PROBE_GROUP as "perf_bpf_probe", which will be used as
>> the group name of all eBPF probing points.
>>
>> probe_conf.max_probes is set to MAX_PROBES to support glob matching.
>>
>> Before bpf__probe() returns, the data in each 'struct perf_probe_event'
>> is cleaned up. Following patches will change this because they need the
>> 'struct probe_trace_event' in them.
>>
>> Signed-off-by: Wang Nan <wangnan0@...wei.com>
>> Cc: Alexei Starovoitov <ast@...mgrid.com>
>> Cc: Brendan Gregg <brendan.d.gregg@...il.com>
>> Cc: Daniel Borkmann <daniel@...earbox.net>
>> Cc: David Ahern <dsahern@...il.com>
>> Cc: He Kuang <hekuang@...wei.com>
>> Cc: Jiri Olsa <jolsa@...nel.org>
>> Cc: Kaixu Xia <xiakaixu@...wei.com>
>> Cc: Masami Hiramatsu <masami.hiramatsu.pt@...achi.com>
>> Cc: Namhyung Kim <namhyung@...nel.org>
>> Cc: Paul Mackerras <paulus@...ba.org>
>> Cc: Peter Zijlstra <a.p.zijlstra@...llo.nl>
>> Cc: Zefan Li <lizefan@...wei.com>
>> Cc: pi3orama@....com
>> Link: http://lkml.kernel.org/n/1436445342-1402-21-git-send-email-wangnan0@huawei.com
>> Link: http://lkml.kernel.org/n/1436445342-1402-23-git-send-email-wangnan0@huawei.com
>> [Merged by two patches
>>   wangnan: Utilize new perf probe API {convert,apply,cleanup}_perf_probe_events()
>> ]
>> Signed-off-by: Arnaldo Carvalho de Melo <acme@...hat.com>
>> ---
>>   tools/perf/builtin-record.c  |  19 +++++-
>>   tools/perf/util/bpf-loader.c | 149 +++++++++++++++++++++++++++++++++++++++++++
>>   tools/perf/util/bpf-loader.h |  13 ++++
>>   3 files changed, 180 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
>> index f886706..b56109f 100644
>> --- a/tools/perf/builtin-record.c
>> +++ b/tools/perf/builtin-record.c
>> @@ -1141,7 +1141,23 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
>>   	if (err)
>>   		goto out_bpf_clear;
>>   
>> -	err = -ENOMEM;
>> +	/*
>> +	 * bpf__probe must be called before symbol__init() because we
>> +	 * need init_symbol_maps. If called after symbol__init,
>> +	 * symbol_conf.sort_by_name won't take effect.
>> +	 *
>> +	 * bpf__unprobe() is safe even if bpf__probe() failed, and it
>> +	 * also calls symbol__init. Therefore, goto out_symbol_exit
>> +	 * is safe when probe failed.
>> +	 */
>> +	err = bpf__probe();
>> +	if (err) {
>> +		bpf__strerror_probe(err, errbuf, sizeof(errbuf));
>> +
>> +		pr_err("Probing at events in BPF object failed.\n");
>> +		pr_err("\t%s\n", errbuf);
>> +		goto out_symbol_exit;
>> +	}
>>   
>>   	symbol__init(NULL);
>>   
>> @@ -1202,6 +1218,7 @@ out_symbol_exit:
>>   	perf_evlist__delete(rec->evlist);
>>   	symbol__exit();
>>   	auxtrace_record__free(rec->itr);
>> +	bpf__unprobe();
>>   out_bpf_clear:
>>   	bpf__clear();
>>   	return err;
>> diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
>> index 88531ea..10505cb 100644
>> --- a/tools/perf/util/bpf-loader.c
>> +++ b/tools/perf/util/bpf-loader.c
>> @@ -9,6 +9,8 @@
>>   #include "perf.h"
>>   #include "debug.h"
>>   #include "bpf-loader.h"
>> +#include "probe-event.h"
>> +#include "probe-finder.h"
>>   
>>   #define DEFINE_PRINT_FN(name, level) \
>>   static int libbpf_##name(const char *fmt, ...)	\
>> @@ -28,6 +30,58 @@ DEFINE_PRINT_FN(debug, 1)
>>   
>>   static bool libbpf_initialized;
>>   
>> +static int
>> +config_bpf_program(struct bpf_program *prog, struct perf_probe_event *pev)
>> +{
>> +	const char *config_str;
>> +	int err;
>> +
>> +	config_str = bpf_program__title(prog, false);
>> +	if (!config_str) {
>> +		pr_debug("bpf: unable to get title for program\n");
>> +		return -EINVAL;
>> +	}
>> +
>> +	pr_debug("bpf: config program '%s'\n", config_str);
>> +	err = parse_perf_probe_command(config_str, pev);
>> +	if (err < 0) {
>> +		pr_debug("bpf: '%s' is not a valid config string\n",
>> +			 config_str);
>> +		/* Parse failed, so there is no need to clear pev. */
>> +		return -EINVAL;
>> +	}
>> +
>> +	if (pev->group && strcmp(pev->group, PERF_BPF_PROBE_GROUP)) {
>> +		pr_debug("bpf: '%s': group for event is set and not '%s'.\n",
>> +			 config_str, PERF_BPF_PROBE_GROUP);
>> +		err = -EINVAL;
>> +		goto errout;
>> +	} else if (!pev->group)
>> +		pev->group = strdup(PERF_BPF_PROBE_GROUP);
>> +
>> +	if (!pev->group) {
>> +		pr_debug("bpf: strdup failed\n");
>> +		err = -ENOMEM;
>> +		goto errout;
>> +	}
>> +
>> +	if (!pev->event) {
>> +		pr_debug("bpf: '%s': event name is missing\n",
>> +			 config_str);
>> +		err = -EINVAL;
>> +		goto errout;
>> +	}
>> +
>> +	pr_debug("bpf: config '%s' is ok\n", config_str);
>> +
>> +	return 0;
>> +
>> +errout:
>> +	if (pev)
>> +		clear_perf_probe_event(pev);
>> +	return err;
>> +}
>> +
>>   int bpf__prepare_load(const char *filename)
>>   {
>>   	struct bpf_object *obj;
>> @@ -59,6 +113,90 @@ void bpf__clear(void)
>>   		bpf_object__close(obj);
>>   }
>>   
>> +static bool is_probed;
>> +
>> +int bpf__unprobe(void)
>> +{
>> +	struct strfilter *delfilter;
>> +	int ret;
>> +
>> +	if (!is_probed)
>> +		return 0;
>> +
>> +	delfilter = strfilter__new(PERF_BPF_PROBE_GROUP ":*", NULL);
>> +	if (!delfilter) {
>> +		pr_debug("Failed to create delfilter when unprobing\n");
>> +		return -ENOMEM;
>> +	}
>> +
>> +	ret = del_perf_probe_events(delfilter);
>> +	strfilter__delete(delfilter);
>> +	if (ret < 0 && is_probed)
>> +		pr_debug("Error: failed to delete events: %s\n",
>> +			 strerror(-ret));
>> +	else
>> +		is_probed = false;
>> +	return ret < 0 ? ret : 0;
>> +}
>> +
>> +int bpf__probe(void)
>> +{
>> +	int err, nr_events = 0;
>> +	struct bpf_object *obj, *tmp;
>> +	struct bpf_program *prog;
>> +	struct perf_probe_event *pevs;
>> +
>> +	pevs = calloc(MAX_PROBES, sizeof(pevs[0]));
>> +	if (!pevs)
>> +		return -ENOMEM;
>> +
>> +	bpf_object__for_each_safe(obj, tmp) {
>> +		bpf_object__for_each_program(prog, obj) {
>> +			err = config_bpf_program(prog, &pevs[nr_events++]);
>> +			if (err < 0)
>> +				goto out;
>> +
>> +			if (nr_events >= MAX_PROBES) {
>> +				pr_debug("Too many (more than %d) events\n",
>> +					 MAX_PROBES);
>> +				err = -ERANGE;
>> +				goto out;
>> +			}
>> +		}
>> +	}
>> +
>> +	if (!nr_events) {
>> +		/*
>> +		 * Don't call the following code, to prevent perf report failure.
>> +		 * init_symbol_maps can fail when perf is started by a non-root
>> +		 * user, which would prevent non-root users from tracking normal
>> +		 * events even without using BPF, because bpf__probe() is called
>> +		 * by 'perf record' unconditionally.
>> +		 */
>> +		err = 0;
>> +		goto out;
>> +	}
>> +
>> +	probe_conf.max_probes = MAX_PROBES;
>> +	/* Let convert_perf_probe_events generate probe_trace_event (tevs) */
>> +	err = convert_perf_probe_events(pevs, nr_events);
>> +	if (err < 0) {
>> +		pr_debug("bpf_probe: failed to convert perf probe events\n");
>> +		goto out;
>> +	}
>> +
>> +	err = apply_perf_probe_events(pevs, nr_events);
>> +	if (err < 0)
>> +		pr_debug("bpf probe: failed to probe events\n");
>> +	else
>> +		is_probed = true;
>> +out_cleanup:
>> +	cleanup_perf_probe_events(pevs, nr_events);
>> +out:
>> +	free(pevs);
>> +	return err < 0 ? err : 0;
>> +}
>> +
>>   #define bpf__strerror_head(err, buf, size) \
>>   	char sbuf[STRERR_BUFSIZE], *emsg;\
>>   	if (!size)\
>> @@ -90,3 +228,14 @@ int bpf__strerror_prepare_load(const char *filename, int err,
>>   	bpf__strerror_end(buf, size);
>>   	return 0;
>>   }
>> +
>> +int bpf__strerror_probe(int err, char *buf, size_t size)
>> +{
>> +	bpf__strerror_head(err, buf, size);
>> +	bpf__strerror_entry(ERANGE, "Too many (more than %d) events",
>> +			    MAX_PROBES);
>> +	bpf__strerror_entry(ENOENT, "Selected kprobe point doesn't exist.");
>> +	bpf__strerror_entry(EEXIST, "Selected kprobe point already exists, try perf probe -d '*'.");
>> +	bpf__strerror_end(buf, size);
>> +	return 0;
>> +}
>> diff --git a/tools/perf/util/bpf-loader.h b/tools/perf/util/bpf-loader.h
>> index 12be630..6b09a85 100644
>> --- a/tools/perf/util/bpf-loader.h
>> +++ b/tools/perf/util/bpf-loader.h
>> @@ -9,10 +9,15 @@
>>   #include <string.h>
>>   #include "debug.h"
>>   
>> +#define PERF_BPF_PROBE_GROUP "perf_bpf_probe"
>> +
>>   #ifdef HAVE_LIBBPF_SUPPORT
>>   int bpf__prepare_load(const char *filename);
>>   int bpf__strerror_prepare_load(const char *filename, int err,
>>   			       char *buf, size_t size);
>> +int bpf__probe(void);
>> +int bpf__unprobe(void);
>> +int bpf__strerror_probe(int err, char *buf, size_t size);
>>   
>>   void bpf__clear(void);
>>   #else
>> @@ -22,6 +27,8 @@ static inline int bpf__prepare_load(const char *filename __maybe_unused)
>>   	return -1;
>>   }
>>   
>> +static inline int bpf__probe(void) { return 0; }
>> +static inline int bpf__unprobe(void) { return 0; }
>>   static inline void bpf__clear(void) { }
>>   
>>   static inline int
>> @@ -43,5 +50,11 @@ bpf__strerror_prepare_load(const char *filename __maybe_unused,
>>   {
>>   	return __bpf_strerror(buf, size);
>>   }
>> +
>> +static inline int bpf__strerror_probe(int err __maybe_unused,
>> +				      char *buf, size_t size)
>> +{
>> +	return __bpf_strerror(buf, size);
>> +}
>>   #endif
>>   #endif
>> -- 
>> 2.1.0
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to majordomo@...r.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at  http://www.tux.org/lkml/

