[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <a4b5045b-ebbd-a5f4-b96c-0352a5e8bb23@fb.com>
Date: Fri, 18 Oct 2019 22:09:30 +0000
From: Yonghong Song <yhs@...com>
To: John Fastabend <john.fastabend@...il.com>,
Daniel Borkmann <daniel@...earbox.net>
CC: Andrii Nakryiko <andriin@...com>,
"ast@...nel.org" <ast@...nel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"bpf@...r.kernel.org" <bpf@...r.kernel.org>
Subject: Re: [bpf-next PATCH] bpf: libbpf, support older style kprobe load
On 10/18/19 3:01 PM, John Fastabend wrote:
> Daniel Borkmann wrote:
>> On Fri, Oct 18, 2019 at 07:54:26AM -0700, John Fastabend wrote:
>>> Following ./Documentation/trace/kprobetrace.rst add support for loading
>>> kprobes programs on older kernels.
>>>
>>> Signed-off-by: John Fastabend <john.fastabend@...il.com>
>>> ---
>>> tools/lib/bpf/libbpf.c | 81 +++++++++++++++++++++++++++++++++++++++++++-----
>>> 1 file changed, 73 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
>>> index fcea6988f962..12b3105d112c 100644
>>> --- a/tools/lib/bpf/libbpf.c
>>> +++ b/tools/lib/bpf/libbpf.c
>>> @@ -5005,20 +5005,89 @@ static int determine_uprobe_retprobe_bit(void)
>>> return parse_uint_from_file(file, "config:%d\n");
>>> }
>>>
>>> +static int use_kprobe_debugfs(const char *name,
>>> + uint64_t offset, int pid, bool retprobe)
>>
>> offset & pid unused?
>
> Well pid should be dropped and I'll add support for offset. I've not
> being using offset so missed it here.
>
>>
>>> +{
>>> + const char *file = "/sys/kernel/debug/tracing/kprobe_events";
>>> + int fd = open(file, O_WRONLY | O_APPEND, 0);
>>> + char buf[PATH_MAX];
>>> + int err;
>>> +
>>> + if (fd < 0) {
>>> + pr_warning("failed open kprobe_events: %s\n",
>>> + strerror(errno));
>>> + return -errno;
>>> + }
>>> +
>>> + snprintf(buf, sizeof(buf), "%c:kprobes/%s %s",
>>> + retprobe ? 'r' : 'p', name, name);
>>> +
>>> + err = write(fd, buf, strlen(buf));
>>> + close(fd);
>>> + if (err < 0)
>>> + return -errno;
>>> + return 0;
>>> +}
>>> +
>>> static int perf_event_open_probe(bool uprobe, bool retprobe, const char *name,
>>> uint64_t offset, int pid)
>>> {
>>> struct perf_event_attr attr = {};
>>> char errmsg[STRERR_BUFSIZE];
>>> + uint64_t config1 = 0;
>>> int type, pfd, err;
>>>
>>> type = uprobe ? determine_uprobe_perf_type()
>>> : determine_kprobe_perf_type();
>>> if (type < 0) {
>>> - pr_warning("failed to determine %s perf type: %s\n",
>>> - uprobe ? "uprobe" : "kprobe",
>>> - libbpf_strerror_r(type, errmsg, sizeof(errmsg)));
>>> - return type;
>>> + if (uprobe) {
>>> + pr_warning("failed to determine uprobe perf type %s: %s\n",
>>> + name,
>>> + libbpf_strerror_r(type,
>>> + errmsg, sizeof(errmsg)));
>>> + } else {
>>> + /* If we do not have an event_source/../kprobes then we
>>> + * can try to use kprobe-base event tracing, for details
>>> + * see ./Documentation/trace/kprobetrace.rst
>>> + */
>>> + const char *file = "/sys/kernel/debug/tracing/events/kprobes/";
>>> + char c[PATH_MAX];
>>> + int fd, n;
>>> +
>>> + snprintf(c, sizeof(c), "%s/%s/id", file, name);
>>> +
>>> + err = use_kprobe_debugfs(name, offset, pid, retprobe);
>>> + if (err)
>>> + return err;
>>
>> Should we throw a pr_warning() here as well when bailing out?
>
> Sure makes sense.
>
>>
>>> + type = PERF_TYPE_TRACEPOINT;
>>> + fd = open(c, O_RDONLY, 0);
>>> + if (fd < 0) {
>>> + pr_warning("failed to open tracepoint %s: %s\n",
>>> + c, strerror(errno));
>>> + return -errno;
>>> + }
>>> + n = read(fd, c, sizeof(c));
>>> + close(fd);
>>> + if (n < 0) {
>>> + pr_warning("failed to read %s: %s\n",
>>> + c, strerror(errno));
>>> + return -errno;
>>> + }
>>> + c[n] = '\0';
>>> + config1 = strtol(c, NULL, 0);
>>> + attr.size = sizeof(attr);
>>> + attr.type = type;
>>> + attr.config = config1;
>>> + attr.sample_period = 1;
>>> + attr.wakeup_events = 1;
>>
>> Is there a reason you set latter two whereas below they are not set,
>> does it not default to these?
>
> We can drop this.
Agreed. attr.sample_period and attr.wakeup_events are used for perf.
Here we run bpf programs and return early, so these two values won't
take effect.
>
>>
>>> + }
>>> + } else {
>>> + config1 = ptr_to_u64(name);
>>> + attr.size = sizeof(attr);
>>> + attr.type = type;
>>> + attr.config1 = config1; /* kprobe_func or uprobe_path */
>>> + attr.config2 = offset; /* kprobe_addr or probe_offset */
>>> }
>>> if (retprobe) {
>>> int bit = uprobe ? determine_uprobe_retprobe_bit()
>>> @@ -5033,10 +5102,6 @@ static int perf_event_open_probe(bool uprobe, bool retprobe, const char *name,
>>> }
>>> attr.config |= 1 << bit;
>>> }
>>
>> What happens in case of retprobe, don't you (unwantedly) bail out here
>> again (even through you've set up the retprobe earlier)?
>
> Will fix as well. Looking to see how it passed on my box here but either
> way its cryptic so will clean up.
>
>>
>>> - attr.size = sizeof(attr);
>>> - attr.type = type;
>>> - attr.config1 = ptr_to_u64(name); /* kprobe_func or uprobe_path */
>>> - attr.config2 = offset; /* kprobe_addr or probe_offset */
>>>
>>> /* pid filter is meaningful only for uprobes */
>>> pfd = syscall(__NR_perf_event_open, &attr,
>>>
>
>
Powered by blists - more mailing lists