Message-ID: <CAEf4BzYBS8+9XADjJdvKB=_6tf8_t19UVGfm2Lk1+Nb6qWk5cw@mail.gmail.com>
Date: Wed, 6 Oct 2021 15:37:16 -0700
From: Andrii Nakryiko <andrii.nakryiko@...il.com>
To: Jiri Olsa <jolsa@...hat.com>
Cc: Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Andrii Nakryiko <andriin@...com>,
"Steven Rostedt (VMware)" <rostedt@...dmis.org>,
Networking <netdev@...r.kernel.org>, bpf <bpf@...r.kernel.org>,
Martin KaFai Lau <kafai@...com>,
Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
John Fastabend <john.fastabend@...il.com>,
KP Singh <kpsingh@...omium.org>, Daniel Xu <dxu@...uu.xyz>,
Viktor Malik <vmalik@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>
Subject: Re: [RFC] store function address in BTF
On Wed, Oct 6, 2021 at 3:05 PM Jiri Olsa <jolsa@...hat.com> wrote:
>
> On Wed, Oct 06, 2021 at 02:22:28PM -0700, Andrii Nakryiko wrote:
> > On Wed, Oct 6, 2021 at 1:06 PM Jiri Olsa <jolsa@...hat.com> wrote:
> > >
> > > On Wed, Oct 06, 2021 at 09:17:39AM -0700, Andrii Nakryiko wrote:
> > > > On Wed, Oct 6, 2021 at 1:42 AM Jiri Olsa <jolsa@...hat.com> wrote:
> > > > >
> > > > > hi,
> > > > > I'm hitting performance issues and soft lockups with the new version
> > > > > of the patchset, and the reason seems to be the kallsyms lookup that
> > > > > we need to do for each btf id we want to attach
> > > > >
> > > > > I tried to change the kallsyms_lookup_name linear search into an
> > > > > rbtree search, but it has its own pitfalls, like duplicate function
> > > > > names, and it still doesn't seem fast enough when you want to attach
> > > > > ~30k functions
> > > >
> > > > How slow is it, exactly? How long does it take?
> > >
> > > 30k functions take 75 seconds for me; it's a loop calling bpf_check_attach_target
> > >
> > > I'm getting soft lockup messages:
> > >
> > > krava33 login: [ 168.896671] watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [bpftrace:1087]
> > >
> >
> > That's without the RB tree, right? I was curious about the case where
> > you converted kallsyms to an RB tree and it was still slow. I can't
> > imagine 30k queries against an RB tree with ~160k kallsyms taking 75
> > seconds.
>
> yep, that's the standard kallsyms lookup API
>
> I need to make some adjustments to the rbtree kallsyms code; I think I
> found a bug in there, so the numbers are probably better, as you suggest
ok, cool, let's see what the new numbers are then
>
> >
> > But as I said, why not map BTF IDs to function names, sort the
> > function names, and then do a single pass over kallsyms, binary
> > searching the sorted array of requested function names and recording
> > the addr for each match. Then check that you found addresses for all
> > functions (this still leaves the question of what to do when multiple
> > functions match, but that's a problem with any approach). If
> > everything checks out, you have a nice btf id -> func name -> func
> > addr mapping. It's O(N log(M)), which shouldn't be slow. Definitely
> > not multiple-seconds slow.
>
> ok, now that's clear to me, thanks for these details
great
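
to make it concrete, a rough kernel-side sketch (not tested; addr_map,
symbol_cmp, and record_addr are made-up names here, and the requested
names are assumed to be collected from the BTF IDs up front):

  #include <linux/bsearch.h>
  #include <linux/kallsyms.h>
  #include <linux/sort.h>
  #include <linux/string.h>
  #include <linux/types.h>

  struct addr_map {
          const char **syms;      /* requested func names, sorted */
          unsigned long *addrs;   /* resolved addrs, same order */
          u32 cnt;
  };

  static int symbol_cmp(const void *a, const void *b)
  {
          return strcmp(*(const char **)a, *(const char **)b);
  }

  /* invoked once per kallsyms entry */
  static int record_addr(void *data, const char *name,
                         struct module *mod, unsigned long addr)
  {
          struct addr_map *map = data;
          const char **sym;

          /* binary search into the M sorted requested names */
          sym = bsearch(&name, map->syms, map->cnt,
                        sizeof(*map->syms), symbol_cmp);
          if (sym)
                  map->addrs[sym - map->syms] = addr;
          return 0;
  }

  /* sort the names once, then one O(N log(M)) pass over kallsyms */
  sort(map.syms, map.cnt, sizeof(*map.syms), symbol_cmp, NULL);
  kallsyms_on_each_symbol(record_addr, &map);

then you just verify that every addrs[i] got filled in (and decide what
to do about duplicate matches there).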
>
> >
> >
> > >
> > > >
> > > > >
> > > > > so I wonder if we could 'fix' this by storing the function address
> > > > > in BTF, which would cut out the kallsyms lookup completely, because
> > > > > it'd be done at compile time
> > > > >
> > > > > my first thought was to add an extra BTF section for that; after
> > > > > discussing with Arnaldo, perhaps we could store an extra 8 bytes
> > > > > after the BTF_KIND_FUNC record, using one of the 'unused' bits in
> > > > > btf_type to indicate that? or a new BTF_KIND_FUNC2 type?
> > > > >
> > > > > thoughts?
> > > >
> > > > I'm strongly against this, because (besides the BTF bloat) we need
> > > > similar mass-attachment functionality for kprobe/kretprobe, and that
> > > > one won't rely on BTF FUNCs, so I think it's better to stick to a
> > > > single mechanism for figuring out the address of the function.
> > >
> > > ok
> > >
> > > >
> > > > If an RB tree is not feasible, we can do a linear pass over the
> > > > unsorted kallsyms and a binary search over the sorted function names
> > > > (derived from BTF IDs). That would be O(N log(M)), where N is the
> > > > number of ksyms and M is the number of BTF
> > > > IDs/functions-to-be-attached-to. If we did have an RB tree for
> > > > kallsyms (is it hard to support duplicates? why?), it could be even
> > > > faster: O(M log(N)).
> > >
> > > I had issues with a generic kallsyms rbtree in a post some time ago,
> > > I'll revisit it to check the details.. but having the tree hold just
> > > the btf id functions might clear that up.. I'll check
> >
> > That's not what I'm proposing. See above. Please let me know if
> > something is not clear before going all in on an RB tree
> > implementation :)
> >
> >
> > But while we are on topic, do you think (with the ftrace changes you
> > are doing) it would be hard to support multi-attach for
> > kprobes/kretprobes? We now have a bpf_link interface for attaching
> > kprobes, so the API can be aligned pretty closely with fentry/fexit,
> > except instead of btf IDs we'd need to pass an array of pointers to C
> > strings, I suppose.
>
> hum, I think a kprobe/kretprobe is made out of a perf event (the
> kprobe/kretprobe PMUs); you then pass the event fd and program fd to the
> bpf link syscall, and it attaches the bpf program to that perf event
>
> so perhaps the user interface would be an array of perf event fds plus
> the prog fd
>
> also, I think you can have just one probe per function, so we will not
> need to share kprobes between multiple users like we do for trampolines,
> so the attach logic will be simple
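
right, and per function the current flow looks roughly like this (a
userspace sketch, assuming a recent kernel with BPF_PERF_EVENT links;
the kprobe PMU type is read from
/sys/bus/event_source/devices/kprobe/type, error handling omitted):

  #include <linux/bpf.h>
  #include <linux/perf_event.h>
  #include <string.h>
  #include <sys/syscall.h>
  #include <unistd.h>

  int attach_one_kprobe(int prog_fd, const char *func, int kprobe_pmu_type)
  {
          struct perf_event_attr pattr;
          union bpf_attr battr;
          int perf_fd;

          /* one perf event (and thus one FD) per traced function */
          memset(&pattr, 0, sizeof(pattr));
          pattr.size = sizeof(pattr);
          pattr.type = kprobe_pmu_type;
          pattr.kprobe_func = (__u64)(unsigned long)func;
          pattr.probe_offset = 0;
          perf_fd = syscall(SYS_perf_event_open, &pattr, -1, 0, -1, 0);

          /* then a bpf link attaching the prog to that perf event */
          memset(&battr, 0, sizeof(battr));
          battr.link_create.prog_fd = prog_fd;
          battr.link_create.target_fd = perf_fd;
          battr.link_create.attach_type = BPF_PERF_EVENT;
          return syscall(SYS_bpf, BPF_LINK_CREATE, &battr, sizeof(battr));
  }

but that's one event FD plus one link FD per function, so: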
creating thousands of perf_event FDs seems expensive (and you'll run
into the open-file limit pretty quickly on a lot of systems). So I
think for multi-attach we'll need a separate way to specify the kernel
function names (and maybe also offsets). I'm just saying that, with the
BPF_LINK_CREATE command, it's (probably) easier to extend it for kprobe
multi-attach than to retrofit this into perf_event_open.
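
purely hypothetically, the extension could look something like this
(none of these fields exist today; all names are made up just to
illustrate the shape):

  /* sketch of a possible link_create extension in union bpf_attr */
  struct {
          __u32           prog_fd;
          __u32           target_fd;
          __u32           attach_type;    /* a new kprobe-multi type */
          __u32           flags;
          struct {
                  __u32           cnt;    /* number of kprobes */
                  __aligned_u64   syms;   /* array of func name ptrs */
                  __aligned_u64   addrs;  /* and/or resolved addrs */
                  __aligned_u64   offs;   /* optional per-func offsets */
          } kprobe_multi;
  } link_create;

that way a single link FD would cover all N functions instead of N perf
event FDs.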
>
> jirka
>
> >
> > >
> > > thanks,
> > > jirka
> > >
> > > >
> > > >
> > > > >
> > > > > thanks,
> > > > > jirka
> > > > >
> > > >
> > >
> >
>