Message-ID: <CAEf4Bza6-5QzArHgq9Uh24mR1C+ARDnnfw78q4CSm1=Rb3qOOQ@mail.gmail.com>
Date: Mon, 2 Mar 2020 15:37:32 -0800
From: Andrii Nakryiko <andrii.nakryiko@...il.com>
To: Toke Høiland-Jørgensen <toke@...hat.com>
Cc: Andrii Nakryiko <andriin@...com>, bpf <bpf@...r.kernel.org>,
Networking <netdev@...r.kernel.org>,
Alexei Starovoitov <ast@...com>,
Daniel Borkmann <daniel@...earbox.net>,
Kernel Team <kernel-team@...com>
Subject: Re: [PATCH bpf-next 1/3] bpf: introduce pinnable bpf_link abstraction
On Mon, Mar 2, 2020 at 1:40 PM Toke Høiland-Jørgensen <toke@...hat.com> wrote:
>
> Andrii Nakryiko <andrii.nakryiko@...il.com> writes:
>
> > On Mon, Mar 2, 2020 at 2:13 AM Toke Høiland-Jørgensen <toke@...hat.com> wrote:
> >>
> >> Andrii Nakryiko <andriin@...com> writes:
> >>
> >> > Introduce the bpf_link abstraction, representing an attachment of a BPF
> >> > program to a BPF hook point (e.g., tracepoint, perf event, etc.). bpf_link
> >> > encapsulates ownership of the attached BPF program and reference counting
> >> > of the link itself when it is referenced from multiple anonymous inodes,
> >> > and it ensures that the release callback is called from process context,
> >> > so that users can safely take mutex locks and sleep.
> >> >
> >> > Additionally, with the new abstraction it's now possible to generalize
> >> > pinning of a link object in BPF FS, allowing one to explicitly prevent BPF
> >> > program detachment on process exit by pinning the link in BPF FS and
> >> > letting another, independent process open it and keep working with it.
> >> >
> >> > Convert two existing bpf_link-like objects (raw tracepoint and tracing BPF
> >> > program attachments) to the bpf_link framework, making them pinnable in
> >> > BPF FS. More FD-based bpf_links will be added in follow-up patches.
> >> >
> >> > Signed-off-by: Andrii Nakryiko <andriin@...com>
> >> > ---
> >> > include/linux/bpf.h | 13 +++
> >> > kernel/bpf/inode.c | 42 ++++++++-
> >> > kernel/bpf/syscall.c | 209 ++++++++++++++++++++++++++++++++++++-------
> >> > 3 files changed, 226 insertions(+), 38 deletions(-)
> >> >
[...]
> >> > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> >> > index c536c65256ad..fca8de7e7872 100644
> >> > --- a/kernel/bpf/syscall.c
> >> > +++ b/kernel/bpf/syscall.c
> >> > @@ -2173,23 +2173,153 @@ static int bpf_obj_get(const union bpf_attr *attr)
> >> > attr->file_flags);
> >> > }
> >> >
> >> > -static int bpf_tracing_prog_release(struct inode *inode, struct file *filp)
> >> > +struct bpf_link {
> >> > + atomic64_t refcnt;
> >>
> >> refcount_t ?
> >
> > Both bpf_map and bpf_prog stick to atomic64 for their refcounting, so
> > I'd like to stay consistent and use a refcount that can't possibly leak
> > resources (which refcount_t can, if it's overflown).
>
> refcount_t is specifically supposed to turn a possible use-after-free on
> under/overflow into a warning, isn't it? Not going to insist or anything
> here, just found it odd that you'd prefer the other...
Well, underflow is a huge bug that should never happen in well-tested
code (at least that's the assumption for bpf_map and bpf_prog), and we
are generally very careful about that. Overflow can happen only because
refcount_t uses a 32-bit integer, which atomic64_t side-steps
completely by going to a 64-bit integer. So yeah, I'd rather stick to
the same approach that's used for bpf_map and bpf_prog.
>
> -Toke
>