Message-ID: <CAEf4BzZ9zwA=SrLTx9JT50OeM6fVPg0Py0Gx+K9ah2we8YtCRA@mail.gmail.com>
Date:   Fri, 23 Oct 2020 12:46:15 -0700
From:   Andrii Nakryiko <andrii.nakryiko@...il.com>
To:     Jiri Olsa <jolsa@...nel.org>
Cc:     Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Andrii Nakryiko <andriin@...com>,
        Networking <netdev@...r.kernel.org>, bpf <bpf@...r.kernel.org>,
        Martin KaFai Lau <kafai@...com>,
        Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
        John Fastabend <john.fastabend@...il.com>,
        KP Singh <kpsingh@...omium.org>, Daniel Xu <dxu@...uu.xyz>,
        Steven Rostedt <rostedt@...dmis.org>,
        Jesper Brouer <jbrouer@...hat.com>,
        Toke Høiland-Jørgensen <toke@...hat.com>,
        Viktor Malik <vmalik@...hat.com>
Subject: Re: [RFC bpf-next 08/16] bpf: Use delayed link free in bpf_link_put

On Thu, Oct 22, 2020 at 8:01 AM Jiri Olsa <jolsa@...nel.org> wrote:
>
> Moving bpf_link_free call into delayed processing so we don't
> need to wait for it when releasing the link.
>
> For example bpf_tracing_link_release could take considerable
> amount of time in bpf_trampoline_put function due to
> synchronize_rcu_tasks call.
>
> It speeds up bpftrace release time in following example:
>
> Before:
>
>  Performance counter stats for './src/bpftrace -ve kfunc:__x64_sys_s*
>     { printf("test\n"); } i:ms:10 { printf("exit\n"); exit();}' (5 runs):
>
>      3,290,457,628      cycles:k                                 ( +-  0.27% )
>        933,581,973      cycles:u                                 ( +-  0.20% )
>
>              50.25 +- 4.79 seconds time elapsed  ( +-  9.53% )
>
> After:
>
>  Performance counter stats for './src/bpftrace -ve kfunc:__x64_sys_s*
>     { printf("test\n"); } i:ms:10 { printf("exit\n"); exit();}' (5 runs):
>
>      2,535,458,767      cycles:k                                 ( +-  0.55% )
>        940,046,382      cycles:u                                 ( +-  0.27% )
>
>              33.60 +- 3.27 seconds time elapsed  ( +-  9.73% )
>
> Signed-off-by: Jiri Olsa <jolsa@...nel.org>
> ---
>  kernel/bpf/syscall.c | 8 ++------
>  1 file changed, 2 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index 1110ecd7d1f3..61ef29f9177d 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -2346,12 +2346,8 @@ void bpf_link_put(struct bpf_link *link)
>         if (!atomic64_dec_and_test(&link->refcnt))
>                 return;
>
> -       if (in_atomic()) {
> -               INIT_WORK(&link->work, bpf_link_put_deferred);
> -               schedule_work(&link->work);
> -       } else {
> -               bpf_link_free(link);
> -       }
> +       INIT_WORK(&link->work, bpf_link_put_deferred);
> +       schedule_work(&link->work);

We recently reverted this exact change. Deferring the free
unconditionally makes it non-deterministic, from the user-space point
of view, when the BPF program is **actually** detached. That makes
user-space programming much more complicated and unpredictable, so
please don't do this. Let's find some other way to speed this up.
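
To illustrate the concern, here is a minimal userspace model of the two
behaviours. It is only a sketch and not kernel code: every `model_*`
name is made up for this example, the "workqueue" is a plain array
drained by hand, and `attached` merely stands in for "the BPF program
is still attached".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: none of these names are kernel APIs. */
struct model_link {
	int refcnt;
	bool attached;	/* stands in for "BPF program is still attached" */
};

#define MODEL_MAX_PENDING 8
static struct model_link *model_pending[MODEL_MAX_PENDING];
static size_t model_npending;

/* Models the actual teardown: after this the program is detached. */
static void model_link_free(struct model_link *link)
{
	link->attached = false;
}

/* Synchronous put: the last reference detaches before returning. */
static void model_link_put_sync(struct model_link *link)
{
	if (--link->refcnt != 0)
		return;
	model_link_free(link);
}

/* Always-deferred put: the last reference only queues the teardown. */
static void model_link_put_deferred(struct model_link *link)
{
	if (--link->refcnt != 0)
		return;
	assert(model_npending < MODEL_MAX_PENDING);
	model_pending[model_npending++] = link;
}

/* Models the worker running at some later, unspecified time. */
static void model_run_pending(void)
{
	for (size_t i = 0; i < model_npending; i++)
		model_link_free(model_pending[i]);
	model_npending = 0;
}
```

In this model, a caller that drops the last reference with
model_link_put_sync() sees `attached == false` as soon as the call
returns. With model_link_put_deferred(), `attached` stays true until
model_run_pending() happens to run, and the caller has no way to know
when that is.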

>  }
>
>  static int bpf_link_release(struct inode *inode, struct file *filp)
> --
> 2.26.2
>