[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAEf4BzaSC7gWJSX+s0eudy=AMm_KHd3V92Ps4YMWGOmj-VOG4g@mail.gmail.com>
Date: Fri, 22 Oct 2021 10:01:34 -0700
From: Andrii Nakryiko <andrii.nakryiko@...il.com>
To: Tejun Heo <tj@...nel.org>, Song Liu <songliubraving@...com>
Cc: Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Andrii Nakryiko <andrii@...nel.org>,
Martin KaFai Lau <kafai@...com>,
Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
bpf <bpf@...r.kernel.org>, Kernel Team <kernel-team@...com>,
open list <linux-kernel@...r.kernel.org>
Subject: Re: [RFC] bpf: Implement prealloc for task_local_storage
On Wed, Oct 20, 2021 at 1:16 PM Tejun Heo <tj@...nel.org> wrote:
>
> task_local_storage currently does not support pre-allocation and the memory
> is allocated on demand using the GFP_ATOMIC mask. While atomic allocations
> succeed most of the time and the occasional failures aren't a problem for
> many use-cases, there are some which can benefit from reliable allocations -
> e.g. tracking acquisitions and releases of specific resources to root cause
> long-term reference leaks.
>
> This patchset implements prealloc support for task_local_storage so that
> once a map is created, it's guaranteed that the storage area is always
> available for all tasks unless explicitly deleted.
Song, Martin, can you please take a look at this? It might be
worthwhile to consider having pre-allocated local storage for all
supported types: socket, cgroup, task. Especially for cases where BPF
app is going to touch all or almost all system entities (sockets,
cgroups, tasks, respectively).
Song, in ced47e30ab8b ("bpf: runqslower: Use task local storage") you
did some benchmarking of task-local storage vs hashmap and it was
faster in all cases but the first allocation of task-local storage. It
would be curious to see how numbers change if task-local storage is
pre-allocated, if you get a chance to benchmark it with Tejun's
changes. Thanks!
>
> The only tricky part is syncronizing against the fork path. Fortunately,
> cgroup needs to do the same thing and is already using
> cgroup_threadgroup_rwsem and we can use the same mechanism without adding
> extra overhead. This patchset generalizes the rwsem and make cgroup and bpf
> select it.
>
> This patchset is on top of bpf-next 223f903e9c83 ("bpf: Rename BTF_KIND_TAG
> to BTF_KIND_DECL_TAG") and contains the following patches:
>
> 0001-cgroup-Drop-cgroup_-prefix-from-cgroup_threadgroup_r.patch
> 0002-sched-cgroup-Generalize-threadgroup_rwsem.patch
> 0003-bpf-Implement-prealloc-for-task_local_storage.patch
>
> and also available in the following git branch:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git bpf-task-local-storage-prealloc
>
> diffstat follows. Thanks.
>
> fs/exec.c | 7 +
> include/linux/bpf.h | 6 +
> include/linux/bpf_local_storage.h | 12 +++
> include/linux/cgroup-defs.h | 33 ---------
> include/linux/cgroup.h | 11 +--
> include/linux/sched/threadgroup_rwsem.h | 46 ++++++++++++
> init/Kconfig | 4 +
> kernel/bpf/Kconfig | 1
> kernel/bpf/bpf_local_storage.c | 112 ++++++++++++++++++++++--------
> kernel/bpf/bpf_task_storage.c | 138 +++++++++++++++++++++++++++++++++++++-
> kernel/cgroup/cgroup-internal.h | 4 -
> kernel/cgroup/cgroup-v1.c | 9 +-
> kernel/cgroup/cgroup.c | 74 ++++++++++++--------
> kernel/cgroup/pids.c | 2
> kernel/fork.c | 16 ++++
> kernel/sched/core.c | 4 +
> kernel/sched/sched.h | 1
> kernel/signal.c | 7 +
> tools/testing/selftests/bpf/prog_tests/task_local_storage.c | 101 +++++++++++++++++++++++++++
> tools/testing/selftests/bpf/progs/task_ls_prealloc.c | 15 ++++
> 20 files changed, 489 insertions(+), 114 deletions(-)
>
> --
> tejun
>
Powered by blists - more mailing lists