Date:   Fri, 13 May 2022 00:16:28 -0700
From:   Yosry Ahmed <yosryahmed@...gle.com>
To:     Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Andrii Nakryiko <andrii@...nel.org>,
        Martin KaFai Lau <kafai@...com>,
        Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
        John Fastabend <john.fastabend@...il.com>,
        KP Singh <kpsingh@...nel.org>, Hao Luo <haoluo@...gle.com>,
        Tejun Heo <tj@...nel.org>, Zefan Li <lizefan.x@...edance.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Shuah Khan <shuah@...nel.org>,
        Roman Gushchin <roman.gushchin@...ux.dev>,
        Michal Hocko <mhocko@...nel.org>
Cc:     Stanislav Fomichev <sdf@...gle.com>,
        David Rientjes <rientjes@...gle.com>,
        Greg Thelen <gthelen@...gle.com>,
        Shakeel Butt <shakeelb@...gle.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Networking <netdev@...r.kernel.org>, bpf <bpf@...r.kernel.org>,
        cgroups@...r.kernel.org
Subject: Re: [RFC PATCH bpf-next 0/9] bpf: cgroup hierarchical stats collection

I have made some significant changes on the BPF side of this. I will
send an RFC v2 soon with those changes, incorporating the feedback
on the cgroup side that I got from Tejun. Please hold off on reviewing
this version.


On Mon, May 9, 2022 at 5:18 PM Yosry Ahmed <yosryahmed@...gle.com> wrote:
>
> This patch series allows for using bpf to collect hierarchical cgroup
> stats efficiently by integrating with the rstat framework. The rstat
> framework provides an efficient way to collect cgroup stats and
> propagate them through the cgroup hierarchy.
>
> The last patch is a selftest that demonstrates the entire workflow.
> The workflow consists of:
> - bpf programs that collect per-cpu per-cgroup stats (tracing progs).
> - bpf rstat flusher that contains the logic for aggregating stats
>   across cpus and across the cgroup hierarchy.
> - bpf cgroup_iter responsible for outputting the stats to userspace
>   through reading a file in bpffs.
>
> The first 3 patches include the new bpf rstat flusher program type and
> the needed support in rstat code and libbpf. The rstat flusher program
> is a callback that the rstat framework makes to bpf when a stat flush is
> ongoing, similar to the css_rstat_flush() callback that rstat makes to
> cgroup controllers. Each callback is parameterized by a (cgroup, cpu)
> pair that has been updated. The program contains the logic for
> aggregating the stats across cpus and across the cgroup hierarchy.
> These programs can be attached to any cgroup subsystem, not only the
> ones that implement the css_rstat_flush() callback in the kernel. This
> gives bpf programs more flexibility, and more isolation from the kernel
> implementation.
>
> The following 2 patches add necessary helpers for the stats collection
> workflow. Helpers that call into cgroup_rstat_updated() and
> cgroup_rstat_flush() are added to allow bpf programs collecting stats to
> tell the rstat framework that a cgroup has been updated, and to allow
> bpf programs outputting stats to tell the rstat framework to flush the
> stats before they are displayed to the user. An additional
> bpf_map_lookup_percpu_elem is introduced to allow rstat flusher programs
> to access percpu stats of the cpu being flushed.
>
> The following 3 patches add the cgroup_iter program type (v2). This was
> originally introduced by Hao as a part of a different series [1].
> Their use case is better showcased as part of this patch series. We also
> make cgroup_get_from_id() cgroup v1 friendly to allow cgroup_iter programs
> to display stats for cgroup v1 as well. This small change makes the
> entire workflow cgroup v1 friendly without any other dedicated changes.
>
> The final patch is a selftest demonstrating the entire workflow with a
> set of bpf programs that collect per-cgroup latency of memcg reclaim.
>
> [1] https://lore.kernel.org/lkml/20220225234339.2386398-9-haoluo@google.com/
>
>
> Hao Luo (2):
>   cgroup: Add cgroup_put() in !CONFIG_CGROUPS case
>   bpf: Introduce cgroup iter
>
> Yosry Ahmed (7):
>   bpf: introduce CGROUP_SUBSYS_RSTAT program type
>   cgroup: bpf: flush bpf stats on rstat flush
>   libbpf: Add support for rstat progs and links
>   bpf: add bpf rstat helpers
>   bpf: add bpf_map_lookup_percpu_elem() helper
>   cgroup: add v1 support to cgroup_get_from_id()
>   bpf: add a selftest for cgroup hierarchical stats collection
>
>  include/linux/bpf-cgroup-subsys.h             |  35 ++
>  include/linux/bpf.h                           |   4 +
>  include/linux/bpf_types.h                     |   2 +
>  include/linux/cgroup-defs.h                   |   4 +
>  include/linux/cgroup.h                        |   5 +
>  include/uapi/linux/bpf.h                      |  45 +++
>  kernel/bpf/Makefile                           |   3 +-
>  kernel/bpf/arraymap.c                         |  11 +-
>  kernel/bpf/cgroup_iter.c                      | 148 ++++++++
>  kernel/bpf/cgroup_subsys.c                    | 212 +++++++++++
>  kernel/bpf/hashtab.c                          |  25 +-
>  kernel/bpf/helpers.c                          |  56 +++
>  kernel/bpf/syscall.c                          |   6 +
>  kernel/bpf/verifier.c                         |   6 +
>  kernel/cgroup/cgroup.c                        |  16 +-
>  kernel/cgroup/rstat.c                         |  11 +
>  scripts/bpf_doc.py                            |   2 +
>  tools/include/uapi/linux/bpf.h                |  45 +++
>  tools/lib/bpf/bpf.c                           |   3 +
>  tools/lib/bpf/bpf.h                           |   3 +
>  tools/lib/bpf/libbpf.c                        |  35 ++
>  tools/lib/bpf/libbpf.h                        |   3 +
>  tools/lib/bpf/libbpf.map                      |   1 +
>  .../test_cgroup_hierarchical_stats.c          | 335 ++++++++++++++++++
>  tools/testing/selftests/bpf/progs/bpf_iter.h  |   7 +
>  .../selftests/bpf/progs/cgroup_vmscan.c       | 211 +++++++++++
>  26 files changed, 1212 insertions(+), 22 deletions(-)
>  create mode 100644 include/linux/bpf-cgroup-subsys.h
>  create mode 100644 kernel/bpf/cgroup_iter.c
>  create mode 100644 kernel/bpf/cgroup_subsys.c
>  create mode 100644 tools/testing/selftests/bpf/prog_tests/test_cgroup_hierarchical_stats.c
>  create mode 100644 tools/testing/selftests/bpf/progs/cgroup_vmscan.c
>
> --
> 2.36.0.512.ge40c2bad7a-goog
>
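
For readers following along, the collect/flush/read workflow described in
the quoted cover letter can be modelled in plain userspace C. This is only
a sketch under stated assumptions, not the code from the series: the
struct cg_model type and the model_*() functions are hypothetical names
made up for illustration, the hierarchy is a fixed four-node tree, and
nothing here calls into the kernel or libbpf. model_record() stands in for
a collecting program that bumps a per-cpu counter and reports the update,
model_flush_one() stands in for the flusher callback that runs once per
updated (cgroup, cpu) pair, and model_flush_all() stands in for the flush
requested before stats are shown to the user.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define NR_CPUS    4
#define NR_CGROUPS 4

/* Hypothetical userspace stand-in for a cgroup; not a kernel structure. */
struct cg_model {
	int  parent;            /* index of the parent, -1 for the root */
	long percpu[NR_CPUS];   /* deltas recorded per cpu, not yet flushed */
	bool updated[NR_CPUS];  /* cpus on which this cgroup needs a flush */
	long pending;           /* deltas pushed up by children */
	long total;             /* flushed hierarchical total */
};

/* Assumption of this model: a parent always has a lower index than its children. */
static struct cg_model cgs[NR_CGROUPS];

/* Build the tree: 0 = root, 1 = A (child of root), 2 = B (child of A), 3 = C (child of root). */
static void model_init(void)
{
	static const int parents[NR_CGROUPS] = { -1, 0, 1, 0 };

	memset(cgs, 0, sizeof(cgs));
	for (int i = 0; i < NR_CGROUPS; i++)
		cgs[i].parent = parents[i];
}

/* Collector side: record a delta, then mark the cgroup and its ancestors as updated on this cpu. */
static void model_record(int cg, int cpu, long delta)
{
	assert(cg >= 0 && cg < NR_CGROUPS && cpu >= 0 && cpu < NR_CPUS);
	cgs[cg].percpu[cpu] += delta;
	for (int i = cg; i >= 0; i = cgs[i].parent)
		cgs[i].updated[cpu] = true;
}

/* Flusher callback: fold one (cgroup, cpu) pair into the cgroup and hand the delta to its parent. */
static void model_flush_one(int cg, int cpu)
{
	long delta = cgs[cg].percpu[cpu] + cgs[cg].pending;

	cgs[cg].percpu[cpu] = 0;
	cgs[cg].pending = 0;
	cgs[cg].total += delta;
	if (cgs[cg].parent >= 0)
		cgs[cgs[cg].parent].pending += delta;
}

/* Reader side: flush every updated pair, visiting children before their parents. */
static void model_flush_all(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		for (int cg = NR_CGROUPS - 1; cg >= 0; cg--) {
			if (!cgs[cg].updated[cpu])
				continue;
			cgs[cg].updated[cpu] = false;
			model_flush_one(cg, cpu);
		}
	}
}

/* Record a few samples, flush, print the totals, and return the root total. */
static long model_demo(void)
{
	model_init();
	model_record(2, 0, 5);   /* B on cpu 0 */
	model_record(2, 1, 3);   /* B on cpu 1 */
	model_record(1, 0, 2);   /* A on cpu 0 */
	model_record(3, 2, 10);  /* C on cpu 2 */
	model_flush_all();
	for (int i = 0; i < NR_CGROUPS; i++)
		printf("cgroup %d total %ld\n", i, cgs[i].total);
	return cgs[0].total;
}
```

Calling model_demo() from a main() prints totals of 20, 10, 8 and 10 for
the root, A, B and C. The point of the model is that only (cgroup, cpu)
pairs that were actually updated get visited, and each delta reaches
every ancestor exactly once.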
