linux-kernel - Re: [PATCHSET 00/10] perf: Improve cgroup profiling (v5)

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20200306150514.GE290743@krava>
Date:   Fri, 6 Mar 2020 16:05:14 +0100
From:   Jiri Olsa <jolsa@...hat.com>
To:     Namhyung Kim <namhyung@...nel.org>
Cc:     Ingo Molnar <mingo@...nel.org>,
        Peter Zijlstra <a.p.zijlstra@...llo.nl>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Mark Rutland <mark.rutland@....com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Stephane Eranian <eranian@...gle.com>,
        LKML <linux-kernel@...r.kernel.org>,
        linux-perf-users@...r.kernel.org, Tejun Heo <tj@...nel.org>,
        Li Zefan <lizefan@...wei.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Adrian Hunter <adrian.hunter@...el.com>
Subject: Re: [PATCHSET 00/10] perf: Improve cgroup profiling (v5)

On Mon, Feb 24, 2020 at 01:37:39PM +0900, Namhyung Kim wrote:
> Hello,
> 
> This work is to improve cgroup profiling in perf.  Currently it only
> supports profiling tasks in a specific cgroup and there's no way to
> identify which cgroup the current sample belongs to.  So I added
> PERF_SAMPLE_CGROUP to add cgroup id into each sample.  It's a 64-bit
> integer having file handle of the cgroup.  And kernel also generates
> PERF_RECORD_CGROUP event for new groups to correlate the cgroup id and
> cgroup name (path in the cgroup filesystem).  The cgroup id can be
> read from userspace by name_to_handle_at() system call so it can
> synthesize the CGROUP event for existing groups.
> 
> So why do we want this?  Systems running a large number of jobs in
> different cgroups want to profiling such jobs precisely. This includes
> container hosting systems widely used today. Currently perf supports
> namespace tracking but the systems may not use (cgroup) namespace for
> their jobs. Also it'd be more intuitive to see cgroup names (as
> they're given by user or sysadmin) rather than numeric
> cgroup/namespace id even if they use the namespaces.
> 
> From Stephane Eranian:
> > In data centers you care about attributing samples to a job not such
> > much to a process.  A job may have multiple processes which may come
> > and go. The cgroup on the other hand stays around for the entire
> > lifetime of the job. It is much easier to map a cgroup name to a
> > particular job than it is to map a pid back to a job name,
> > especially for offline post-processing.
> 
> Note that this only works for "perf_event" cgroups (obviously) so if
> users are still using cgroup-v1 interface, they need to have same
> hierarchy for subsystem(s) want to profile with it.
> 
>  * Changes from v4:
>   - use CONFIG_CGROUP_PERF
>   - move cgroup tree to perf_env
>   - move cgroup fs utility function to tools/lib/api/fs
>   - use a local buffer and check its size for cgroup systhesis

the perf top tui should all cgroup id as 0 and the headers are
misaligned

	Samples
	Overhead  cgroup id (dev/inode        Pid:Command
	  83.78%  0/0x0                 N/A     6508:perf
	   8.82%  0/0x0                 N/A        0:swapper
	   2.59%  0/0x0                 N/A     6466:perf
	   1.69%  0/0x0                 N/A     6509:perf-top-UI
	   0.56%  0/0x0                 N/A       12:rcu_sched
	   0.29%  0/0x0                 N/A      429:kworker/0:2-mm_
	   0.15%  0/0x0                 N/A     1416:sshd
	   0.12%  0/0x0                 N/A      187:migration/35


jirka