Date:	Thu, 2 Sep 2010 10:17:10 +0200
From:	Stephane Eranian <eranian@...gle.com>
To:	Lin Ming <lin@...g.vg>
Cc:	linux-kernel@...r.kernel.org, peterz@...radead.org, mingo@...e.hu,
	paulus@...ba.org, davem@...emloft.net, fweisbec@...il.com,
	perfmon2-devel@...ts.sf.net, eranian@...il.com
Subject: Re: [RFC PATCH 0/2] perf_events: add support for per-cpu per-cgroup monitoring

On Thu, Sep 2, 2010 at 5:53 AM, Lin Ming <lin@...g.vg> wrote:
> On Tue, Aug 31, 2010 at 11:25 PM, Stephane Eranian <eranian@...gle.com> wrote:
>> This series of patches adds a per-container (cgroup) filtering capability
>> to per-cpu monitoring. In other words, we can monitor all threads belonging
>> to a specific cgroup and running on a specific CPU.
>>
>> This is useful to measure what is going on inside a cgroup, something
>> that cannot easily or cheaply be achieved with either per-thread or
>> per-cpu mode.
>> Cgroups can span multiple CPUs. CPUs can be shared between cgroups. Cgroups
>> can have lots of threads. Threads can come and go during a measurement.
>>
>> Measuring per-cgroup today requires using per-thread mode, attaching
>> to all the current threads inside a cgroup, and tracking new threads.
>> That would require scanning /proc/PID entries, which is subject to
>> race conditions, and creating an event for each thread, with each
>> event consuming kernel memory.
>>
>> The approach taken by this patch is to leverage per-cpu mode by
>> adding a filtering capability on context switch, used only when
>> necessary. That way the amount of kernel memory used remains bounded
>> by the number of CPUs. We also do not have to scan /proc, because we
>> are only interested in cgroup-level counts, not per-thread counts.
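>>
>> Conceptually (a toy model for illustration, not the patch's actual
>> code), the filter is just a cgroup-membership check applied when a
>> task is scheduled in, so state stays per-event rather than per-thread:
>>
>> #include <stdbool.h>
>> #include <stdio.h>
>>
>> struct task  { int cgrp_id; };                 /* hypothetical */
>> struct event { int cgrp_id; bool counting; };  /* hypothetical */
>>
>> /* on context switch, count only while the monitored cgroup runs */
>> static void sched_in(struct event *e, const struct task *next)
>> {
>>         e->counting = (next->cgrp_id == e->cgrp_id);
>> }
>>
>> int main(void)
>> {
>>         struct event e = { .cgrp_id = 1, .counting = false };
>>         struct task a = { 1 }, b = { 2 };
>>
>>         sched_in(&e, &a);
>>         printf("task in cgroup 1 scheduled in: counting=%d\n", e.counting);
>>         sched_in(&e, &b);
>>         printf("task in cgroup 2 scheduled in: counting=%d\n", e.counting);
>>         return 0;
>> }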
>>
>> The cgroup to monitor is designated by passing a file descriptor opened
>> on a new per-cgroup file in the cgroup filesystem (perf_event.perf). The
>> option must be activated by setting perf_event_attr.cgroup=1 and passing
>> a valid file descriptor in perf_event_attr.cgroup_fd. Those are the only
>> two ABI extensions.
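>>
>> For illustration, a minimal open sequence could look like this (a
>> sketch assuming headers from a kernel with this series applied: the
>> attr.cgroup bit and attr.cgroup_fd field are the proposed extensions
>> and do not exist in mainline headers; the /dev/cgroup mount point and
>> cgroup name are placeholders):
>>
>> #include <linux/perf_event.h>
>> #include <sys/syscall.h>
>> #include <unistd.h>
>> #include <fcntl.h>
>> #include <string.h>
>> #include <stdio.h>
>>
>> int main(void)
>> {
>>         struct perf_event_attr attr;
>>         int cgrp_fd, ev_fd;
>>         long long count;
>>
>>         /* fd on the new per-cgroup file in the cgroup filesystem */
>>         cgrp_fd = open("/dev/cgroup/test1/perf_event.perf", O_RDONLY);
>>         if (cgrp_fd < 0) {
>>                 perror("open");
>>                 return 1;
>>         }
>>
>>         memset(&attr, 0, sizeof(attr));
>>         attr.size      = sizeof(attr);
>>         attr.type      = PERF_TYPE_HARDWARE;
>>         attr.config    = PERF_COUNT_HW_CPU_CYCLES;
>>         attr.cgroup    = 1;       /* proposed: enable cgroup filtering */
>>         attr.cgroup_fd = cgrp_fd; /* proposed: cgroup to monitor */
>>
>>         /* per-cpu mode on CPU 0: pid = -1, cpu = 0 */
>>         ev_fd = syscall(__NR_perf_event_open, &attr, -1, 0, -1, 0);
>>         if (ev_fd < 0) {
>>                 perror("perf_event_open");
>>                 return 1;
>>         }
>>
>>         sleep(1);
>>         read(ev_fd, &count, sizeof(count));
>>         printf("cycles in cgroup on CPU0: %lld\n", count);
>>         return 0;
>> }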
>>
>> The patch also includes changes to the perf tool to make use of cgroup
>> filtering. Both perf stat and perf record have been extended to support
>> cgroups via a new -G option. The cgroup is specified per event:
>>
>> $ perf stat -a -e cycles:u,cycles:u -G test1,test2 -- sleep 1
>>  Performance counter stats for 'sleep 1':
>>         2368881622  cycles                   test1
>>                  0  cycles                   test2
>>        1.001938136  seconds time elapsed
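>>
>> perf record takes the same per-event option (assuming the same
>> patched tool), e.g.:
>>
>> $ perf record -a -e cycles:u -G test1 -- sleep 1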
>
> I have tried this new feature. Cool!
>
> perf stat [<options>] [<command>]
>
> Is the command ("sleep 1" in the above example) also counted?
>
If it runs in the cgroup that is measured, then yes. Not that it will
do much.

I am working on a second version of the patch that will correct
the issue with timing, in particular time_enabled. In cgroup
mode, it needs to count the time the cgroup was active, not
wall-clock time. That will make the scaling more meaningful.
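
For reference, perf estimates scaled counts as

  count * time_enabled / time_running

so deriving time_enabled from cgroup-active time rather than
wall-clock time is what makes that estimate meaningful.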

> Thanks,
> Lin Ming
>
