linux-kernel - Re: [RFC PATCH 1/2] perf_events: add support for per-cpu per-cgroup monitoring (v4)


Message-ID: <AANLkTikcc_egoo==_R6usUdCm07Cuog7gD4DEjX8rfaP@mail.gmail.com>
Date:	Thu, 7 Oct 2010 15:45:20 +0200
From:	stephane eranian <eranian@...glemail.com>
To:	Li Zefan <lizf@...fujitsu.com>
Cc:	eranian@...gle.com, linux-kernel@...r.kernel.org,
	peterz@...radead.org, mingo@...e.hu, paulus@...ba.org,
	davem@...emloft.net, fweisbec@...il.com,
	perfmon2-devel@...ts.sf.net, robert.richter@....com,
	acme@...hat.com
Subject: Re: [RFC PATCH 1/2] perf_events: add support for per-cpu per-cgroup
 monitoring (v4)

On Thu, Oct 7, 2010 at 3:20 AM, Li Zefan <lizf@...fujitsu.com> wrote:
> Stephane Eranian wrote:
>> This kernel patch adds the ability to filter monitoring based on
>> container groups (cgroups). This is for use in per-cpu mode only.
>>
>> The cgroup to monitor is passed as a file descriptor in the pid
>> argument to the syscall. The file descriptor must be opened to
>> the cgroup name in the cgroup filesystem. For instance, if the
>> cgroup name is foo and cgroupfs is mounted in /cgroup, then the
>> file descriptor is opened to /cgroup/foo. Cgroup mode is
>> activated by passing PERF_FLAG_PID_CGROUP into the flags argument
>> to the syscall.
>>
>> Signed-off-by: Stephane Eranian <eranian@...gle.com>
>>
>> ---
>>
>> diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
>> index 709dfb9..67cf276 100644
>> --- a/include/linux/cgroup.h
>> +++ b/include/linux/cgroup.h
>> @@ -623,6 +623,8 @@ bool css_is_ancestor(struct cgroup_subsys_state *cg,
>>  unsigned short css_id(struct cgroup_subsys_state *css);
>>  unsigned short css_depth(struct cgroup_subsys_state *css);
>>
>> +struct cgroup_subsys_state *cgroup_css_from_dir(struct file *f, int id);
>> +
>>  #else /* !CONFIG_CGROUPS */
>>
>>  static inline int cgroup_init_early(void) { return 0; }
>> diff --git a/include/linux/cgroup_subsys.h b/include/linux/cgroup_subsys.h
>> index ccefff0..93f86b7 100644
>> --- a/include/linux/cgroup_subsys.h
>> +++ b/include/linux/cgroup_subsys.h
>> @@ -65,4 +65,8 @@ SUBSYS(net_cls)
>>  SUBSYS(blkio)
>>  #endif
>>
>> +#ifdef CONFIG_PERF_EVENTS
>> +SUBSYS(perf)
>> +#endif
>> +
>>  /* */
>> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
>> index 61b1e2d..ad79f0a 100644
>> --- a/include/linux/perf_event.h
>> +++ b/include/linux/perf_event.h
>> @@ -454,6 +454,7 @@ enum perf_callchain_context {
>>
>>  #define PERF_FLAG_FD_NO_GROUP        (1U << 0)
>>  #define PERF_FLAG_FD_OUTPUT  (1U << 1)
>> +#define PERF_FLAG_PID_CGROUP (1U << 2) /* pid=cgroup id, per-cpu mode */
>>
>>  #ifdef __KERNEL__
>>  /*
>> @@ -461,6 +462,7 @@ enum perf_callchain_context {
>>   */
>>
>>  #ifdef CONFIG_PERF_EVENTS
>> +# include <linux/cgroup.h>
>>  # include <asm/perf_event.h>
>>  # include <asm/local64.h>
>>  #endif
>> @@ -698,6 +700,18 @@ struct swevent_hlist {
>>  #define PERF_ATTACH_CONTEXT  0x01
>>  #define PERF_ATTACH_GROUP    0x02
>>
>> +#ifdef CONFIG_CGROUPS
>> +struct perf_cgroup_time {
>> +     u64 time;
>> +     u64 timestamp;
>> +};
>> +
>> +struct perf_cgroup {
>> +     struct cgroup_subsys_state css;
>> +     struct perf_cgroup_time *time;
>> +};
>
> Can we avoid adding this perf cgroup subsystem? It has 2 disadvantages:
>
Well, I need to maintain some timing information for each cgroup. This has
to be stored somewhere.
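
As a rough illustration of that bookkeeping: the quoted hunk shows only the
two fields of struct perf_cgroup_time, not how they are updated. The sketch
below assumes `time` accumulates how long the cgroup has been active and
`timestamp` marks the start of the current active period. The struct and
helper names are hypothetical user-space stand-ins, not code from the patch.

```c
#include <stdint.h>

/* User-space stand-in for the two fields of struct perf_cgroup_time. */
struct cgroup_time {
	uint64_t time;      /* assumed: total time the cgroup has been active */
	uint64_t timestamp; /* assumed: start of the current active period */
};

/* Hypothetical helper: the cgroup becomes active on this CPU at `now`. */
static void cgroup_time_start(struct cgroup_time *t, uint64_t now)
{
	t->timestamp = now;
}

/* Hypothetical helper: fold the elapsed active period into the total. */
static void cgroup_time_stop(struct cgroup_time *t, uint64_t now)
{
	t->time += now - t->timestamp;
	t->timestamp = now;
}
```

Keeping one such record per cgroup per CPU would let each event be scaled
by the time its cgroup was actually running on that CPU.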

> - If cgroupfs is mounted without the perf cgroup subsys, that hierarchy
>  can't be monitored.

That's unfortunately true ;-)

> - If there are several different cgroup mount points, only one can be
>  monitored.
>
> To choose which cgroup hierarchy to monitor, hierarchy id can be passed
> from userspace, which is the 2nd column below:
>
Ok, I will investigate this. As long as the hierarchy id is unique and can be
looked up, we can use it. Using /proc is fine with me.

> $ cat /proc/cgroups
> #subsys_name    hierarchy       num_cgroups     enabled
> debug   0       1       1
> net_cls 0       1       1
>
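
For reference, user space could pick the hierarchy id out of that table
along these lines. This is only a sketch: it assumes the four-column layout
shown above, and the function name is hypothetical. It takes a FILE * so it
can be fed either /proc/cgroups or a test stream.

```c
#include <stdio.h>
#include <string.h>

/*
 * Return the hierarchy id (2nd column) listed for `subsys` in a
 * /proc/cgroups-style stream, or -1 if the subsystem is not listed.
 */
static int hierarchy_id_for(FILE *fp, const char *subsys)
{
	char line[256];
	char name[64];
	int hierarchy, num_cgroups, enabled;

	while (fgets(line, sizeof(line), fp)) {
		if (line[0] == '#')
			continue;	/* skip the header line */
		if (sscanf(line, "%63s %d %d %d",
			   name, &hierarchy, &num_cgroups, &enabled) != 4)
			continue;	/* skip malformed lines */
		if (strcmp(name, subsys) == 0)
			return hierarchy;
	}
	return -1;
}
```

A caller would open /proc/cgroups with fopen(), pass the stream and a
subsystem name such as "net_cls", and hand the returned id to the kernel.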
>> +#endif
>
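
For readers following the thread, the user-visible side of the interface
described in the patch summary (cgroup directory fd passed in the pid
argument, PERF_FLAG_PID_CGROUP in flags, per-cpu mode only) might be
exercised roughly as below. This is a sketch under those assumptions, not
code from the patch: the function name and the choice of a CPU cycles
counter are illustrative, and it needs kernel headers that define
PERF_FLAG_PID_CGROUP.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

/*
 * Open a cycles counter on `cpu` restricted to the cgroup whose directory
 * is `cgroup_dir` (e.g. "/cgroup/foo"). Returns the event fd, or -1 on
 * failure. On success the cgroup fd is handed back through
 * `cgroup_fd_out` so the caller decides when to close it.
 */
static int open_cgroup_cycles_counter(const char *cgroup_dir, int cpu,
				      int *cgroup_fd_out)
{
	struct perf_event_attr attr;
	int cgroup_fd, event_fd;

	cgroup_fd = open(cgroup_dir, O_RDONLY);
	if (cgroup_fd < 0)
		return -1;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = PERF_COUNT_HW_CPU_CYCLES;

	/* pid carries the cgroup fd; cpu must be a real CPU (per-cpu mode). */
	event_fd = (int)syscall(SYS_perf_event_open, &attr, cgroup_fd, cpu,
				-1, (unsigned long)PERF_FLAG_PID_CGROUP);
	if (event_fd < 0) {
		close(cgroup_fd);
		return -1;
	}

	*cgroup_fd_out = cgroup_fd;
	return event_fd;
}
```

The count would then be read from the returned fd with read(2), once per
CPU of interest, since cgroup mode does not follow tasks across CPUs.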