linux-kernel - Re: [PATCH -tip] cpuacct: per-cgroup utime/stime statistics


Message-Id: <20090318095336.05986cc9.kamezawa.hiroyu@jp.fujitsu.com>
Date:	Wed, 18 Mar 2009 09:53:36 +0900
From:	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
To:	bharata@...ux.vnet.ibm.com
Cc:	linux-kernel@...r.kernel.org, Balaji Rao <balajirrao@...il.com>,
	Dhaval Giani <dhaval@...ux.vnet.ibm.com>,
	Balbir Singh <balbir@...ux.vnet.ibm.com>,
	Li Zefan <lizf@...fujitsu.com>,
	Paul Menage <menage@...gle.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Ingo Molnar <mingo@...e.hu>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Subject: Re: [PATCH -tip] cpuacct: per-cgroup utime/stime statistics - v3

On Tue, 17 Mar 2009 11:51:55 +0530
Bharata B Rao <bharata@...ux.vnet.ibm.com> wrote:

> Hi,
> 
> Here is the next version of the cpuacct stime/utime statistics patch.
> 
> Ingo, Could you please consider this for -tip ?
> 
> Changes for v3:
> - Fix a small race in the cpuacct hierarchy walk.
> 
> v2:
> http://lkml.org/lkml/2009/3/12/170
> 
> v1:
> http://lkml.org/lkml/2009/3/10/150
> --
> 
> cpuacct: Add stime and utime statistics
> 
> Add per-cgroup cpuacct controller statistics like the system and user
> time consumed by the group of tasks.
> 
> Signed-off-by: Bharata B Rao <bharata@...ux.vnet.ibm.com>
> Signed-off-by: Balaji Rao <balajirrao@...il.com>
> ---
>  Documentation/cgroups/cpuacct.txt |   17 +++++++
>  kernel/sched.c                    |   92 +++++++++++++++++++++++++++++++++++---
>  2 files changed, 103 insertions(+), 6 deletions(-)
> 
> --- a/Documentation/cgroups/cpuacct.txt
> +++ b/Documentation/cgroups/cpuacct.txt
> @@ -30,3 +30,20 @@ The above steps create a new group g1 an
>  process (bash) into it. CPU time consumed by this bash and its children
>  can be obtained from g1/cpuacct.usage and the same is accumulated in
>  /cgroups/cpuacct.usage also.
> +
> +cpuacct.stat file lists a few statistics which further divide the
> +CPU time obtained by the cgroup into user and system times. Currently
> +the following statistics are supported:
> +
> +utime: Time spent by tasks of the cgroup in user mode.
> +stime: Time spent by tasks of the cgroup in kernel mode.
> +
> +utime and stime are in USER_HZ unit.
> +
> +cpuacct controller uses percpu_counter interface to collect utime and
> +stime. This causes two side effects:
> +
> +- It is theoretically possible to see wrong values for stime and utime.
> +  This is because percpu_counter_read() on 32bit systems is broken.

<snip> Hmm, I don't want to say "BROKEN" but..

> +- It is possible to see slightly outdated values for stime and utime
> +  due to the batch processing nature of percpu_counter.
No objection here. My customers will ask me "To what extent is it delayed?";
maybe I can answer...
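
One rough way to reason about that delay is a user-space model of the batching. This is only a sketch, not the kernel's percpu_counter code: each CPU keeps a private delta and folds it into the shared count once the delta reaches the batch size. The names model_counter, model_add, model_read, model_sum, MODEL_NR_CPUS and MODEL_BATCH are hypothetical, and the values 4 and 32 are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NR_CPUS 4   /* assumed CPU count, illustration only */
#define MODEL_BATCH   32  /* assumed batch size, illustration only */

/* Hypothetical user-space stand-in for a batched per-CPU counter. */
struct model_counter {
    int64_t count;                 /* shared value, what a cheap read returns */
    int32_t local[MODEL_NR_CPUS];  /* per-CPU deltas not yet folded in */
};

/* Add 'amount' on behalf of 'cpu'; fold into the shared count only
 * when the local delta reaches the batch size in either direction. */
static void model_add(struct model_counter *c, int cpu, int32_t amount)
{
    int32_t v;

    assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
    v = c->local[cpu] + amount;
    if (v >= MODEL_BATCH || v <= -MODEL_BATCH) {
        c->count += v;
        c->local[cpu] = 0;
    } else {
        c->local[cpu] = v;
    }
}

/* Cheap read: ignores the per-CPU deltas, so it may be stale. */
static int64_t model_read(const struct model_counter *c)
{
    return c->count;
}

/* Exact read: walks every per-CPU delta. */
static int64_t model_sum(const struct model_counter *c)
{
    int64_t sum = c->count;
    int cpu;

    for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        sum += c->local[cpu];
    return sum;
}
```

In this model the cheap read can lag the exact sum by at most MODEL_NR_CPUS * (MODEL_BATCH - 1) units, because every per-CPU delta stays strictly below the batch size before it is folded in.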

> +static int cpuacct_stats_show(struct cgroup *cgrp, struct cftype *cft,
> +		struct cgroup_map_cb *cb)
> +{
> +	struct cpuacct *ca = cgroup_ca(cgrp);
> +	int i;
> +
> +	for (i = 0; i < CPUACCT_STAT_NSTATS; i++) {
> +		s64 val = percpu_counter_read(&ca->cpustat[i]);
> +		val = cputime_to_clock_t(val);
> +		cb->fill(cb, cpuacct_stat_desc[i], val);
> +	}
> +	return 0;
> +}
> +

No objection to this patch itself, but, Hmm...can this work ?

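#ifdef CONFIG_32BIT
/* can be used only when update is not very frequent */
s64 percpu_counter_read_positive_slow(struct percpu_counter *fbc)
{
    s64 ret;
retry:
    /* wait until it seems to be safe, i.e. no updater holds the lock */
    smp_mb();
    spin_unlock_wait(&fbc->lock);
    ret = fbc->count;
    /* a negative value is treated as a bad read; try again */
    if (ret < 0)
         goto retry;
    return ret;
}
#else
s64 percpu_counter_read_positive_slow(struct percpu_counter *fbc)
{
    /* a 64-bit load is a single access here, so read it directly */
    return fbc->count;
}
#endif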
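
To make the 32-bit concern concrete, here is a small user-space model of a torn read. It is an illustration only and assumes the 64-bit value is loaded as two 32-bit halves with an update landing in between. The helper name model_torn_read is hypothetical and nothing here is kernel API.

```c
#include <stdint.h>

/* Hypothetical model: the low half is sampled from the value before an
 * update and the high half from the value after it, which is what a
 * non-atomic 64-bit load on a 32-bit machine could observe. */
static int64_t model_torn_read(int64_t before, int64_t after)
{
    uint32_t lo = (uint32_t)((uint64_t)before & 0xffffffffu);
    uint32_t hi = (uint32_t)((uint64_t)after >> 32);

    return (int64_t)(((uint64_t)hi << 32) | lo);
}
```

For example, if the counter moves from 0xffffffff to 0x100000000 between the two loads, this model returns 0x1ffffffff, a value the counter never held. Note that such a result is positive, so a check for negative values alone would not catch it in this model.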

I wonder why percpu_counter_read_positive() is designed to return 1...

Thanks,
-Kame


