[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20090311085316.GA5592@in.ibm.com>
Date: Wed, 11 Mar 2009 14:23:16 +0530
From: Bharata B Rao <bharata@...ux.vnet.ibm.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
Cc: linux-kernel@...r.kernel.org, Balaji Rao <balajirrao@...il.com>,
Dhaval Giani <dhaval@...ux.vnet.ibm.com>,
Balbir Singh <balbir@...ux.vnet.ibm.com>,
Li Zefan <lizf@...fujitsu.com>,
Paul Menage <menage@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Ingo Molnar <mingo@...e.hu>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>
Subject: Re: [RFC PATCH] cpuacct: per-cgroup utime/stime statistics - v1
On Wed, Mar 11, 2009 at 09:38:12AM +0900, KAMEZAWA Hiroyuki wrote:
> On Tue, 10 Mar 2009 18:12:08 +0530
> Bharata B Rao <bharata@...ux.vnet.ibm.com> wrote:
>
> > Hi,
> >
> > Based on the comments received during my last post
> > (http://lkml.org/lkml/2009/2/25/129), here is a fresh attempt
> > to get per-cgroup utime/stime statistics as part of cpuacct controller.
> >
> > This patch adds a new file cpuacct.stat which displays two stats:
> > utime and stime. I wasn't too sure about the usefulness of providing
> > per-cgroup guest and steal times and hence not including them here.
> >
> > Note that I am using percpu_counter for collecting these two stats.
> > Since percpu_counter subsystem doesn't protect the readside, readers could
> > theoritically obtain incorrect values for these stats on 32bit systems.
>
> Using percpu_counter_read() means that .. but is it okay to ignore "batch"
> number ? (see FBC_BATCH)
I would think it might be ok with the understanding that read is not
a frequent operation. The default value of percpu_counter_batch is 32.
Ideally it should have been possible to set this value independently
for each percpu_counter. That way, users could have chosen an appropriate
batch value for their counter based on the usage pattern of their
counters.
>
>
> > I hope occasional wrong values is not too much of a concern for
> > statistics like this. If it is a problem, we have to either fix
> > percpu_counter or do it all by ourselves as Kamezawa attempted
> > for cpuacct.usage (http://lkml.org/lkml/2009/3/4/14)
> >
> Hmm, percpu_counter_sum() is bad ?
It is slow and it doesn't do exactly what we want. It just adds the
32bit percpu counters to the global 64bit counter under lock and returns
the result. But it doesn't clear the 32bit percpu counters after accummulating
them in the 64bit counter.
If it is ok to be a bit slower on the read side, we could have something
like percpu_counter_read_slow() which would do what percpu_counter_sum()
does and in addition clear the 32bit percpu counters. Will this be
acceptable ? It slows down the read side, but will give accurate count.
This might slow down the write side also(due to contention b/n readers
and writers), but I guess due to batching the effect might not be too
pronounced. Should we be going this way ?
>
> BTW, I'm not sure but don't we need special handling if
> CONFIG_VIRT_CPU_ACCOUNTING=y ?
AFAICS no. Architectures which define CONFIG_VIRT_CPU_ACCOUNTING end up calling
account_{system,user}_time() where we have placed our hooks for
cpuacct charging. So even on such architectures we should be able to
get correct per-cgroup stime and utime.
Regards,
Bharata.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists