Message-Id: <20090311181302.77c1de0b.kamezawa.hiroyu@jp.fujitsu.com>
Date: Wed, 11 Mar 2009 18:13:02 +0900
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
To: bharata@...ux.vnet.ibm.com
Cc: linux-kernel@...r.kernel.org, Balaji Rao <balajirrao@...il.com>,
Dhaval Giani <dhaval@...ux.vnet.ibm.com>,
Balbir Singh <balbir@...ux.vnet.ibm.com>,
Li Zefan <lizf@...fujitsu.com>,
Paul Menage <menage@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Ingo Molnar <mingo@...e.hu>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>
Subject: Re: [RFC PATCH] cpuacct: per-cgroup utime/stime statistics - v1
On Wed, 11 Mar 2009 14:23:16 +0530
Bharata B Rao <bharata@...ux.vnet.ibm.com> wrote:
> On Wed, Mar 11, 2009 at 09:38:12AM +0900, KAMEZAWA Hiroyuki wrote:
> > On Tue, 10 Mar 2009 18:12:08 +0530
> > Bharata B Rao <bharata@...ux.vnet.ibm.com> wrote:
> >
> > > Hi,
> > >
> > > Based on the comments received during my last post
> > > (http://lkml.org/lkml/2009/2/25/129), here is a fresh attempt
> > > to get per-cgroup utime/stime statistics as part of cpuacct controller.
> > >
> > > This patch adds a new file cpuacct.stat which displays two stats:
> > > utime and stime. I wasn't too sure about the usefulness of providing
> > > per-cgroup guest and steal times, and hence am not including them here.
> > >
> > > Note that I am using percpu_counter for collecting these two stats.
> > > Since the percpu_counter subsystem doesn't protect the read side, readers could
> > > theoretically obtain incorrect values for these stats on 32-bit systems.
> >
> > Using percpu_counter_read() means that... but is it okay to ignore the "batch"
> > number? (see FBC_BATCH)
>
> I would think it might be ok with the understanding that read is not
> a frequent operation. The default value of percpu_counter_batch is 32.
> Ideally it should have been possible to set this value independently
> for each percpu_counter. That way, users could have chosen an appropriate
> batch value based on the usage pattern of their counter.
>
Hmm, from my point of view, the unit of stime/utime is milliseconds, which is
big enough for the value to be expected to be "correct".
If reads are not frequent, I would prefer a precise value.
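
Just to make sure we mean the same thing, here is a small user-space model of
the batching behaviour (the model_* names are only illustrative; this is not
the kernel code): percpu_counter_read() can lag the real sum by up to
batch * nr_cpus, because per-cpu deltas below the batch are never folded into
the global count.

/* Simplified stand-in for percpu_counter batching -- illustration only. */
#include <stdio.h>

#define NR_CPUS	4
#define BATCH	32	/* stands in for percpu_counter_batch */

struct model_counter {
	long long count;		/* global value, what "read" returns */
	int	  cpu_delta[NR_CPUS];	/* per-cpu deltas, not folded in yet */
};

/* like __percpu_counter_add(): fold the per-cpu delta into the global
 * count only once it reaches the batch size */
static void model_add(struct model_counter *c, int cpu, int amount)
{
	c->cpu_delta[cpu] += amount;
	if (c->cpu_delta[cpu] >= BATCH || c->cpu_delta[cpu] <= -BATCH) {
		c->count += c->cpu_delta[cpu];
		c->cpu_delta[cpu] = 0;
	}
}

/* like percpu_counter_read(): cheap, but ignores the pending deltas */
static long long model_read(const struct model_counter *c)
{
	return c->count;
}

/* like percpu_counter_sum(): walks every cpu for the exact value */
static long long model_sum(const struct model_counter *c)
{
	long long sum = c->count;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		sum += c->cpu_delta[cpu];
	return sum;
}

int main(void)
{
	struct model_counter c = { 0 };
	int cpu, i;

	/* each cpu accounts 40 units: 32 get folded, 8 stay pending */
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		for (i = 0; i < 40; i++)
			model_add(&c, cpu, 1);

	printf("read = %lld, sum = %lld\n", model_read(&c), model_sum(&c));
	/* prints: read = 128, sum = 160 -- read is short by 8 per cpu */
	return 0;
}
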
> >
> >
> > > I hope occasional wrong values are not too much of a concern for
> > > statistics like this. If it is a problem, we have to either fix
> > > percpu_counter or do it all by ourselves as Kamezawa attempted
> > > for cpuacct.usage (http://lkml.org/lkml/2009/3/4/14)
> > >
> > Hmm, is percpu_counter_sum() bad?
>
> It is slow and it doesn't do exactly what we want. It just adds the
> 32-bit percpu counters to the global 64-bit counter under the lock and returns
> the result. But it doesn't clear the 32-bit percpu counters after accumulating
> them in the 64-bit counter.
>
> If it is ok to be a bit slower on the read side, we could have something
> like percpu_counter_read_slow() which would do what percpu_counter_sum()
> does and in addition clear the 32-bit percpu counters. Will this be
> acceptable? It slows down the read side, but will give an accurate count.
> This might slow down the write side also (due to contention between readers
> and writers), but I guess due to batching the effect might not be too
> pronounced. Should we be going this way?
>
I prefer the precise one. Measuring the overhead of both approaches and comparing
them before making a decision is the usual way to go.
This accounting is a once-per-tick event (right?), so how about measuring the
read-side overhead?
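
If we go that way, the semantics I understand for the percpu_counter_read_slow()
you describe would be roughly the following, in the same toy model as above
(the name and behaviour are taken from your description, not from existing code):

/* like the proposed percpu_counter_read_slow(): exact like percpu_counter_sum(),
 * but it also folds the pending per-cpu deltas into the global count so they
 * are not added again on the next read; the real thing would hold fbc->lock */
static long long model_read_slow(struct model_counter *c)
{
	long long sum = c->count;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		sum += c->cpu_delta[cpu];
		c->cpu_delta[cpu] = 0;	/* cleared, unlike percpu_counter_sum() */
	}
	c->count = sum;
	return sum;
}

Writers would then contend with the reader only when folding a full batch, so I
agree the extra write-side cost may be small, but it is still worth measuring.
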
> >
> > BTW, I'm not sure, but don't we need special handling if
> > CONFIG_VIRT_CPU_ACCOUNTING=y?
>
> AFAICS no. Architectures that define CONFIG_VIRT_CPU_ACCOUNTING end up calling
> account_{system,user}_time(), where we have placed our hooks for
> cpuacct charging. So even on such architectures we should be able to
> get correct per-cgroup stime and utime.
>
ok,
Thanks,
-Kame