Message-ID: <20100922043424.GH6676@balbir.in.ibm.com>
Date: Wed, 22 Sep 2010 10:04:24 +0530
From: Balbir Singh <balbir@...ux.vnet.ibm.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Stephane Eranian <eranian@...gle.com>,
linux-kernel@...r.kernel.org, mingo@...e.hu, paulus@...ba.org,
davem@...emloft.net, fweisbec@...il.com,
perfmon2-devel@...ts.sf.net, eranian@...il.com,
robert.richter@....com, acme@...hat.com,
Paul Menage <menage@...gle.com>, Li Zefan <lizf@...fujitsu.com>
Subject: Re: [RFC PATCH 0/2] perf_events: add support for per-cpu per-cgroup
monitoring (v3)
* Peter Zijlstra <peterz@...radead.org> [2010-09-21 18:27:27]:
> On Tue, 2010-09-21 at 18:17 +0200, Stephane Eranian wrote:
> > On Tue, Sep 21, 2010 at 4:03 PM, Peter Zijlstra <peterz@...radead.org> wrote:
> > > On Tue, 2010-09-21 at 15:38 +0200, Stephane Eranian wrote:
> > >> > Hmm, indeed. One thing we can do about that is move perf into the
> > >> > cgroup, create the counter (disabled) using self to identify the cgroup,
> > >> > move perf back to where it came from, and enable the counter.
> > >> >
> > >> Yes, that's another possibility. I wonder if there are any non-obvious
> > >> difficulties with this approach.
> > >
> > > Yes, there is, but I think we can fix it. The problem with moving perf
> > > itself around is that perf is not a fully dormant process and can thus
> > > interact with the cgroup state.
> > >
> > I was thinking about memory accounting for instance.
>
> I think the memory controller only accounts things when the process
> actually touches something. A process that never wakes will never touch
> anything.
That understanding is correct, but the whole approach sounds more
complex because several subsystems are involved: the expectation is
that we'll move perf to the correct cgroup for each subsystem.
>
> > > If we were to fork a child that's simply sitting idle in waitpid() (or
> > > any other blocking syscall) we can move that around cgroup without
> > > affecting the cgroup itself.
> >
> > But then things get a bit more complicated because the perf_event_open()
> > has to be done in that child. File descriptors created in child processes
> > are not shared with their parent. You'd have to pass file descriptors around.
> > That seems overly complicated.
>
> Uhm, no the trick is that the child remains absolutely dormant and
> therefore doesn't accrue any accounting, all you need is a known task in
> the cgroup, the parent can then specify the child pid to identify the
> group.
>
> Once you've opened the counter, you can move the kid out and kill it.
> Note that moving it out of the cgroup before killing it ensures it never
> wakes up inside that cgroup.
What are the benefits of this complexity? Not changing perf_event_attr?
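For reference, the dormant-child trick quoted above could look roughly
like the sketch below. It is only an illustration under assumptions:
the helper names (spawn_dormant_child, move_task_to_cgroup,
reap_dormant_child) are hypothetical, the membership file path depends
on where each cgroup hierarchy is mounted, and the step that opens the
counter is left out because the interface for naming a cgroup by a
task pid is still under discussion in this RFC.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that blocks in pause() and never does any work.
 * Returns the child's pid in the parent, or -1 if fork() failed.
 */
static pid_t spawn_dormant_child(void)
{
	pid_t pid = fork();

	if (pid == 0) {
		for (;;)
			pause();	/* sleep until a signal arrives */
	}
	return pid;
}

/*
 * Move a task by writing its pid to a cgroup membership file.
 * tasks_path is an assumption: the caller supplies the path of the
 * membership file for the hierarchy in question.
 * Returns 0 on success, -1 on failure.
 */
static int move_task_to_cgroup(const char *tasks_path, pid_t pid)
{
	FILE *f = fopen(tasks_path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fprintf(f, "%d\n", (int)pid) > 0;
	if (fclose(f) != 0)	/* the buffered write is flushed here */
		ok = 0;
	return ok ? 0 : -1;
}

/*
 * Kill the dormant child and reap it. The caller is expected to have
 * moved it out of the monitored cgroup first, so that it never wakes
 * up inside that cgroup.
 * Returns 0 if the child was terminated by SIGKILL, -1 otherwise.
 */
static int reap_dormant_child(pid_t pid)
{
	int status;

	if (kill(pid, SIGKILL) != 0)
		return -1;
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	return (WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL) ? 0 : -1;
}
```

The intended order would be: spawn the child, move it into the target
cgroup, open the counter in the parent using the child's pid, move the
child back out, then reap it. With several subsystems mounted
separately, the move step has to be repeated once per hierarchy, which
is the extra complexity I am asking about.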
--
Three Cheers,
Balbir