linux-kernel - Re: [PATCH 1/2] perf_events: add cgroup support (v8)


Date:	Wed, 02 Feb 2011 13:46:32 +0100
From:	Peter Zijlstra <peterz@...radead.org>
To:	balbir@...ux.vnet.ibm.com
Cc:	eranian@...gle.com, linux-kernel@...r.kernel.org, mingo@...e.hu,
	paulus@...ba.org, davem@...emloft.net, fweisbec@...il.com,
	perfmon2-devel@...ts.sf.net, eranian@...il.com,
	robert.richter@....com, acme@...hat.com, lizf@...fujitsu.com,
	Paul Menage <menage@...gle.com>
Subject: Re: [PATCH 1/2] perf_events: add cgroup support (v8)

On Wed, 2011-02-02 at 17:20 +0530, Balbir Singh wrote:
> * Peter Zijlstra <peterz@...radead.org> [2011-02-02 12:29:20]:
> 
> > On Thu, 2011-01-20 at 15:39 +0100, Peter Zijlstra wrote:
> > > On Thu, 2011-01-20 at 15:30 +0200, Stephane Eranian wrote:
> > > > @@ -4259,8 +4261,20 @@ void cgroup_exit(struct task_struct *tsk, int run_callbacks)
> > > >  
> > > >         /* Reassign the task to the init_css_set. */
> > > >         task_lock(tsk);
> > > > +       /*
> > > > +        * we mask interrupts to prevent:
> > > > +        * - timer tick to cause event rotation which
> > > > +        *   could schedule back in cgroup events after
> > > > +        *   they were switched out by perf_cgroup_sched_out()
> > > > +        *
> > > > +        * - preemption which could schedule back in cgroup events
> > > > +        */
> > > > +       local_irq_save(flags);
> > > > +       perf_cgroup_sched_out(tsk);
> > > >         cg = tsk->cgroups;
> > > >         tsk->cgroups = &init_css_set;
> > > > +       perf_cgroup_sched_in(tsk);
> > > > +       local_irq_restore(flags);
> > > >         task_unlock(tsk);
> > > >         if (cg)
> > > >                 put_css_set_taskexit(cg); 
> > > 
> > > So you too need a callback on cgroup change there. Li, Paul, any chance
> > > we can fix this cgroup_subsys::exit callback? The scheduler code needs
> > > to do funny things because it's in the wrong place as well.
> > 
> > cgroup guys? Shall I just fix this exit thing, since the only users seem
> > to be the scheduler and now perf, for both of which it's unfortunate at
> > best?
> 
> Are you suggesting that the cgroup_exit on task_exit notification should be
> pulled out?


No, just fixed. The callback as it exists isn't useful and leads to
hacks like the above.


> > Balbir, memcontrol.c uses pre_destroy(). I posit that using this method
> > is broken by definition, since it makes the cgroup-empty notification
> > void.
> >
> 
> We use pre_destroy() to reclaim, so that delete/rmdir() will be able
> to clean up the node/group. I am not sure what you mean by it makes
> the empty notification void and why pre_destroy() is broken?

From a quick look at the code, it looks like pre_destroy() can return
-EBUSY (and other errors), in which case the rmdir of the empty cgroup
will fail.

Therefore it can happen that, after the last task is removed and we get
the notification that the cgroup is empty, the rmdir we then attempt
still fails.

This again means that all such notification handlers must poll state,
which is ridiculous.
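
To make the objection concrete, here is a minimal userspace sketch of what
a notification handler ends up doing if rmdir() of an empty cgroup can
still fail with EBUSY. The function name remove_empty_cgroup(), the retry
count and the 100 ms delay are all hypothetical and chosen only for
illustration; only rmdir(2), nanosleep(2) and the errno values are real
interfaces.

```c
#include <errno.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper, meant to be called after the "cgroup is empty"
 * notification has been received. It retries rmdir() for as long as
 * the call fails with EBUSY, up to max_tries attempts.
 *
 * Returns 0 on success, -1 with errno set on failure.
 */
int remove_empty_cgroup(const char *path, int max_tries)
{
	/* Assumed back-off between attempts: 100 ms. */
	struct timespec delay = { .tv_sec = 0, .tv_nsec = 100 * 1000 * 1000 };
	int i;

	for (i = 0; i < max_tries; i++) {
		if (rmdir(path) == 0)
			return 0;
		if (errno != EBUSY)
			return -1;	/* some other error; retrying will not help */
		/* This wait-and-retry step is the polling objected to above. */
		nanosleep(&delay, NULL);
	}

	errno = EBUSY;
	return -1;
}
```

If the empty notification were reliable, a single rmdir() call would be
enough and the loop could go away.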


