linux-kernel - Re: [PATCH 1/2] perf_events: add cgroup support (v8)


Message-ID: <AANLkTi=+W-BXHFG94MeOnqV82HxtqtyDwEt1RCOUMrcq@mail.gmail.com>
Date:	Mon, 7 Feb 2011 11:29:08 -0800
From:	Paul Menage <menage@...gle.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	balbir@...ux.vnet.ibm.com, eranian@...gle.com,
	linux-kernel@...r.kernel.org, mingo@...e.hu, paulus@...ba.org,
	davem@...emloft.net, fweisbec@...il.com,
	perfmon2-devel@...ts.sf.net, eranian@...il.com,
	robert.richter@....com, acme@...hat.com, lizf@...fujitsu.com
Subject: Re: [PATCH 1/2] perf_events: add cgroup support (v8)

On Wed, Feb 2, 2011 at 4:46 AM, Peter Zijlstra <peterz@...radead.org> wrote:
> On Wed, 2011-02-02 at 17:20 +0530, Balbir Singh wrote:
>> * Peter Zijlstra <peterz@...radead.org> [2011-02-02 12:29:20]:
>>
>> > On Thu, 2011-01-20 at 15:39 +0100, Peter Zijlstra wrote:
>> > > On Thu, 2011-01-20 at 15:30 +0200, Stephane Eranian wrote:
>> > > > @@ -4259,8 +4261,20 @@ void cgroup_exit(struct task_struct *tsk, int run_callbacks)
>> > > >
>> > > >         /* Reassign the task to the init_css_set. */
>> > > >         task_lock(tsk);
>> > > > +       /*
>> > > > +        * we mask interrupts to prevent:
>> > > > +        * - timer tick to cause event rotation which
>> > > > +        *   could schedule back in cgroup events after
>> > > > +        *   they were switched out by perf_cgroup_sched_out()
>> > > > +        *
>> > > > +        * - preemption which could schedule back in cgroup events
>> > > > +        */
>> > > > +       local_irq_save(flags);
>> > > > +       perf_cgroup_sched_out(tsk);
>> > > >         cg = tsk->cgroups;
>> > > >         tsk->cgroups = &init_css_set;
>> > > > +       perf_cgroup_sched_in(tsk);
>> > > > +       local_irq_restore(flags);
>> > > >         task_unlock(tsk);
>> > > >         if (cg)
>> > > >                 put_css_set_taskexit(cg);
>> > >
>> > > So you too need a callback on cgroup change there. Li, Paul, any chance
>> > > we can fix this cgroup_subsys::exit callback? The scheduler code needs
>> > > to do funny things because it's in the wrong place as well.
>> >
>> > cgroup guys? Shall I just fix this exit thing, since the only users seem
>> > to be the scheduler and now perf, for both of which it's unfortunate at
>> > best?
>>
>> Are you suggesting that the cgroup_exit on task_exit notification should be
>> pulled out?
>
>
> No, just fixed. The callback as it exists isn't useful and leads to
> hacks like the above.
>
>
>> > Balbir, memcontrol.c uses pre_destroy(), I pose that using this method
>> > is broken per definition since it makes the cgroup empty notification
>> > void.
>> >
>>
>> We use pre_destroy() to reclaim, so that delete/rmdir() will be able
>> to clean up the node/group. I am not sure what you mean by it makes
>> the empty notification void and why pre_destroy() is broken?
>
> A quick look at the code looked like it could return -EBUSY (and other
> errors), in that case the rmdir of the empty cgroup will fail.
>
> Therefore it can happen that after the last task is removed, and we get
> the notification that the cgroup is empty, and we attempt the rmdir we
> will fail.
>
> This again means that all such notification handlers must poll state,
> which is ridiculous.
>

Not necessarily - we could have a failed rmdir() set a bit on the
cgroup, so that the empty notification is sent again once the final
refcount on that cgroup is dropped.
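
As a rough illustration of that idea, here is a userspace model only,
not kernel code. The struct, its fields, and the function names are all
hypothetical and are not part of the cgroup API; the sketch only shows
the "remember the failed rmdir, re-notify on last put" logic.

```c
/* Userspace model only; every name here is hypothetical, not kernel API. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct cgroup_model {
	int  refcount;       /* outstanding references that block rmdir */
	bool rmdir_failed;   /* set when an rmdir attempt returned -EBUSY */
	int  notifications;  /* count of "cgroup can be removed" notifications */
};

/* Stand-in for delivering the empty/release notification to userspace. */
void notify_release(struct cgroup_model *cg)
{
	cg->notifications++;
}

/* Try to remove the group; on failure, record that a re-notification is owed. */
int model_rmdir(struct cgroup_model *cg)
{
	if (cg->refcount > 0) {
		cg->rmdir_failed = true;
		return -EBUSY;
	}
	return 0;
}

/* Drop one reference; if this was the last one and an rmdir failed, notify again. */
void model_put(struct cgroup_model *cg)
{
	assert(cg->refcount > 0);
	if (--cg->refcount == 0 && cg->rmdir_failed) {
		cg->rmdir_failed = false;
		notify_release(cg);
	}
}
```

With this, a handler that sees rmdir() fail would not need to poll: it
would simply wait for the next notification and retry.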

Paul