linux-kernel - Re: [PATCH 1/2] perf_events: add cgroup support (v8)

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <AANLkTimBypM3rr4QS3NcnAXcXUH-2-fE7mKG12NuX1Vv@mail.gmail.com>
Date:	Thu, 20 Jan 2011 15:46:51 +0100
From:	Stephane Eranian <eranian@...gle.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	linux-kernel@...r.kernel.org, mingo@...e.hu, paulus@...ba.org,
	davem@...emloft.net, fweisbec@...il.com,
	perfmon2-devel@...ts.sf.net, eranian@...il.com,
	robert.richter@....com, acme@...hat.com, lizf@...fujitsu.com,
	Paul Menage <menage@...gle.com>
Subject: Re: [PATCH 1/2] perf_events: add cgroup support (v8)

Peter,

Note that I also had to drop my optimization which was handling cgroup switch
from perf_event_task_sched_out() (and which had to deal with PF_EXITING).
I realized there was an issue in sampling mode during context switch. If you are
measuring at the kernel level, then you may get samples from the previous task
(which may not be in your cgroup) if a counter overflows in the middle
of the context
switch because you've switched cgroup before "current" is actually
updated. Thus,
the perf_event interrupt handler associates the sample to the old tasks.

So now, I switch out everything from perf_event_task_sched_out() and
wait until perf_event_task_sched_in() to re-examine the situation for the
new task. That's the only way to maintain consistency with "current".


On Thu, Jan 20, 2011 at 3:39 PM, Peter Zijlstra <peterz@...radead.org> wrote:
> On Thu, 2011-01-20 at 15:30 +0200, Stephane Eranian wrote:
>> @@ -4259,8 +4261,20 @@ void cgroup_exit(struct task_struct *tsk, int run_callbacks)
>>
>>         /* Reassign the task to the init_css_set. */
>>         task_lock(tsk);
>> +       /*
>> +        * we mask interrupts to prevent:
>> +        * - timer tick to cause event rotation which
>> +        *   could schedule back in cgroup events after
>> +        *   they were switched out by perf_cgroup_sched_out()
>> +        *
>> +        * - preemption which could schedule back in cgroup events
>> +        */
>> +       local_irq_save(flags);
>> +       perf_cgroup_sched_out(tsk);
>>         cg = tsk->cgroups;
>>         tsk->cgroups = &init_css_set;
>> +       perf_cgroup_sched_in(tsk);
>> +       local_irq_restore(flags);
>>         task_unlock(tsk);
>>         if (cg)
>>                 put_css_set_taskexit(cg);
>
> So you too need a callback on cgroup change there.. Li, Paul, any chance
> we can fix this cgroup_subsys::exit callback? The scheduler code needs
> to do funny thing because its in the wrong place as well.
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/