lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20111018152039.GB11530@mgebm.net>
Date:	Tue, 18 Oct 2011 11:20:39 -0400
From:	Eric B Munson <emunson@...bm.net>
To:	Stephane Eranian <eranian@...gle.com>
Cc:	Peter Zijlstra <a.p.zijlstra@...llo.nl>, mingo@...e.hu,
	anton@...ba.org, linux-kernel@...r.kernel.org, paulus@...ba.org,
	hbabu@...ibm.com
Subject: Re: Oprofile Regression Caused by commit
 e5d1367f17ba6a6fed5fd8b74e4d5720923e0c25 on PPC

On Tue, 18 Oct 2011, Stephane Eranian wrote:

> Hi,
> 
> I suspect it's because of this chunk:
>         rcu_read_lock();
> 
>         list_for_each_entry_rcu(pmu, &pmus, entry) {
> -
>                 cpuctx = this_cpu_ptr(pmu->pmu_cpu_context);
> 
> -               perf_pmu_disable(cpuctx->ctx.pmu);
> -
>                 /*
>                  * perf_cgroup_events says at least one
>                  * context on this CPU has cgroup events.
> @@ -353,6 +366,8 @@ void perf_cgroup_switch(struct task_struct *task, int mode)
>                  * events for a context.
>                  */
>                 if (cpuctx->ctx.nr_cgroups > 0) {
> +                       perf_ctx_lock(cpuctx, cpuctx->task_ctx);
> +                       perf_pmu_disable(cpuctx->ctx.pmu);
> 
> 
> In other words, you don't call perf_pmu_disable() unless you know
> you have cgroup events.
> 
> Without that, I think you will touch the PMU on cgroup switch and
> that night conflict with another subsystem using the PMU, e.g. OProfile.

Exactly, the inital bug was being caused by something in perf touching the MMCR0
register without reserving the hardware first (or noticing that oprofile had
already done so).  So oprofile runs would show abnormally low event counts or
counts that were not there if you were really unlucky.  This patch fixes the
problem because the MMCR0 register will not be touched unless there was there
were cgroup events active.

> 
> 
> On Tue, Oct 18, 2011 at 4:43 PM, Peter Zijlstra <a.p.zijlstra@...llo.nl> wrote:
> > On Tue, 2011-10-18 at 09:53 -0400, Eric B Munson wrote:
> >> On Wed, 12 Oct 2011, Stephane Eranian wrote:
> >>
> >> > Could be:
> >> >    a8d757e perf events: Fix slow and broken cgroup context switch code
> >> >
> >>
> >> Thanks for the pointer, but the fix was in:
> >>     facc4307 perf: Optimize event scheduling locking
> >>
> >> This might be a candidate for stable given that oprofile is broken without it.
> >
> > I might feel much more confident about recommending that if someone
> > could explain why that patches fixes what exact problem.
> >
> 

Download attachment "signature.asc" of type "application/pgp-signature" (837 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ