linux-kernel - Re: [PATCH] perf_events: improve x86 event scheduling (v6 incremental)

Message-ID: <1264440342.4283.1936.camel@laptop>
Date:	Mon, 25 Jan 2010 18:25:42 +0100
From:	Peter Zijlstra <peterz@...radead.org>
To:	eranian@...il.com
Cc:	eranian@...gle.com, linux-kernel@...r.kernel.org, mingo@...e.hu,
	paulus@...ba.org, davem@...emloft.net, fweisbec@...il.com,
	perfmon2-devel@...ts.sf.net
Subject: Re: [PATCH] perf_events: improve x86 event scheduling (v6 incremental)

On Mon, 2010-01-25 at 18:12 +0100, stephane eranian wrote:
> On Fri, Jan 22, 2010 at 9:27 PM, Peter Zijlstra <peterz@...radead.org> wrote:
> > On Thu, 2010-01-21 at 17:39 +0200, Stephane Eranian wrote:
> >> @@ -1395,40 +1430,28 @@ void hw_perf_enable(void)
> >>                  * apply assignment obtained either from
> >>                  * hw_perf_group_sched_in() or x86_pmu_enable()
> >>                  *
> >> -                * step1: save events moving to new counters
> >> -                * step2: reprogram moved events into new counters
> >> +                * We either re-enable or re-program and re-enable.
> >> +                * All events are disabled by the time we come here.
> >> +                * That means their state has been saved already.
> >>                  */
> >
> > I'm not seeing how it is true.
> >
> > Suppose a core2 with counter0 active counting a non-restricted event,
> > say cpu_cycles. Then we do:
> >
> > perf_disable()
> >  hw_perf_disable()
> >    intel_pmu_disable_all
> >      wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0);
> >
> everything is disabled globally, yet individual counter0 is not.
> But that's enough to stop it.
> 
> > ->enable(MEM_LOAD_RETIRED) /* constrained to counter0 */
> >  x86_pmu_enable()
> >    collect_events()
> >    x86_schedule_events()
> >    n_added = 1
> >
> >    /* also slightly confused about this */
> >    if (hwc->idx != -1)
> >      x86_perf_event_set_period()
> >
> 
> In x86_pmu_enable(), we have not yet actually assigned the
> counter to hwc->idx. This is only accomplished by hw_perf_enable().
> Yet, x86_perf_event_set_period() is going to write the MSR.
> 
> My understanding is that you never call enable(event) in code
> outside of a perf_disable()/perf_enable() section.

That should be so, yes; last time I verified, it was. Hence I'm a bit
puzzled by that set_period(): hw_perf_enable() will assign ->idx and do
set_period(), so why also do it here...

> > perf_enable()
> >  hw_perf_enable()
> >
> >    /* and here we'll assign the new event to counter0
> >     * except we never disabled it... */
> >
> You will have two events, scheduled, cycles in counter1
> and mem_load_retired in counter0. Neither hwc->idx
> will match previous state and thus both will be rewritten.

And by programming mem_load_retired you just wiped the counter value of
the cycle counter. There should be an x86_perf_event_update() in between
stopping the counter and moving that configuration.
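
To make the ordering concern concrete, here is a minimal userspace sketch,
not kernel code. The mv_* names, the counter array standing in for the
MSRs, and the tick values are all hypothetical; mv_event_update() only
mimics the role x86_perf_event_update() plays, assuming it folds the
pending delta into the event count.

```c
#include <assert.h>
#include <stdint.h>

#define MV_NUM_COUNTERS 2

/* Hypothetical stand-in for the per-counter MSRs; not the real PMU. */
static uint64_t mv_hw_counter[MV_NUM_COUNTERS];

struct mv_event {
	int idx;           /* counter backing the event, -1 if unassigned */
	uint64_t prev_raw; /* raw counter value seen at the last update */
	uint64_t count;    /* accumulated event count */
};

/* Fold the delta since the last read into the event's count. */
static void mv_event_update(struct mv_event *e)
{
	uint64_t raw;

	assert(e->idx >= 0 && e->idx < MV_NUM_COUNTERS);
	raw = mv_hw_counter[e->idx];
	e->count += raw - e->prev_raw;
	e->prev_raw = raw;
}

/* Program the event into counter idx, overwriting whatever was there. */
static void mv_event_assign(struct mv_event *e, int idx)
{
	assert(idx >= 0 && idx < MV_NUM_COUNTERS);
	e->idx = idx;
	e->prev_raw = 0;
	mv_hw_counter[idx] = 0;
}

/*
 * cycles runs on counter0 for 1000 ticks, then is moved to counter1 so
 * that a constrained event can take counter0. Returns the cycles count
 * after the move.
 */
uint64_t mv_demo(int save_first)
{
	struct mv_event cycles = { -1, 0, 0 };
	struct mv_event constrained = { -1, 0, 0 };

	mv_event_assign(&cycles, 0);
	mv_hw_counter[0] += 1000;         /* counter ticks while active */

	if (save_first)
		mv_event_update(&cycles); /* save before the move */
	mv_event_assign(&cycles, 1);
	mv_event_assign(&constrained, 0); /* counter0 is overwritten here */

	return cycles.count;
}
```

With save_first set, the 1000 ticks survive the move; without it, they are
gone once counter0 is reprogrammed.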

> I think the case you are worried about is different. It is the
> case where you would move an event to a new counter
> without replacing it with a new event. Given that the individual
> MSR.en would still be 1 AND that enable_all() enables all
> counters (even the ones not actively used), then we would
> get a runaway counter so to speak.
> 
> It seems a solution would be to call x86_pmu_disable() before
> assigning an event to a new counter for all events which are
> moving. This is because we cannot assume all events have been
> previously disabled individually. Something like
> 
> if (!match_prev_assignment(hwc, cpuc, i)) {
>    if (hwc->idx != -1)
>       x86_pmu.disable(hwc, hwc->idx);
>    x86_assign_hw_event(event, cpuc, cpuc->assign[i]);
>    x86_perf_event_set_period(event, hwc, hwc->idx);
> }

Yes and no. My worry is not that it's not counting, but that we didn't
store the actual counter value before overwriting it with the new
configuration.

As to your suggestion,
 1) we would have to do x86_pmu_disable() since that does
x86_perf_event_update().
 2) I worried about the case where we basically swap two counters;
there we cannot do the x86_perf_event_update() in a single pass, since
programming the first counter will destroy the value of the second.

Now possibly the scenario in 2 isn't possible because the event
scheduling is stable enough for this never to happen, but I wasn't
feeling too sure about that, so I skipped this part for now.
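
The swap case in 2) can be sketched in userspace as well. This is a
simulation under stated assumptions, not the kernel implementation: the
sw_* names and the counter array are hypothetical, and the two-pass
variant simply follows the step1/step2 wording of the comment removed by
the quoted hunk (save moving events first, then reprogram them).

```c
#include <assert.h>
#include <stdint.h>

#define SW_NUM_COUNTERS 2
#define SW_NUM_EVENTS   2

/* Hypothetical stand-in for the per-counter MSRs. */
static uint64_t sw_hw_counter[SW_NUM_COUNTERS];

struct sw_event {
	int idx;           /* counter backing the event, -1 if unassigned */
	uint64_t prev_raw; /* raw counter value seen at the last save */
	uint64_t count;    /* accumulated event count */
};

/* Fold the pending delta into the event's count. */
static void sw_save(struct sw_event *e)
{
	uint64_t raw = sw_hw_counter[e->idx];

	e->count += raw - e->prev_raw;
	e->prev_raw = raw;
}

/* Program the event into counter idx, destroying its old contents. */
static void sw_program(struct sw_event *e, int idx)
{
	assert(idx >= 0 && idx < SW_NUM_COUNTERS);
	e->idx = idx;
	e->prev_raw = 0;
	sw_hw_counter[idx] = 0;
}

/* Save and reprogram each moving event in one loop. */
static void sw_apply_single_pass(struct sw_event *ev, const int *assign, int n)
{
	int i;

	for (i = 0; i < n; i++) {
		if (ev[i].idx == assign[i])
			continue;
		if (ev[i].idx != -1)
			sw_save(&ev[i]);
		sw_program(&ev[i], assign[i]);
	}
}

/* Pass 1 saves every moving event; pass 2 reprogram them. */
static void sw_apply_two_pass(struct sw_event *ev, const int *assign, int n)
{
	int i;

	for (i = 0; i < n; i++)
		if (ev[i].idx != assign[i] && ev[i].idx != -1)
			sw_save(&ev[i]);
	for (i = 0; i < n; i++)
		if (ev[i].idx != assign[i])
			sw_program(&ev[i], assign[i]);
}

/* Swap two running events and return the count of the second one. */
uint64_t sw_swap_demo(int two_pass)
{
	struct sw_event ev[SW_NUM_EVENTS] = { { -1, 0, 0 }, { -1, 0, 0 } };
	const int swapped[SW_NUM_EVENTS] = { 1, 0 };

	sw_program(&ev[0], 0);
	sw_program(&ev[1], 1);
	sw_hw_counter[0] += 100;
	sw_hw_counter[1] += 200;

	if (two_pass)
		sw_apply_two_pass(ev, swapped, SW_NUM_EVENTS);
	else
		sw_apply_single_pass(ev, swapped, SW_NUM_EVENTS);

	return ev[1].count;
}
```

In the single-pass variant the first event is programmed into counter1
before the second event's value has been read, so the second event loses
its 200 ticks; saving everything first avoids that.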
