linux-kernel - Re: [PATCH] perf, x86: catch spurious interrupts after disabling counters


Message-ID: <20100915184424.GS13563@erda.amd.com>
Date:	Wed, 15 Sep 2010 20:44:24 +0200
From:	Robert Richter <robert.richter@....com>
To:	Stephane Eranian <eranian@...gle.com>
CC:	Ingo Molnar <mingo@...e.hu>, Peter Zijlstra <peterz@...radead.org>,
	Don Zickus <dzickus@...hat.com>,
	"gorcunov@...il.com" <gorcunov@...il.com>,
	"fweisbec@...il.com" <fweisbec@...il.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"ying.huang@...el.com" <ying.huang@...el.com>,
	"ming.m.lin@...el.com" <ming.m.lin@...el.com>,
	"yinghai@...nel.org" <yinghai@...nel.org>,
	"andi@...stfloor.org" <andi@...stfloor.org>
Subject: Re: [PATCH] perf, x86: catch spurious interrupts after disabling
 counters

On 15.09.10 13:32:49, Stephane Eranian wrote:
> > I tried to clear the bit in the active_mask after disabling the
> > counter (writing to the msr), which did not solve it. Shouldn't the
> > counter be disabled immediately? Maybe clearing the INT bit would have
> > worked too, but I was not sure about side effects.
> >
> 0 instr1
> 1 instr2
> 2 instr3
> 3 wrmsrl(eventsel0, 0);
> 
> There is skid between the instruction on which the counter overflows
> and the point where the interrupt is posted. If you overflow on instr1,
> suppose the interrupt is posted on instr3, which is immediately
> followed by the disable. There may be a chance you get the interrupt
> even though the counter was disabled. I also don't know when the INT
> bit is looked at.

Yes, this could be possible. So, we should assume interrupts may be
delivered after a counter is disabled, which the patch addresses.
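
To make the race concrete, here is a minimal userspace model of one way
a handler could claim such a late interrupt. It is only a sketch under
stated assumptions: struct pmu_model, its fields and the model_*()
helpers are hypothetical names for illustration, not kernel code and
not necessarily what the patch does. The idea is to remember which
counters were started, and to let the handler claim one interrupt for
a counter that has been disabled in the meantime.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_COUNTERS 4

/* Hypothetical per-CPU state for this model (not the kernel's struct). */
struct pmu_model {
	uint32_t active_mask;	/* counters currently enabled */
	uint32_t running;	/* counters started, not yet claimed while inactive */
	uint32_t pending;	/* overflow interrupts latched but not delivered */
};

/* Enable a counter and remember that it has been running. */
static void model_start(struct pmu_model *p, int idx)
{
	assert(idx >= 0 && idx < NUM_COUNTERS);
	p->active_mask |= 1u << idx;
	p->running |= 1u << idx;
}

/* An enabled counter overflows; the interrupt is latched, not delivered. */
static void model_overflow(struct pmu_model *p, int idx)
{
	assert(idx >= 0 && idx < NUM_COUNTERS);
	if (p->active_mask & (1u << idx))
		p->pending |= 1u << idx;
}

/* Disable a counter. The latched interrupt stays pending: this is the skid. */
static void model_stop(struct pmu_model *p, int idx)
{
	assert(idx >= 0 && idx < NUM_COUNTERS);
	p->active_mask &= ~(1u << idx);
}

/* Interrupt handler: returns how many counters claimed the interrupt. */
static int model_handle_nmi(struct pmu_model *p)
{
	int handled = 0;

	for (int idx = 0; idx < NUM_COUNTERS; idx++) {
		uint32_t bit = 1u << idx;

		if (!(p->active_mask & bit)) {
			/*
			 * Counter is disabled, but it was running earlier:
			 * claim one late interrupt, then forget about it.
			 */
			if (p->running & bit) {
				p->running &= ~bit;
				handled++;
			}
			continue;
		}
		if (p->pending & bit)
			handled++;
	}
	p->pending = 0;
	return handled;
}
```

Calling model_start(), model_overflow() and model_stop() on one counter
and then model_handle_nmi() reproduces the sequence discussed above:
the first call claims the late interrupt, a second call claims nothing,
so a genuinely unknown NMI would still be reported.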

> 
> It may be worthwhile trying with:
> 
> static inline void x86_pmu_disable_event(struct perf_event *event)
> {
>         struct hw_perf_event *hwc = &event->hw;
>         (void)checking_wrmsrl(hwc->config_base + hwc->idx, 0);
> }
> 
> to see if it makes a difference.
> 
> 
> >> Does the counter value reflect this?
> >
> > Yes, the disabled bit was cleared after reading the evntsel msr and
> > the ctr value was about 400 cycles (it could have overflowed,
> > though we actually can't say since the counter was disabled).
> >
> >> Were you also getting this if you were only measuring at the user level?
> >
> > I tried only
> >
> >  perf record ./hackbench 10
> >
> > which triggered it on my system.
> >
> I suspect that if you do:
> 
> perf record -e cycles:u ./hackbench 10
> 
> It does not happen.

Do you know which sampling period the counters run with for the following?

 perf record ./hackbench 10
 perf record -e cycles -e instructions -e cache-references \
     -e cache-misses -e branch-misses -a -- <cmd>

I couldn't find anything about this in the man page.

I will investigate this further, especially with:

* compile order,
* checking_wrmsrl(),
* -e cycles:u

But I cannot start on it before next week.

-Robert

-- 
Advanced Micro Devices, Inc.
Operating System Research Center
