Message-ID: <20100916065324.GA6470@lenovo>
Date: Thu, 16 Sep 2010 10:53:24 +0400
From: Cyrill Gorcunov <gorcunov@...il.com>
To: Robert Richter <robert.richter@....com>
Cc: Stephane Eranian <eranian@...gle.com>, Ingo Molnar <mingo@...e.hu>,
Peter Zijlstra <peterz@...radead.org>,
Don Zickus <dzickus@...hat.com>,
"fweisbec@...il.com" <fweisbec@...il.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"ying.huang@...el.com" <ying.huang@...el.com>,
"ming.m.lin@...el.com" <ming.m.lin@...el.com>,
"yinghai@...nel.org" <yinghai@...nel.org>,
"andi@...stfloor.org" <andi@...stfloor.org>
Subject: Re: [PATCH] perf, x86: catch spurious interrupts after disabling
counters
On Thu, Sep 16, 2010 at 12:10:41AM +0200, Robert Richter wrote:
> On 15.09.10 13:40:12, Cyrill Gorcunov wrote:
> > Yeah, already noted from your previous email. Perhaps we might
> > take a somewhat simpler approach then -- in the nmi handler, where
> > we mark the "next nmi", we could account not for "one next" nmi but
> > for the sum of handled counters minus the one just handled (clearing
> > this count, of course, if a new "non spurious" nmi comes in). Can't
> > say I like this approach, but it's just a thought.
>
> If we disable a counter, it might still trigger an interrupt which we
> cannot detect. Thus, if a running counter is deactivated, we must
> count it as handled in the nmi handler.
>
> Working with a sum is not possible, because a disabled counter may or
> *may not* trigger an interrupt. We cannot predict the number of
> counters that will be handled.
>
> Dealing with the "next nmi" is also not handy here. Spurious nmis are
> caused when stopping a counter. Since this is done outside the nmi
> handler, we would then start touching the "next nmi" outside the
> handler as well. This might be more complex because we then have to
> deal with locking or atomic access. We shouldn't do that.
>
> -Robert
>
OK, I see what you mean, Robert. Btw, when you reordered the
cpu_active_mask access and the wrmsr, did you also try an additional
read after the write of the msr? i.e. something like

	wrmsr
	barrier()	/* just to be sure gcc would not reorder it */
	rdmsr
	clear cpu_active_mask

I wonder if that would do the trick.
-- Cyrill