Message-ID: <20100825202458.GE14874@lenovo>
Date:	Thu, 26 Aug 2010 00:24:58 +0400
From:	Cyrill Gorcunov <gorcunov@...il.com>
To:	Don Zickus <dzickus@...hat.com>
Cc:	Ingo Molnar <mingo@...e.hu>,
	Robert Richter <robert.richter@....com>,
	Peter Zijlstra <peterz@...radead.org>,
	Lin Ming <ming.m.lin@...el.com>,
	"fweisbec@...il.com" <fweisbec@...il.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"Huang, Ying" <ying.huang@...el.com>,
	Yinghai Lu <yinghai@...nel.org>,
	Andi Kleen <andi@...stfloor.org>
Subject: Re: [PATCH -v3] perf, x86: try to handle unknown nmis with running
	perfctrs

On Wed, Aug 25, 2010 at 04:11:06PM -0400, Don Zickus wrote:
...
> >  Uhhuh. NMI received for unknown reason 00 on CPU 15.
> >  Do you have a strange power saving mode enabled?
> >  Dazed and confused, but trying to continue
> 
> So I found a Nehalem box that can reliably reproduce Ingo's problem using
> something as simple as 'perf top'.  But like above, I am noticing the
> same thing: an extra NMI (PMI?) that comes out of nowhere.
> 
> Looking at the data above, the delta between these NMIs is very small
> compared to the other NMIs.  It almost suggests that this is an extra PMI.
> Considering there are already two CPU errata discussing extra PMIs under
> certain configurations, I wouldn't be surprised if this was a third.
> 
> Cheers,
> Don
> 

Oh. I'm not sure it would be a good idea at all, but maybe we could use
something like Robert's "pmu nmi relaxing time" idea, i.e. a time slice
in which we treat unknown NMIs as coming from the PMU. The difference is
that we would not accept an arbitrary number of them, only a number
matching the PMIs we turned off. Say we handle an NMI and find that
4 events have overflowed: we clear them, arm a timer and wait for
3 unknown NMIs to happen. If they do not happen within some time period,
we clear this waitqueue; if they happen, fully or partially, we destroy
the timer. I.e. almost the same as Robert's idea, but without the TSC?
Just a thought.
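
To make the bookkeeping a bit more concrete, here is a rough user-space
sketch of the idea, not kernel code and not a proposed patch. All names
(struct pmu_nmi_relax, relax_arm, relax_claim_unknown, RELAX_WINDOW) are
made up for illustration, the window length is an arbitrary assumption,
and a plain deadline comparison stands in for a real timer. It also picks
one reading of the "partially happen" case: the window stays open until
either all expected NMIs arrive or the deadline passes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-CPU state; none of these names exist in the kernel. */
struct pmu_nmi_relax {
	unsigned int pending;	/* unknown NMIs we still expect to swallow */
	uint64_t deadline;	/* time after which the expectation is dropped */
};

/* Assumed window length, in arbitrary time units. */
#define RELAX_WINDOW 1000u

/*
 * Called after the PMU NMI handler found and cleared `overflowed`
 * counters at time `now`. One NMI was the real one, so up to
 * overflowed - 1 further NMIs may still arrive with nothing to claim.
 */
static void relax_arm(struct pmu_nmi_relax *r, unsigned int overflowed,
		      uint64_t now)
{
	if (overflowed > 1) {
		r->pending = overflowed - 1;
		r->deadline = now + RELAX_WINDOW;	/* "arm the timer" */
	} else {
		r->pending = 0;
	}
}

/*
 * Called for an NMI that no handler claimed. Returns true if it should
 * be swallowed as a late PMI, false if it is really unknown.
 */
static bool relax_claim_unknown(struct pmu_nmi_relax *r, uint64_t now)
{
	if (r->pending == 0)
		return false;

	if (now > r->deadline) {
		/* Window expired: what the timer callback would have done. */
		r->pending = 0;
		return false;
	}

	/* Once this reaches zero, a real timer would be destroyed here. */
	r->pending--;
	return true;
}
```

With 4 overflowed events, relax_arm() sets pending to 3; the next three
unclaimed NMIs inside the window are swallowed, and a fourth one (or any
that shows up after the deadline) is reported as unknown again.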

	-- Cyrill