linux-kernel - Re: perf : fuzzer-related NMI lockup


Message-ID: <20130730193838.GC23299@pd.tnic>
Date:	Tue, 30 Jul 2013 21:38:38 +0200
From:	Borislav Petkov <bp@...en8.de>
To:	Vince Weaver <vincent.weaver@...ne.edu>
Cc:	linux-kernel@...r.kernel.org,
	Peter Zijlstra <peterz@...radead.org>,
	Paul Mackerras <paulus@...ba.org>,
	Ingo Molnar <mingo@...hat.com>,
	Arnaldo Carvalho de Melo <acme@...stprotocols.net>,
	trinity@...r.kernel.org
Subject: Re: perf : fuzzer-related NMI lockup

On Tue, Jul 30, 2013 at 03:01:27PM -0400, Vince Weaver wrote:
> Hello
> 
> so my perf_fuzzer has been causing problems again.
> 
> After running a while all login shells on the system (even unrelated 
> local ones) get killed.  Nothing is logged when this happens and it 
> doesn't appear to be OOM related.
> 
> In an attempt to find out what was going on I ran the fuzzer with "nohup"
> which led to the following NMI lockup which looks perf related.  The
> system became unusable after this.
> 
> The first WARNING is I think a known issue but I'm including it in the 
> dump in case it is related.  It's the NMI lockup that is the problem.
> 
> There was possibly some sort of RCU message printed to the screen also 
> that didn't make it to the logs but I wasn't able to write it down in 
> time.
> 
> This is on a recent ivybridge mac-mini running 3.11-rc3
> 
> Jul 30 11:08:28 mac-mini kernel: [  651.209212] hrtimer: interrupt took 1152 ns
> Jul 30 11:08:50 mac-mini kernel: [  673.441360] perf samples too long (2557 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
> Jul 30 11:08:58 mac-mini kernel: [  680.886547] perf samples too long (5003 > 5000), lowering kernel.perf_event_max_sample_rate to 25000
> Jul 30 11:08:58 mac-mini kernel: [  681.401917] perf samples too long (10002 > 10000), lowering kernel.perf_event_max_sample_rate to 12500

Interesting, I saw something similar today while running

perf top --stdio -a

[47314.677201] perf samples too long (2505 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
[47314.686347] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 9.148 msecs
[47315.946675] perf samples too long (5009 > 5000), lowering kernel.perf_event_max_sample_rate to 25000
[47315.955825] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 9.154 msecs
[47391.116117] Uhhuh. NMI received for unknown reason 21 on CPU 0.
[47391.122034] Do you have a strange power saving mode enabled?
[47391.127731] Dazed and confused, but trying to continue
[53627.692616] Uhhuh. NMI received for unknown reason 31 on CPU 0.
[53627.698547] Do you have a strange power saving mode enabled?
[53627.704202] Dazed and confused, but trying to continue
[64212.289657] usb 1-1.2: USB disconnect, device number 4

along with strange "forgotten" NMIs firing later. The machine is still
running normally after that, though.
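
For anyone comparing logs from several machines, here is a minimal
sketch (not part of perf or the kernel) that pulls the throttling
messages out of a dmesg excerpt. It assumes the message format is
exactly as in the excerpts above; the names THROTTLE_RE,
parse_throttle_events and SAMPLE_LOG are made up for illustration.

```python
import re

# Matches the throttling message as it appears in the logs quoted above.
# Assumption: the wording is exactly as in those excerpts.
THROTTLE_RE = re.compile(
    r"perf samples too long \((\d+) > (\d+)\), "
    r"lowering kernel\.perf_event_max_sample_rate to (\d+)"
)


def parse_throttle_events(log_text):
    """Return a list of (observed, limit, new_rate) integer tuples."""
    return [
        tuple(int(group) for group in match.groups())
        for match in THROTTLE_RE.finditer(log_text)
    ]


# Lines copied from the excerpt above; the NMI line is deliberately
# included to show that non-matching lines are skipped.
SAMPLE_LOG = """\
[47314.677201] perf samples too long (2505 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
[47314.686347] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 9.148 msecs
[47315.946675] perf samples too long (5009 > 5000), lowering kernel.perf_event_max_sample_rate to 25000
"""

if __name__ == "__main__":
    for observed, limit, new_rate in parse_throttle_events(SAMPLE_LOG):
        print(f"observed {observed} > limit {limit}: rate lowered to {new_rate}")
```

In both excerpts in this thread each message halves the sample rate
while the reported limit doubles; the sketch only extracts the numbers
and makes no claim about why.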

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.