Date:	Tue, 2 Aug 2016 15:31:28 +0200 (CEST)
From:	Jiri Kosina <jikos@...nel.org>
To:	Peter Zijlstra <peterz@...radead.org>
cc:	Ingo Molnar <mingo@...hat.com>,
	Arnaldo Carvalho de Melo <acme@...nel.org>,
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
	x86@...nel.org, linux-kernel@...r.kernel.org
Subject: Re: x86 PMU broken in current Linus' tree

On Tue, 2 Aug 2016, Peter Zijlstra wrote:

> > With current Linus' tree (HEAD == 731c7d3a20), I am getting bogus MSR 
> > write warning during bootup, and kernel panic when shutting PMUs down 
> > during poweroff.
> > 
> > The MSR warning is below, the camera capture of the poweroff panic can be 
> > found at
> > 
> > 	http://www.jikos.cz/jikos/junk/pmu-panic.jpg
> > 
> > The last previous kernel version that I've booted on this particular 
> > machine was 4.7.0-rc4, and it had neither of those symptoms, so I can 
> > eventually bisect if needed.
> > 
> > === [ snip ] ==
> > [    0.136000] smpboot: CPU0: Intel(R) Core(TM)2 Duo CPU     L9400  @ 1.86GHz (family: 0x6, model: 0x17, stepping: 0x6)
> > [    0.136000] Performance Events: PEBS fmt0+, Core2 events, Intel PMU driver.
> > [    0.136000] ... version:                2
> > [    0.136000] ... bit width:              40
> > [    0.136000] ... generic registers:      2
> > [    0.136000] ... value mask:             000000ffffffffff
> > [    0.136000] ... max period:             000000007fffffff
> > [    0.136000] ... fixed-purpose events:   3
> > [    0.136000] ... event mask:             0000000700000003
> > [    0.136000] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
> > [    0.136000] unchecked MSR access error: WRMSR to 0xdf (tried to write 0x000000ff80000001) at rIP: 0xffffffff90004acc (x86_perf_event_set_period+0xdc/0x190)
> 
> 'Curious'.. :/
> 
> x86_perf_event_set_period() only does:
> 
>   wrmsrl(hwc->event_base, (u64)(-left) & x86_pmu.cntval_mask);
> 
> and hwc->event_base ends up being:
> 
>   MSR_ARCH_PERFMON_PERFCTR0 + index
> 
> From which we can deduce that index = 0xdf - 0xc1 = 30, which is
> somewhat larger than the max reported number of counters (2).
> 
> Lemme go see how that can happen.

FTR, I tried the very same kernel on a Xeon E5, and the issue didn't show up 
there. So it might be specific to the older Core2, or at least not a 
completely generic problem.

-- 
Jiri Kosina
SUSE Labs
