Message-ID: <20130916162926.GA12926@twins.programming.kicks-ass.net>
Date:	Mon, 16 Sep 2013 18:29:26 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Ingo Molnar <mingo@...nel.org>
Cc:	eranian@...il.com, Linus Torvalds <torvalds@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Arnaldo Carvalho de Melo <acme@...radead.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Andi Kleen <andi@...stfloor.org>
Subject: Re: PEBS bug on HSW: "Unexpected number of pebs records 10" (was:
 Re: [GIT PULL] perf changes for v3.12)

On Mon, Sep 16, 2013 at 05:41:46PM +0200, Ingo Molnar wrote:
> 
> * Stephane Eranian <eranian@...glemail.com> wrote:
> 
> > Hi,
> > 
> > Some updates on this problem.
> > I have been running tests all week-end long on my HSW.
> > I can reproduce the problem. What I know:
> > 
> > - It is not linked with callchain
> > - The extra entries are valid
> > - The reset values are still zeroes
> > - The problem does not happen on SNB with the same test case
> > - The PMU state looks sane when that happens.
> > - The problem occurs even when restricting to one CPU/core (taskset -c 0-3)
> > 
> > So it seems like the threshold is ignored. But I don't understand where 
> > the reset values are coming from. So it looks more like a bug in 
> > micro-code where under certain circumstances multiple entries get 
> > written.
> 
> Either multiple entries are written, or the PMI/NMI is not asserted as it 
> should be?

No, both :-)

> > Something must be happening with the interrupt or HT. I will disable HT 
> > next and also disable the NMI watchdog.
> 
> Yes, interaction with the NMI watchdog events might also be possible.
> 
> If it's truly just the threshold that is broken occasionally in a 
> statistically insignificant manner then the bug is relatively benign and 
> we could work it around in the kernel by ignoring excess entries.
> 
> In that case we should probably not annoy users with the scary kernel 
> warning and instead increase a debug count somewhere so that it's still 
> detectable.

It's not just a broken threshold. When a PEBS event happens it can re-arm
itself, but only if you program a RESET value !0. We don't do that, so
each counter should only ever fire once.

We must do this because PEBS is broken on NHM+ in that
pebs_record::status is a direct copy of the overflow status field at
the time of the assist. If you use the RESET thing, nothing will clear
the status bits and you cannot demux the PEBS events back to the event
that generated them.
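
A minimal sketch of that demux problem, assuming one status bit per
counter. pebs_status_to_counter() and nr_counters are hypothetical names
made up for illustration; this is not the kernel's code. A record can
only be attributed when exactly one status bit is set:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not kernel code: map the status word of a PEBS
 * record back to the counter that generated it. Returns the counter
 * index, or -1 when the record cannot be attributed.
 */
static int pebs_status_to_counter(uint64_t status, int nr_counters)
{
	assert(nr_counters > 0 && nr_counters <= 64);

	/* No bit set, or more than one bit set: ambiguous. */
	if (status == 0 || (status & (status - 1)) != 0)
		return -1;

	for (int i = 0; i < nr_counters; i++)
		if (status & (UINT64_C(1) << i))
			return i;

	return -1;	/* the single set bit is outside the PEBS counters */
}
```

In this model a status of 0x01 maps to counter 0, while a stale 0x03
maps to nothing.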

Worse, since it's the overflow that arms the assist, and the assist
happens some undefined number of cycles after that event, it is
possible for another assist to happen first.

That is, suppose both CNT0 and CNT1 have PEBS enabled and CNT0 overflows
first; it is then possible to find the CNT1 entry first in the buffer,
with both of them having status := 0x03.
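
The scenario can be played through in a toy model. Everything here
(struct pmu_model, overflow(), assist(), the source field) is invented
for illustration and is only a sketch of the behaviour described above,
not of the real hardware: an overflow sets a status bit, and a later
assist copies the whole status word into the record without clearing
anything.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_RECORDS 8

struct pebs_rec {
	int      source;	/* ground truth; real software cannot see this */
	uint64_t status;	/* overflow status as copied by the assist */
};

struct pmu_model {
	uint64_t        ovf_status;
	struct pebs_rec buf[MAX_RECORDS];
	int             nr;
};

/* Counter overflow: sets the status bit, which arms the assist. */
static void overflow(struct pmu_model *p, int cnt)
{
	p->ovf_status |= UINT64_C(1) << cnt;
}

/* The assist fires some time later and snapshots the status word. */
static void assist(struct pmu_model *p, int cnt)
{
	assert(p->nr < MAX_RECORDS);
	p->buf[p->nr].source = cnt;
	p->buf[p->nr].status = p->ovf_status;
	p->nr++;
}

/* CNT0 overflows first, but the CNT1 assist lands first. */
static void late_assist_scenario(struct pmu_model *p)
{
	overflow(p, 0);
	overflow(p, 1);
	assist(p, 1);
	assist(p, 0);
}
```

After late_assist_scenario() the buffer holds the CNT1 record ahead of
the CNT0 record, and neither status word says which is which.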

Complete and utter trainwreck.

This is why we have a threshold of 1 and use NMI for PMI even for pure
PEBS; it minimizes the complete clusterfuck described above.
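
A rough sketch of what a threshold of 1 means for the buffer, assuming a
debug-store style layout with a base, a write index and an interrupt
threshold. struct ds_model and the ds_*() helpers are hypothetical, and
the record size and counter count are left as parameters because the
real values depend on the CPU:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the PEBS part of the debug store area. */
struct ds_model {
	uint64_t pebs_buffer_base;
	uint64_t pebs_index;			/* next free slot */
	uint64_t pebs_interrupt_threshold;	/* PMI once index gets here */
};

/* Request the PMI after nr_records records have been written. */
static void ds_set_threshold(struct ds_model *ds, uint64_t record_size,
			     unsigned int nr_records)
{
	assert(record_size > 0);
	ds->pebs_index = ds->pebs_buffer_base;
	ds->pebs_interrupt_threshold =
		ds->pebs_buffer_base + (uint64_t)nr_records * record_size;
}

/* Number of records sitting in the buffer when the handler runs. */
static unsigned int ds_pending_records(const struct ds_model *ds,
				       uint64_t record_size)
{
	return (unsigned int)((ds->pebs_index - ds->pebs_buffer_base) /
			      record_size);
}

/*
 * With a threshold of one record and a zero reset value, each PEBS
 * counter should contribute at most one record, so finding more records
 * than PEBS counters is unexpected.
 */
static int ds_count_is_unexpected(const struct ds_model *ds,
				  uint64_t record_size,
				  unsigned int nr_pebs_counters)
{
	return ds_pending_records(ds, record_size) > nr_pebs_counters;
}
```

With, say, 4 PEBS counters (an assumed number), finding 10 records in
the buffer is the kind of count this check would flag.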
