Open Source and information security mailing list archives
Date: Tue, 10 Sep 2013 13:53:06 +0200
From: Ingo Molnar <mingo@...nel.org>
To: eranian@...il.com
Cc: Peter Zijlstra <peterz@...radead.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Arnaldo Carvalho de Melo <acme@...radead.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Andi Kleen <andi@...stfloor.org>
Subject: PEBS bug on HSW: "Unexpected number of pebs records 10" (was: Re: [GIT PULL] perf changes for v3.12)

* Stephane Eranian <eranian@...glemail.com> wrote:

> Hi,
>
> And what was the perf record command line for this crash?

AFAICS it wasn't a crash but the WARN_ON() in intel_pmu_drain_pebs_hsw(),
at arch/x86/kernel/cpu/perf_event_intel_ds.c:1003.

	at  = (struct pebs_record_hsw *)(unsigned long)ds->pebs_buffer_base;
	top = (struct pebs_record_hsw *)(unsigned long)ds->pebs_index;

	n = top - at;
	if (n <= 0)
		return;

	/*
	 * Should not happen, we program the threshold at 1 and do not
	 * set a reset value.
	 */
	WARN_ONCE(n > x86_pmu.max_pebs_events,
		  "Unexpected number of pebs records %d\n", n);

The command line Linus used was probably close to:

	perf record -e cycles:pp -g make -j64 bzImage

i.e. PEBS precise profiling with call chains; the LBR is used to figure
out the real instruction. There is no '-a' per-CPU profiling option, so
we get high frequency per task PMU context switching.

Note that AFAIK neither the kernel nor user-space used any TSX
extensions, so this is the Haswell PMU in pure compatibility mode.

My (wild) guess is that, unless all of us missed some subtle race in the
PEBS code, it's an (unknown?)
erratum: the hardware got confused by the high frequency PMU switches,
in this particular case where we got a new PMI right after a very short
interval was programmed:

>> Call Trace:
>>  <NMI>  [<ffffffff815fc637>] dump_stack+0x45/0x56
>>  [<ffffffff81051e78>] warn_slowpath_common+0x78/0xa0
>>  [<ffffffff81051ee7>] warn_slowpath_fmt+0x47/0x50
>>  [<ffffffff8101b051>] intel_pmu_drain_pebs_hsw+0x91/0xa0
>>  [<ffffffff8101c5d0>] intel_pmu_handle_irq+0x210/0x390
>>  [<ffffffff81604deb>] perf_event_nmi_handler+0x2b/0x50
>>  [<ffffffff81604670>] nmi_handle.isra.3+0x80/0x180
>>  [<ffffffff81604840>] do_nmi+0xd0/0x310
>>  [<ffffffff81603d37>] end_repeat_nmi+0x1e/0x2e
>>  <<EOE>>  [<ffffffff810167df>] perf_events_lapic_init+0x2f/0x40
>>  [<ffffffff81016a50>] x86_pmu_enable+0x260/0x310
>>  [<ffffffff81111d87>] perf_pmu_enable+0x27/0x30
>>  [<ffffffff81112140>] perf_event_context_sched_in+0x80/0xc0
>>  [<ffffffff811127eb>] __perf_event_task_sched_in+0x16b/0x180
>>  [<ffffffff8107c300>] finish_task_switch+0x70/0xa0
>>  [<ffffffff81600f48>] __schedule+0x368/0x7c0
>>  [<ffffffff816013c4>] schedule+0x24/0x70

Note that due to per task profiling the default (long, about 1 KHz)
interval can get chopped up, which can result in a very small period
value being reprogrammed at PMU-sched-in time.

That kind of high-freq back-to-back activity could, in theory, confuse
the PEBS hardware. Or the kernel :-)

Thanks,

	Ingo