Date: Mon, 18 Mar 2019 16:40:09 +0000
From: "Lendacky, Thomas" <Thomas.Lendacky@....com>
To: Peter Zijlstra <peterz@...radead.org>
CC: "x86@...nel.org" <x86@...nel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Arnaldo Carvalho de Melo <acme@...nel.org>,
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
	Ingo Molnar <mingo@...hat.com>,
	Borislav Petkov <bp@...en8.de>,
	Namhyung Kim <namhyung@...nel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Jiri Olsa <jolsa@...hat.com>
Subject: Re: [RFC PATCH v2 2/2] x86/perf/amd: Resolve NMI latency issues when multiple PMCs are active

On 3/18/19 5:40 AM, Peter Zijlstra wrote:
> On Fri, Mar 15, 2019 at 08:41:00PM +0000, Lendacky, Thomas wrote:
>> This issue is different from a previous issue related to perf NMIs that
>> was fixed in commit:
>>   63e6be6d98e1 ("perf, x86: Catch spurious interrupts after disabling counters")
>>
>> The difference here is that the NMI latency can contribute to what appear
>> to be spurious NMIs during the handling of PMC counter overflow while the
>> counter is active as opposed to when the counter is being disabled.
>
> But could we not somehow merge these two cases? The cause is similar
> IIUC. The PMI gets triggered, but then:
>
>  - a previous PMI handler handled our overflow, or
>  - our event gets disabled.
>
> and when our PMI triggers it finds nothing to do.

Yes, let me look into what we can do here.

>
>> +/*
>> + * Because of NMI latency, if multiple PMC counters are active or other sources
>> + * of NMIs are received, the perf NMI handler can handle one or more overflowed
>> + * PMC counters outside of the NMI associated with the PMC overflow. If the NMI
>> + * doesn't arrive at the LAPIC in time to become a pending NMI, then the kernel
>> + * back-to-back NMI support won't be active. This PMC handler needs to take into
>> + * account that this can occur, otherwise this could result in unknown NMI
>> + * messages being issued. Examples of this is PMC overflow while in the NMI
>> + * handler when multiple PMCs are active or PMC overflow while handling some
>> + * other source of an NMI.
>> + *
>> + * Attempt to mitigate this by using the number of active PMCs to determine
>> + * whether to return NMI_HANDLED if the perf NMI handler did not handle/reset
>> + * any PMCs. The per-CPU perf_nmi_counter variable is set to a minimum of the
>> + * number of active PMCs or 2. The value of 2 is used in case an NMI does not
>> + * arrive at the LAPIC in time to be collapsed into an already pending NMI.
>> + */
>> +static int amd_pmu_handle_irq(struct pt_regs *regs)
>> +{
>> +	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
>> +	int active, handled;
>> +
>> +	active = __bitmap_weight(cpuc->active_mask, X86_PMC_IDX_MAX);
>> +	handled = x86_pmu_handle_irq(regs);
>> +
>> +	/*
>> +	 * If no counters are active, reset perf_nmi_counter and return
>> +	 * NMI_DONE
>> +	 */
>> +	if (!active) {
>> +		this_cpu_write(perf_nmi_counter, 0);
>> +		return NMI_DONE;
>> +	}
>
> This will actively render 63e6be6d98e1 void I think. Because that can
> return !0 while !active -- that's rather the whole point of it.

Yup, I didn't take that into account when I changed that. Let me take a
closer look at all this along with your suggestion from the other email.
I'm traveling this week, so it might be a few days.

Thanks,
Tom

>
>> +	/*
>> +	 * If a counter was handled, record the number of possible remaining
>> +	 * NMIs that can occur.
>> +	 */
>> +	if (handled) {
>> +		this_cpu_write(perf_nmi_counter,
>> +			       min_t(unsigned int, 2, active));
>> +
>> +		return handled;
>> +	}
>> +
>> +	if (!this_cpu_read(perf_nmi_counter))
>> +		return NMI_DONE;
>> +
>> +	this_cpu_dec(perf_nmi_counter);
>> +
>> +	return NMI_HANDLED;
>> +}
>
>
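For readers following the discussion outside the kernel tree, here is a minimal standalone sketch of the absorption logic quoted above. It simulates one CPU, so a plain global stands in for the per-CPU perf_nmi_counter; the function name handle_nmi and its arguments are illustrative, not from the patch, and the real handler returns the count of handled events rather than a bare NMI_HANDLED.

/*
 * Standalone sketch (not kernel code) of the NMI-absorption mitigation.
 * Assumes a single simulated CPU; names other than perf_nmi_counter,
 * NMI_DONE and NMI_HANDLED are hypothetical.
 */
#include <stdio.h>

#define NMI_DONE	0
#define NMI_HANDLED	1

static int perf_nmi_counter;	/* models the per-CPU perf_nmi_counter */

/* active: number of active PMCs; handled: overflows handled/reset this NMI */
static int handle_nmi(int active, int handled)
{
	if (!active) {
		/* No counters active: nothing to absorb, not our NMI. */
		perf_nmi_counter = 0;
		return NMI_DONE;
	}

	if (handled) {
		/* Arm absorption for at most min(active, 2) follow-up NMIs. */
		perf_nmi_counter = active < 2 ? active : 2;
		return NMI_HANDLED;
	}

	/* Nothing to do: claim the NMI only if one was still expected. */
	if (!perf_nmi_counter)
		return NMI_DONE;

	perf_nmi_counter--;
	return NMI_HANDLED;
}

int main(void)
{
	/*
	 * Two active PMCs overflow; the first NMI handles both overflows,
	 * the next two late NMIs are absorbed, a fourth is rejected.
	 */
	printf("NMI 1 -> %d (handled both overflows)\n", handle_nmi(2, 2));
	printf("NMI 2 -> %d (late, absorbed)\n",         handle_nmi(2, 0));
	printf("NMI 3 -> %d (late, absorbed)\n",         handle_nmi(2, 0));
	printf("NMI 4 -> %d (not ours, NMI_DONE)\n",     handle_nmi(2, 0));
	return 0;
}

The clamp to 2 mirrors the comment in the patch: even if an NMI fails to arrive at the LAPIC in time to collapse into an already pending one, at most a couple of stragglers per burst need to be claimed before the handler goes back to reporting NMI_DONE.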