Message-ID: <20110414080506.GA23965@elte.hu>
Date:	Thu, 14 Apr 2011 10:05:06 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Cyrill Gorcunov <gorcunov@...nvz.org>
Cc:	Peter Zijlstra <a.p.zijlstra@...llo.nl>, maciej.rutecki@...il.com,
	Shaun Ruffell <sruffell@...ium.com>,
	Don Zickus <dzickus@...hat.com>, linux-kernel@...r.kernel.org,
	Lin Ming <ming.m.lin@...el.com>
Subject: Re: [regression 2.6.39-rc2][bisected] "perf, x86: P4 PMU - Read
 proper MSR register to catch" and NMIs


* Cyrill Gorcunov <gorcunov@...nvz.org> wrote:

> On Thu, Apr 14, 2011 at 10:47 AM, Ingo Molnar <mingo@...e.hu> wrote:
> >
> > * Cyrill Gorcunov <gorcunov@...nvz.org> wrote:
> >
> >> -     apic_write(APIC_LVTPC, APIC_DM_NMI);
> >>
> >>       handled = x86_pmu.handle_irq(args->regs);
> >>       if (!handled)
> >>               return NOTIFY_DONE;
> >>
> >> +     /*
> >> +      * Unmasking should be done after the IRQ is handled, otherwise
> >> +      * there is a race between clearing of the counter overflow
> >> +      * flag and LVT entry unmasking (which might lead to double
> >> +      * NMI generation).
> >> +      */
> >> +     apic_write(APIC_LVTPC, APIC_DM_NMI);
> >
> > Here we could leak a masked IRQ through the !handled path. If we got an LVTPC
> > IRQ we had better handle it and unmask the LVTPC unconditionally - regardless
> > of whether we consider it 'handled' or not from the kernel POV ...
> >
> > Thanks,
> >
> >        Ingo
> 
> If no counter has overflowed, I believe we should not poke LVTPC until we are
> sure the NMI comes from it. Counter overflow is the only sign that the NMI
> came from LVTPC, as far as I can tell. I also see a possible race: if the
> counter signal has reached LVTPC and is still being processed inside the APIC
> chip (which may take some time before the real NMI signal appears in the CPU),
> it is hard to tell what we get as a result: a double NMI again or something
> else.

Well, we unmasked unconditionally before. If we unmask conditionally now, we 
risk not unmasking. We risk a completely stuck PMU (there won't ever come *any* 
NMI from it if we ever forget to unmask) versus spurious NMIs.
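
To illustrate the trade-off, here is a toy user-space model, not kernel code.
Every name in it is hypothetical, and it assumes that delivering a PMU NMI
leaves the LVTPC entry masked until software writes it again. The first NMI is
deliberately reported as not handled, to show what the early-return path does
to later overflows:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in the kernel. */
struct toy_pmu {
	bool lvtpc_masked;	/* models the mask bit of the LVTPC entry */
	int  nmis_delivered;	/* NMIs that actually reached the handler */
};

/* Stand-in for x86_pmu.handle_irq(): true if a counter overflow was seen. */
bool toy_handle_irq(bool overflow_seen)
{
	return overflow_seen;
}

/*
 * A counter overflow only produces an NMI while LVTPC is unmasked.
 * Assumption of this model: delivery itself masks the entry.
 */
bool toy_raise_nmi(struct toy_pmu *pmu)
{
	if (pmu->lvtpc_masked)
		return false;
	pmu->lvtpc_masked = true;
	pmu->nmis_delivered++;
	return true;
}

/* Conditional variant: unmask only after the IRQ was reported as handled. */
void toy_nmi_conditional(struct toy_pmu *pmu, bool overflow_seen)
{
	if (!toy_handle_irq(overflow_seen))
		return;			/* early return leaves LVTPC masked */
	pmu->lvtpc_masked = false;
}

/* Unconditional variant: unmask whether or not the IRQ was 'handled'. */
void toy_nmi_unconditional(struct toy_pmu *pmu, bool overflow_seen)
{
	(void)toy_handle_irq(overflow_seen);
	pmu->lvtpc_masked = false;
}

/*
 * Generate 'rounds' overflow events, with the first one reported as
 * unhandled, and return how many NMIs were actually delivered.
 */
int toy_run(void (*handler)(struct toy_pmu *, bool), int rounds)
{
	struct toy_pmu pmu = { .lvtpc_masked = false, .nmis_delivered = 0 };

	assert(rounds > 0);
	for (int i = 0; i < rounds; i++) {
		if (toy_raise_nmi(&pmu))
			handler(&pmu, i != 0);
	}
	return pmu.nmis_delivered;
}
```

In this model, toy_run(toy_nmi_conditional, 5) delivers a single NMI and then
stays stuck, while toy_run(toy_nmi_unconditional, 5) delivers all five. The
model says nothing about the double-NMI race itself; it only shows why a
forgotten unmask is the worse failure mode.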

Maybe we can do it - but it will need a lot of testing on a lot of CPU types to 
make sure there are no other CPU quirks in this area ...

So unless the conditional unmasking fixes a real bug (in kgdb or elsewhere), 
let's unmask unconditionally now to fix the P4 regression in .39 - and queue up 
a *separate* patch that moves it even further down and makes it conditional - 
but queue that up for .40.

Thanks,

	Ingo