Message-Id: <1227788198.4454.1498.camel@twins>
Date:	Thu, 27 Nov 2008 13:16:38 +0100
From:	Peter Zijlstra <peterz@...radead.org>
To:	eranian@...il.com
Cc:	Andi Kleen <andi@...stfloor.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	linux-kernel@...r.kernel.org, akpm@...ux-foundation.org,
	mingo@...e.hu, x86@...nel.org,
	Stephen Rothwell <sfr@...b.auug.org.au>, aris@...hat.com,
	Cyrill Gorcunov <gorcunov@...il.com>, marco@...ux-mips.com
Subject: Re: [patch 05/24] perfmon: X86 generic code (x86)
On Thu, 2008-11-27 at 13:04 +0100, stephane eranian wrote:
> Peter,
> 
> On Thu, Nov 27, 2008 at 12:52 PM, Peter Zijlstra <peterz@...radead.org> wrote:
> > On Thu, 2008-11-27 at 12:35 +0100, stephane eranian wrote:
> >> On Thu, Nov 27, 2008 at 12:31 PM, Andi Kleen <andi@...stfloor.org> wrote:
> >> >> The only reason why I have to deal with NMI is not so much to allow
> >> >> for profiling irq-off regions but because I have to share the PMU with
> >> >> the NMI watchdog. Otherwise I'd have to fail or disable the NMI watchdog
> >> >> on the fly.
> >> >
> >> > The NMI watchdog is now off by default so failing with it enabled
> >> > is fine.
> >>
> >> Yes, but most likely it is on in distro kernels.
> >
> > So? You can disable it on the fly when there is a perfmon user.
> >
> Yes, you can. There is clearly an interface to do this. I think this is the
> best solution. I know it can work because I experimented with this approach
> as recently as last month. But I ran into a bug which I reported on LKML. I did
> not provide a patch because I did not fully understand the connection to
> suspend/resume.
> 
> The bug has to do with some obscure suspend/resume sequence in:
> 
> void setup_apic_nmi_watchdog(void *unused)
> {
>         if (__get_cpu_var(wd_enabled))
>                 return;
> 
>         /* cheap hack to support suspend/resume */
>         /* if cpu0 is not active neither should the other cpus */
>         if (smp_processor_id() != 0 && atomic_read(&nmi_active) <= 0)
>                 return;
> 
> Basically, when you re-enable the NMI watchdog, it is not always re-enabled
> correctly on all CPUs; it depends on the order in which they process the IPI.
Hmm, either we lose that bit and fix the suspend/resume bit properly,
or we can send the IPIs one by one in the correct order ;-)

Dunno, CC'ed all the folks who touched it last.
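
For anyone who wants to see the ordering dependence without a multi-socket box, here is a rough user-space sketch of the guard logic quoted above. It is not the kernel code: the plain int array and counter stand in for the per-CPU wd_enabled flag and the atomic nmi_active, NR_CPUS is fixed at 4, and setup_watchdog_on() and enable_in_order() are made-up names. It also assumes that a successful setup sets the flag and bumps the counter, which is not shown in the quoted excerpt.

```c
#include <stddef.h>

#define NR_CPUS 4

static int wd_enabled[NR_CPUS];	/* stand-in for the per-CPU wd_enabled flag */
static int nmi_active;		/* stand-in for the atomic nmi_active counter */

/* Model of the quoted guard logic, as if run on the given CPU. */
static void setup_watchdog_on(int cpu)
{
	if (wd_enabled[cpu])
		return;

	/* the quoted "cheap hack": non-boot CPUs bail out if nobody is active yet */
	if (cpu != 0 && nmi_active <= 0)
		return;

	/* ASSUMPTION: on success the flag is set and the counter incremented */
	wd_enabled[cpu] = 1;
	nmi_active++;
}

/*
 * Start from "watchdog disabled everywhere", let the CPUs handle the
 * enable request in the given order, and return how many ended up enabled.
 */
static int enable_in_order(const int *order, size_t n)
{
	size_t i;
	int enabled = 0;

	for (i = 0; i < NR_CPUS; i++)
		wd_enabled[i] = 0;
	nmi_active = 0;

	for (i = 0; i < n; i++)
		setup_watchdog_on(order[i]);

	for (i = 0; i < NR_CPUS; i++)
		enabled += wd_enabled[i];

	return enabled;
}
```

In this model, any CPU that handles the request before cpu0 hits the early return and stays disabled, so only the orders where cpu0 goes first bring the watchdog back everywhere. That is the case for sending the IPIs one by one with cpu0 first, if the hack stays.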
