Message-ID: <20110325093228.GB13640@elte.hu>
Date: Fri, 25 Mar 2011 10:32:28 +0100
From: Ingo Molnar <mingo@...e.hu>
To: Eric Dumazet <eric.dumazet@...il.com>
Cc: Andi Kleen <andi@...stfloor.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Jack Steiner <steiner@....com>,
Jan Beulich <JBeulich@...ell.com>,
Borislav Petkov <bp@...64.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Nick Piggin <npiggin@...nel.dk>,
"x86@...nel.org" <x86@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Andrew Morton <akpm@...ux-foundation.org>,
Ingo Molnar <mingo@...hat.com>, tee@....com,
Nikanth Karthikesan <knikanth@...e.de>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"H. Peter Anvin" <hpa@...or.com>
Subject: Re: [PATCH RFC] x86: avoid atomic operation in test_and_set_bit_lock
if possible
* Eric Dumazet <eric.dumazet@...il.com> wrote:
> On Friday, 25 March 2011 at 00:56 +0100, Andi Kleen wrote:
> > > never EVER seen any good explanation of why that particular sh*t
> > > argument would be true. It seems to be purely about politics, where
> > > some idiotic vendor (namely HP) has convinced Intel that they really
> > > need it. To the point where some engineers seem to have bought into
> > > the whole thing and actually believe that fairy tale ("firmware can do
> > > better" - hah! They must be feeding people some bad drugs at the
> > > cafeteria)
> >
> > For the record I don't think it's a good idea for the BIOS to do
> > this (and I'm not aware of any engineer who does),
> > but I think Linux should do better than just disabling PMU use when
> > this happens.
> >
> > However I suspect taking over SCI would cause endless problems
> > and is very likely not a good idea.
>
> I tried many different changes in BIOS and all failed (the machine is
> damn slow at boot, this takes ages).
>
> I am stuck :(
Could you please try the patch below?
Thanks,
Ingo
------------------->
From 14df27334ac47a5cec67fb2238d14499346acc38 Mon Sep 17 00:00:00 2001
From: Ingo Molnar <mingo@...e.hu>
Date: Fri, 25 Mar 2011 10:24:23 +0100
Subject: [PATCH] perf, x86: Complain louder about BIOSen corrupting CPU/PMU state and continue

Eric Dumazet reported that hardware PMU events do not work on his
system, due to the BIOS corrupting PMU state:

  Performance Events: PEBS fmt0+, Core2 events, Broken BIOS detected, using software events only.
  [Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 186 is 43003c)

Linus suggested that we continue in the face of such BIOS-induced CPU
state corruption:

  http://lkml.org/lkml/2011/3/24/608

Such BIOSes will have to be fixed - developers rely on a working and fully
capable PMU, and a BIOS interfering with CPU state is simply not acceptable.

So this patch changes perf to continue when it detects such BIOS
interaction. Some hardware events may be unreliable due to the BIOS writing
and re-writing them - there's not much the kernel can do about that.

Reported-by: Eric Dumazet <eric.dumazet@...il.com>
Suggested-by: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc: Arnaldo Carvalho de Melo <acme@...hat.com>
Cc: Frederic Weisbecker <fweisbec@...il.com>
Cc: Mike Galbraith <efault@....de>
Cc: Steven Rostedt <rostedt@...dmis.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@...e.hu>
---
arch/x86/kernel/cpu/perf_event.c | 9 +++++++--
1 files changed, 7 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index ec46eea..eb00677 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -500,12 +500,17 @@ static bool check_hw_exists(void)
 	return true;

 bios_fail:
-	printk(KERN_CONT "Broken BIOS detected, using software events only.\n");
+	/*
+	 * We still allow the PMU driver to operate:
+	 */
+	printk(KERN_CONT "Broken BIOS detected, complain to your hardware vendor.\n");
 	printk(KERN_ERR FW_BUG "the BIOS has corrupted hw-PMU resources (MSR %x is %Lx)\n", reg, val);
-	return false;
+
+	return true;

 msr_fail:
 	printk(KERN_CONT "Broken PMU hardware detected, using software events only.\n");
+
 	return false;
 }
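
For readers who want to see what the reported value "MSR 186 is 43003c" encodes, here is a minimal user-space sketch. It is not kernel code. It assumes the architectural perfmon event-select layout (event in bits 0-7, unit mask in bits 8-15, USR bit 16, OS bit 17, enable bit 22) and assumes that the BIOS check fires when the enable bit is already set at boot. The names decode_evtsel(), policy_for() and the pmu_policy enum are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed architectural event-select bit layout (illustration only). */
#define EVTSEL_EVENT_MASK 0x000000ffULL  /* bits 0-7: event select */
#define EVTSEL_UMASK_MASK 0x0000ff00ULL  /* bits 8-15: unit mask   */
#define EVTSEL_USR        (1ULL << 16)   /* count in user mode     */
#define EVTSEL_OS         (1ULL << 17)   /* count in kernel mode   */
#define EVTSEL_ENABLE     (1ULL << 22)   /* counter is enabled     */

/* Hypothetical helper type: the decoded fields of one event-select value. */
struct evtsel {
	unsigned int event;
	unsigned int umask;
	bool usr;
	bool os;
	bool enabled;
};

/* Split a raw event-select value into its fields. */
static struct evtsel decode_evtsel(uint64_t val)
{
	struct evtsel e;

	e.event   = (unsigned int)(val & EVTSEL_EVENT_MASK);
	e.umask   = (unsigned int)((val & EVTSEL_UMASK_MASK) >> 8);
	e.usr     = (val & EVTSEL_USR) != 0;
	e.os      = (val & EVTSEL_OS) != 0;
	e.enabled = (val & EVTSEL_ENABLE) != 0;
	return e;
}

/* Hypothetical model of the outcome, not the kernel's own types. */
enum pmu_policy { PMU_SOFTWARE_ONLY, PMU_HARDWARE };

/*
 * Model of the decision the patch changes: if the counter is already
 * enabled at boot (assumed to mean the BIOS is using it), the old
 * behaviour falls back to software events and the new one continues.
 */
static enum pmu_policy policy_for(uint64_t val, bool continue_on_bios_use)
{
	if (!(val & EVTSEL_ENABLE))
		return PMU_HARDWARE;
	return continue_on_bios_use ? PMU_HARDWARE : PMU_SOFTWARE_ONLY;
}

/* Print the decoded fields in a single line. */
static void print_evtsel(const struct evtsel *e)
{
	assert(e != NULL);
	printf("event=0x%02x umask=0x%02x usr=%d os=%d enabled=%d\n",
	       e->event, e->umask, e->usr, e->os, e->enabled);
}
```

Calling decode_evtsel(0x43003c) and passing the result to print_evtsel() shows event 0x3c with unit mask 0, counting in both user and kernel mode, with the enable bit set. In other words the counter was already programmed and running before the kernel touched it, which is the condition this patch now tolerates instead of disabling the hardware PMU.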