lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Fri, 25 Mar 2011 17:16:28 +0100
From:	Robert Richter <robert.richter@....com>
To:	Ingo Molnar <mingo@...e.hu>
CC:	Eric Dumazet <eric.dumazet@...il.com>,
	Andi Kleen <andi@...stfloor.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Jack Steiner <steiner@....com>,
	Jan Beulich <JBeulich@...ell.com>,
	Borislav Petkov <bp@...64.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Nick Piggin <npiggin@...nel.dk>,
	"x86@...nel.org" <x86@...nel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Ingo Molnar <mingo@...hat.com>, "tee@....com" <tee@....com>,
	Nikanth Karthikesan <knikanth@...e.de>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"H. Peter Anvin" <hpa@...or.com>
Subject: Re: [PATCH RFC] x86: avoid atomic operation in
 test_and_set_bit_lock if possible

On 25.03.11 05:32:28, Ingo Molnar wrote:
> 
> * Eric Dumazet <eric.dumazet@...il.com> wrote:
> 
> > Le vendredi 25 mars 2011 à 00:56 +0100, Andi Kleen a écrit :
> > > > never EVER seen any good explanation of why that particular sh*t
> > > > argument would b true. It seems to be purely about politics, where
> > > > some idiotic vendor (namely HP) has convinced Intel that they really
> > > > need it. To the point where some engineers seem to have bought into
> > > > the whole thing and actually believe that fairy tale ("firmware can do
> > > > better" - hah! They must be feeding people some bad drugs at the
> > > > cafeteria)
> > > 
> > > For the record I don't think it's a good idea for the BIOS to do 
> > > this (and I'm not aware of any engineer who does),  
> > > but I think Linux should do better than just disabling PMU use when 
> > > this happens.
> > > 
> > > However I suspect taking over SCI would cause endless problems
> > > and is very likely not a good idea.
> > 
> > I tried many different changes in BIOS and all failed (the machine is
> > damn slow at boot, this takes age).
> > 
> > I am stuck :(
> 
> Could you please try the patch below?
> 
> Thanks,
> 
> 	Ingo
> 
> ------------------->
> From 14df27334ac47a5cec67fb2238d14499346acc38 Mon Sep 17 00:00:00 2001
> From: Ingo Molnar <mingo@...e.hu>
> Date: Fri, 25 Mar 2011 10:24:23 +0100
> Subject: [PATCH] perf, x86: Complain louder about BIOSen corrupting CPU/PMU state and continue
> 
> Eric Dumazet reported that hardware PMU events do not work on his
> system, due to the BIOS corrupting PMU state:
> 
>     Performance Events: PEBS fmt0+, Core2 events, Broken BIOS detected, using software events only.
>     [Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 186 is 43003c)
> 
> Linus suggested that we continue in the face of such BIOS-induced CPU
> state corruption:
> 
>    http://lkml.org/lkml/2011/3/24/608
> 
> Such BIOSes will have to be fixed - developers rely on a working and fully
> capable PMU and BIOS interfering with CPU state is simply not acceptable.
> 
> So this patch changes perf to continue when it detects such BIOS
> interaction, some hardware events may be unreliable due to the BIOS writing
> and re-writing them - there's not much the kernel can do about that.
> 
> Reported-by: Eric Dumazet <eric.dumazet@...il.com>
> Suggested-by: Linus Torvalds <torvalds@...ux-foundation.org>
> Cc: Peter Zijlstra <a.p.zijlstra@...llo.nl>
> Cc: Arnaldo Carvalho de Melo <acme@...hat.com>
> Cc: Frederic Weisbecker <fweisbec@...il.com>
> Cc: Mike Galbraith <efault@....de>
> Cc: Steven Rostedt <rostedt@...dmis.org>
> LKML-Reference: <new-submission>
> Signed-off-by: Ingo Molnar <mingo@...e.hu>
> ---
>  arch/x86/kernel/cpu/perf_event.c |    9 +++++++--
>  1 files changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
> index ec46eea..eb00677 100644
> --- a/arch/x86/kernel/cpu/perf_event.c
> +++ b/arch/x86/kernel/cpu/perf_event.c
> @@ -500,12 +500,17 @@ static bool check_hw_exists(void)
>  	return true;
>  
>  bios_fail:
> -	printk(KERN_CONT "Broken BIOS detected, using software events only.\n");
> +	/*
> +	 * We still allow the PMU driver to operate:
> +	 */
> +	printk(KERN_CONT "Broken BIOS detected, complain to your hardware vendor.\n");
>  	printk(KERN_ERR FW_BUG "the BIOS has corrupted hw-PMU resources (MSR %x is %Lx)\n", reg, val);
> -	return false;
> +
> +	return true;

Ingo, you jump out the loop here. This skips checks on other registers
and msr access. And if we want to continue anyway, checking msr access
becomes more important as the bios may block it.

-Robert

>  
>  msr_fail:
>  	printk(KERN_CONT "Broken PMU hardware detected, using software events only.\n");
> +
>  	return false;
>  }
>  
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 

-- 
Advanced Micro Devices, Inc.
Operating System Research Center

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ