Date:	Wed, 27 May 2009 12:07:50 +0200
From:	Andi Kleen <andi@...stfloor.org>
To:	Hidetoshi Seto <seto.hidetoshi@...fujitsu.com>
Cc:	linux-kernel@...r.kernel.org, hpa@...or.com, x86@...nel.org,
	Andi Kleen <ak@...ux.intel.com>
Subject: Re: [PATCH] x86: MCE: Fix for mce_panic_timeout

Hidetoshi Seto <seto.hidetoshi@...fujitsu.com> writes:

> This fixes:

Thanks, I had already fixed it on my own.

Updated patch appended.
>
>  - In case of panic_timeout > 0 and mce_bootlog == 0.
>    System should reboot after panic, but it doesn't on mce panic because
>    current mce code overwrite panic_timeout to 0.

Nope, with bootlog==0 it should _not_ automatically reboot on panic.
Automatic rebooting mainly makes sense with boot logging; otherwise
you will likely lose the information. Or at least the kernel
cannot know whether you lose information or not, so it has to err on
the safe side.

I changed it now to only override when panic_timeout == 0,
i.e. when the user didn't set anything;
that's probably the most sensible semantics anyway.
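
The intended semantics can be sketched as a small userspace model. This is
only an illustration, not the kernel code: the variable names mirror the
patch below, but model_init() and model_mce_panic_timeout() are hypothetical
helpers, and the 0 fallback assumes mce_panic_timeout stays at its
zero-initialized value when boot logging is off.

```c
/* Illustrative userspace model only; not the kernel implementation. */

/* Stand-ins for the kernel variables of the same names. */
static int panic_timeout;	/* 0 means: user set nothing, wait forever */
static int mce_panic_timeout;	/* MCE-specific default, 0 unless enabled */

/* Hypothetical helper mirroring the init hunk of the patch. */
static void model_init(int mce_bootlog)
{
	/* Only default to a 30s reboot when boot logging is not disabled. */
	mce_panic_timeout = (mce_bootlog != 0) ? 30 : 0;
}

/*
 * Hypothetical helper: returns the timeout panic() would see on the
 * machine check path for a given user setting and bootlog state.
 */
static int model_mce_panic_timeout(int user_panic_timeout, int mce_bootlog)
{
	panic_timeout = user_panic_timeout;
	model_init(mce_bootlog);

	/* Override only when the user did not choose a timeout. */
	if (panic_timeout == 0)
		panic_timeout = mce_panic_timeout;

	return panic_timeout;
}
```

In this model a user-supplied timeout is never replaced, and with
bootlog == 0 an unset timeout stays at 0, i.e. the panic waits forever.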

-Andi

---

x86: MCE: Default to panic timeout for machine checks v3

Fatal machine checks can be logged to disk after boot, but only if
the system did a warm reboot. That's unfortunately difficult with the
default panic behaviour, which waits forever, so the admin has to
press the power button because modern systems usually lack a reset button.
Power cycling clears the machine checks in the registers and makes
it impossible to log them.

This patch changes the default for machine check panic to always
reboot after 30s. Then the mce can be successfully logged after
reboot.

I believe this will improve the machine check experience for any
system running the X server.

This depends on successful boot logging of MCEs. That currently
only works on Intel systems. On AMD there are quite a lot of systems
around which leave junk in the machine check registers after boot,
so it's disabled there. These systems will continue to default
to an endlessly waiting panic.

v2: Only force panic timeout when it's shorter (H.Seto)
v3: Only override the panic timeout when it is zero, i.e. no earlier
timeout was set (based on comment by H.Seto)

Signed-off-by: Andi Kleen <ak@...ux.intel.com>

---
 arch/x86/kernel/cpu/mcheck/mce.c |    7 +++++++
 1 file changed, 7 insertions(+)

Index: linux/arch/x86/kernel/cpu/mcheck/mce.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/mcheck/mce.c	2009-05-27 11:59:03.000000000 +0200
+++ linux/arch/x86/kernel/cpu/mcheck/mce.c	2009-05-27 12:01:07.000000000 +0200
@@ -82,6 +82,7 @@
 static int			rip_msr;
 static int			mce_bootlog = -1;
 static int			monarch_timeout = -1;
+static int			mce_panic_timeout;
 
 static char			trigger[128];
 static char			*trigger_argv[2] = { trigger, NULL };
@@ -203,6 +204,8 @@
 	local_irq_enable();
 	while (timeout-- > 0)
 		udelay(1);
+	if (panic_timeout == 0)
+		panic_timeout = mce_panic_timeout;
 	panic("Panicing machine check CPU died");
 }
 
@@ -240,6 +243,8 @@
 		printk(KERN_EMERG "Some CPUs didn't answer in synchronization\n");
 	if (exp)
 		printk(KERN_EMERG "Machine check: %s\n", exp);
+	if (panic_timeout == 0)
+		panic_timeout = mce_panic_timeout;
 	panic(msg);
 }
 
@@ -1100,6 +1105,8 @@
 	}
 	if (monarch_timeout < 0)
 		monarch_timeout = 0;
+	if (mce_bootlog != 0)
+		mce_panic_timeout = 30;
 }
 
 static void __cpuinit mce_ancient_init(struct cpuinfo_x86 *c)


-- 
ak@...ux.intel.com -- Speaking for myself only.
