linux-kernel - [PATCH V2] x86: mce: fix kernel panic when check

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Message-Id: <20180228203131.3176-1-kkamagui@gmail.com>
Date:   Thu,  1 Mar 2018 05:31:31 +0900
From:   Seunghun Han <kkamagui@...il.com>
To:     Tony Luck <tony.luck@...el.com>, Borislav Petkov <bp@...en8.de>
Cc:     linux-edac@...r.kernel.org, linux-kernel@...r.kernel.org,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Seunghun Han <kkamagui@...il.com>
Subject: [PATCH V2] x86: mce: fix kernel panic when check_interval is changed

I am Seunghun Han and a senior security researcher at National Security
Research Institute of South Korea.

I found a security issue which can make kernel panic in userspace. After
analyzing the issue carefully, I found that MCE driver in the kernel has a
problem which can be occurred in SMP environment.

The check_interval file in
/sys/devices/system/machinecheck/machinecheck<cpu number> directory is a
global timer value for MCE polling. If it is changed by one CPU, MCE driver
in kernel calls mce_restart() function and broadcasts the event to other
CPUs to delete and restart MCE polling timer.

The __mcheck_cpu_init_timer() function which is called by mce_restart()
function initializes the mce_timer variable, and the "lock" in mce_timer is
also reinitialized. If more than one CPU write a specific value to
check_interval file concurrently, one can initialize the "lock" in mce_timer
while the others are handling "lock" in mce_timer. This problem causes some
synchronization errors such as kernel panic and kernel hang.

It is a security problem because the attacker can make kernel panic by
writing a value to the check_interval file in userspace, and it can be used
for Denial-of-Service (DoS) attack.

To fix this problem, I added a mce_sysfs_mutex to serialize requests.

Signed-off-by: Seunghun Han <kkamagui@...il.com>
---
Changes since v1: add mce_sysfs_mutex according to review result.

 arch/x86/kernel/cpu/mcheck/mce.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 706584681a4c..bee0795a3b8c 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -55,6 +55,7 @@
 #include "mce-internal.h"

 static DEFINE_MUTEX(mce_log_mutex);
+static DEFINE_MUTEX(mce_sysfs_mutex);

 #define CREATE_TRACE_POINTS
 #include <trace/events/mce.h>
@@ -2045,8 +2046,11 @@ static void mce_enable_ce(void *all)
 		return;
 	cmci_reenable();
 	cmci_recheck();
-	if (all)
+	if (all) {
+		mutex_lock(&mce_sysfs_mutex);
 		__mcheck_cpu_init_timer();
+		mutex_unlock(&mce_sysfs_mutex);
+	}
 }

 static struct bus_type mce_subsys = {
@@ -2132,8 +2136,14 @@ static ssize_t store_int_with_restart(struct device *s,
 				      struct device_attribute *attr,
 				      const char *buf, size_t size)
 {
+	unsigned long old_check_interval = check_interval;
 	ssize_t ret = device_store_int(s, attr, buf, size);
+
+	if (check_interval == old_check_interval)
+		return ret;
+	mutex_lock(&mce_sysfs_mutex);
 	mce_restart();
+	mutex_unlock(&mce_sysfs_mutex);
 	return ret;
 }

-- 
2.16.2