[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <lsq.1414880883.328799341@decadent.org.uk>
Date: Sat, 01 Nov 2014 22:28:03 +0000
From: Ben Hutchings <ben@...adent.org.uk>
To: linux-kernel@...r.kernel.org, stable@...r.kernel.org
CC: akpm@...ux-foundation.org,
"Yasuaki Ishimatsu" <isimatu.yasuaki@...fujitsu.com>,
"Ingo Molnar" <mingo@...nel.org>,
"Prarit Bhargava" <prarit@...hat.com>,
"Toshi Kani" <toshi.kani@...com>,
"Steven Rostedt" <srostedt@...hat.com>,
"Linn Crosetto" <linn@...com>,
"Peter Zijlstra" <peterz@...radead.org>,
"David Rientjes" <rientjes@...gle.com>,
"Borislav Petkov" <bp@...e.de>,
"Wanpeng Li" <wanpeng.li@...ux.intel.com>
Subject: [PATCH 3.2 069/102] sched: Fix unreleased llc_shared_mask bit
during CPU hotplug
3.2.64-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Wanpeng Li <wanpeng.li@...ux.intel.com>
commit 03bd4e1f7265548832a76e7919a81f3137c44fd1 upstream.
The following bug can be triggered by hot adding and removing a large number of
xen domain0's vcpus repeatedly:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000004 IP: [..] find_busiest_group
PGD 5a9d5067 PUD 13067 PMD 0
Oops: 0000 [#3] SMP
[...]
Call Trace:
load_balance
? _raw_spin_unlock_irqrestore
idle_balance
__schedule
schedule
schedule_timeout
? lock_timer_base
schedule_timeout_uninterruptible
msleep
lock_device_hotplug_sysfs
online_store
dev_attr_store
sysfs_write_file
vfs_write
SyS_write
system_call_fastpath
Last level cache shared mask is built during CPU up and the
build_sched_domain() routine takes advantage of it to setup
the sched domain CPU topology.
However, llc_shared_mask is not released during CPU disable,
which leads to an invalid sched domainCPU topology.
This patch fix it by releasing the llc_shared_mask correctly
during CPU disable.
Yasuaki also reported that this can happen on real hardware:
https://lkml.org/lkml/2014/7/22/1018
His case is here:
==
Here is an example on my system.
My system has 4 sockets and each socket has 15 cores and HT is
enabled. In this case, each core of sockes is numbered as
follows:
| CPU#
Socket#0 | 0-14 , 60-74
Socket#1 | 15-29, 75-89
Socket#2 | 30-44, 90-104
Socket#3 | 45-59, 105-119
Then llc_shared_mask of CPU#30 has 0x3fff80000001fffc0000000.
It means that last level cache of Socket#2 is shared with
CPU#30-44 and 90-104.
When hot-removing socket#2 and #3, each core of sockets is
numbered as follows:
| CPU#
Socket#0 | 0-14 , 60-74
Socket#1 | 15-29, 75-89
But llc_shared_mask is not cleared. So llc_shared_mask of CPU#30
remains having 0x3fff80000001fffc0000000.
After that, when hot-adding socket#2 and #3, each core of
sockets is numbered as follows:
| CPU#
Socket#0 | 0-14 , 60-74
Socket#1 | 15-29, 75-89
Socket#2 | 30-59
Socket#3 | 90-119
Then llc_shared_mask of CPU#30 becomes
0x3fff8000fffffffc0000000. It means that last level cache of
Socket#2 is shared with CPU#30-59 and 90-104. So the mask has
the wrong value.
Signed-off-by: Wanpeng Li <wanpeng.li@...ux.intel.com>
Tested-by: Linn Crosetto <linn@...com>
Reviewed-by: Borislav Petkov <bp@...e.de>
Reviewed-by: Toshi Kani <toshi.kani@...com>
Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@...fujitsu.com>
Cc: David Rientjes <rientjes@...gle.com>
Cc: Prarit Bhargava <prarit@...hat.com>
Cc: Steven Rostedt <srostedt@...hat.com>
Cc: Peter Zijlstra <peterz@...radead.org>
Link: http://lkml.kernel.org/r/1411547885-48165-1-git-send-email-wanpeng.li@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@...nel.org>
Signed-off-by: Ben Hutchings <ben@...adent.org.uk>
---
arch/x86/kernel/smpboot.c | 3 +++
1 file changed, 3 insertions(+)
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1252,6 +1252,9 @@ static void remove_siblinginfo(int cpu)
for_each_cpu(sibling, cpu_sibling_mask(cpu))
cpumask_clear_cpu(cpu, cpu_sibling_mask(sibling));
+ for_each_cpu(sibling, cpu_llc_shared_mask(cpu))
+ cpumask_clear_cpu(cpu, cpu_llc_shared_mask(sibling));
+ cpumask_clear(cpu_llc_shared_mask(cpu));
cpumask_clear(cpu_sibling_mask(cpu));
cpumask_clear(cpu_core_mask(cpu));
c->phys_proc_id = 0;
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists