Message-Id: <1664247269-41295-1-git-send-email-shawnwang@linux.alibaba.com>
Date: Tue, 27 Sep 2022 10:54:29 +0800
From: Shawn Wang <shawnwang@...ux.alibaba.com>
To: fenghua.yu@...el.com, reinette.chatre@...el.com
Cc: james.morse@....com, tglx@...utronix.de, mingo@...hat.com,
bp@...en8.de, dave.hansen@...ux.intel.com, x86@...nel.org,
hpa@...or.com, linux-kernel@...r.kernel.org
Subject: [PATCH] x86/resctrl: Clear the staged configs when destroying schemata list
The staged_config array in struct rdt_domain keeps its old values after
resctrl is unmounted. If resctrl is mounted with the cdp option and then
remounted without it, the have_new_ctrl field in staged_config[CDP_CODE] and
staged_config[CDP_DATA] is still true. Because resctrl_arch_update_domains()
iterates over every resctrl_conf_type, it goes on to apply the stale CDP_CODE
and CDP_DATA configurations, which can write to an MSR beyond the valid range.
The problem can be reproduced by the following commands:
# A system with 16 usable closids and mba disabled
mount -t resctrl resctrl -o cdp /sys/fs/resctrl
mkdir /sys/fs/resctrl/p{1..7}
umount /sys/fs/resctrl/
mount -t resctrl resctrl /sys/fs/resctrl
mkdir /sys/fs/resctrl/p{1..8}
dmesg then reports the following error:
[ 6180.939345] unchecked MSR access error: WRMSR to 0xca0 (tried to write
0x00000000000007ff) at rIP: 0xffffffff82249142 (cat_wrmsr+0x32/0x60)
[ 6180.951983] Call Trace:
[ 6180.954516] <IRQ>
[ 6180.956619] __flush_smp_call_function_queue+0x11d/0x170
[ 6180.962028] __sysvec_call_function+0x24/0xd0
[ 6180.966485] sysvec_call_function+0x89/0xc0
[ 6180.970760] </IRQ>
[ 6180.972947] <TASK>
[ 6180.975131] asm_sysvec_call_function+0x16/0x20
[ 6180.979757] RIP: 0010:cpuidle_enter_state+0xcd/0x400
[ 6180.984821] Code: 49 89 c5 0f 1f 44 00 00 31 ff e8 1e e5 77 ff 45 84
ff 74 12 9c 58 f6 c4 02 0f 85 13 03 00 00 31 ff e8 67 70 7d ff fb 45 85
f6 <0f> 88 75 01 00 00 49 63 c6 4c 2b 2c 24 48 8d 14 40 48 8d 14 90 49
[ 6181.003710] RSP: 0018:ffffffff83a03e48 EFLAGS: 00000202
[ 6181.009028] RAX: ffff943400800000 RBX: 0000000000000001 RCX: 000000000000001f
[ 6181.016261] RDX: 0000000000000000 RSI: ffffffff83795059 RDI: ffffffff837c101e
[ 6181.023490] RBP: ffff9434c9352000 R08: 0000059f1cb1a05e R09: 0000000000000008
[ 6181.030717] R10: 0000000000000001 R11: 0000000000005c66 R12: ffffffff83bbf3a0
[ 6181.037944] R13: 0000059f1cb1a05e R14: 0000000000000001 R15: 0000000000000000
[ 6181.045202] ? cpuidle_enter_state+0xb2/0x400
[ 6181.049678] cpuidle_enter+0x24/0x40
[ 6181.053370] do_idle+0x1dd/0x260
[ 6181.056713] cpu_startup_entry+0x14/0x20
[ 6181.060753] rest_init+0xbb/0xc0
[ 6181.064097] arch_call_rest_init+0x5/0xa
[ 6181.068137] start_kernel+0x668/0x691
[ 6181.071914] secondary_startup_64_no_verify+0xe0/0xeb
[ 6181.077086] </TASK>
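The MSR address in the error can be checked with a little index arithmetic. The following is a minimal sketch, not kernel code: the base address 0xc90 for the L3 capacity bitmask MSRs and the CDP index mapping (data at 2n, code at 2n+1) are assumptions not stated in this message, and the helper names config_index() and cbm_msr() are hypothetical.

```c
#include <stdint.h>

/* Configuration types, named as in the description above. */
enum conf_type { CDP_NONE, CDP_CODE, CDP_DATA };

/* Assumption: base address of the L3 capacity bitmask MSRs. */
#define L3_CBM_MSR_BASE 0xc90u

/*
 * Assumed index mapping: with CDP each closid owns two consecutive
 * slots (data at 2n, code at 2n+1); without CDP it owns slot n.
 */
static uint32_t config_index(uint32_t closid, enum conf_type t)
{
	switch (t) {
	case CDP_CODE:
		return closid * 2 + 1;
	case CDP_DATA:
		return closid * 2;
	default:
		return closid;
	}
}

/* MSR address that a staged config of type t for closid would target. */
static uint32_t cbm_msr(uint32_t closid, enum conf_type t)
{
	return L3_CBM_MSR_BASE + config_index(closid, t);
}
```

Under these assumptions, 16 closids cover 0xc90 through 0xc9f. If the default group holds closid 0, the eighth new group after the non-cdp remount gets closid 8, and a stale CDP_DATA entry gives cbm_msr(8, CDP_DATA) = 0xca0, which matches the address in the trace.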
Fix this by clearing the staged configs when destroying the schemata list.
Signed-off-by: Shawn Wang <shawnwang@...ux.alibaba.com>
Suggested-by: Xin Hao <xhao@...ux.alibaba.com>
---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index f276aff521e8..b4a817ae83ab 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2127,8 +2127,15 @@ static int schemata_list_create(void)
 static void schemata_list_destroy(void)
 {
 	struct resctrl_schema *s, *tmp;
+	struct rdt_domain *dom;
 
 	list_for_each_entry_safe(s, tmp, &resctrl_schema_all, list) {
+		/*
+		 * Clear staged_config on each domain before schemata list is
+		 * destroyed.
+		 */
+		list_for_each_entry(dom, &s->res->domains, list)
+			memset(dom->staged_config, 0, sizeof(dom->staged_config));
 		list_del(&s->list);
 		kfree(s);
 	}
--
2.27.0
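
The effect of the memset() in the patch can be modelled in userspace. This is a simplified sketch under stated assumptions: struct domain and struct staged_config below are stand-ins, not the kernel's layouts, and stage(), pending_updates() and clear_staged() are hypothetical helpers introduced only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

enum { CDP_NONE, CDP_CODE, CDP_DATA, CDP_NUM_TYPES };

/* Simplified stand-in for one staged entry; not the kernel's layout. */
struct staged_config {
	uint32_t new_ctrl;
	bool have_new_ctrl;
};

/* Simplified stand-in for struct rdt_domain. */
struct domain {
	struct staged_config staged_config[CDP_NUM_TYPES];
};

/* Stage a control value for one configuration type. */
static void stage(struct domain *d, int type, uint32_t ctrl)
{
	d->staged_config[type].new_ctrl = ctrl;
	d->staged_config[type].have_new_ctrl = true;
}

/* Count the entries an update pass would apply; it visits every type. */
static int pending_updates(const struct domain *d)
{
	int n = 0;

	for (int t = 0; t < CDP_NUM_TYPES; t++)
		if (d->staged_config[t].have_new_ctrl)
			n++;
	return n;
}

/* The patch's approach: zero the whole array at teardown. */
static void clear_staged(struct domain *d)
{
	memset(d->staged_config, 0, sizeof(d->staged_config));
}
```

In this model, staging CDP_CODE and CDP_DATA (the cdp mount) and later staging CDP_NONE (the remount) leaves three pending entries, two of them stale. Calling clear_staged() between the two mounts leaves only the CDP_NONE entry pending.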