[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <80d6238b-223c-e60a-6930-24a981d9dd0c@arm.com>
Date: Tue, 27 Sep 2022 14:06:41 +0100
From: James Morse <james.morse@....com>
To: Shawn Wang <shawnwang@...ux.alibaba.com>, fenghua.yu@...el.com,
reinette.chatre@...el.com
Cc: tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
dave.hansen@...ux.intel.com, x86@...nel.org, hpa@...or.com,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] x86/resctrl: Clear the staged configs when destroying
schemata list
Hi Shawn,
On 27/09/2022 03:54, Shawn Wang wrote:
> Array staged_config in struct rdt_domain still maintains the original value when
> resctrl is unmounted. If resctrl is mounted with cdp option and then remounted
> without cdp option, field have_new_ctrl in staged_config[CDP_CODE] and
> staged_config[CDP_DATA] will still be true.
staged_config[CDP_DATA] is an array - its always 'true'. I think you mean
staged_config[CDP_DATA].have_new_ctrl, which will still be true because it is only
memset() when the schemata file is written to.
> Since resctrl_arch_update_domains()
> traverses all resctrl_conf_type, it will continue to update CDP_CODE and
> CDP_DATA configurations, which can cause overflow problem.
Only if its called with a stale staged config, and it should only be called when the
schemata file is written to, which would memset() the staged config first.
> The problem can be reproduced by the following commands:
> # A system with 16 usable closids and mba disabled
> mount -t resctrl resctrl -o cdp /sys/fs/resctrl
> mkdir /sys/fs/resctrl/p{1..7}
> umount /sys/fs/resctrl/
> mount -t resctrl resctrl /sys/fs/resctrl
> mkdir /sys/fs/resctrl/p{1..8}
Thanks for the reproducer - but I don't see what could set have_new_ctrl in this sequence.
You can't call apply_config() to set CPUs in the mask without that being set.
Creating a new control group, (your mkdir step) shouldn't touch the hardware at all, as it
should be left in its reset state from the last umount(), or setup.
I can't reproduce this on v6.0-rc7.
Even if I dirty the configuration by writing to the schemata file, I can't reproduce this.
(I have mba enabled, but all this should affect is the number of closid available)
> dmesg will generate the following error:
Which kernel version is this?
> [ 6180.939345] unchecked MSR access error: WRMSR to 0xca0 (tried to write
> 0x00000000000007ff) at rIP: 0xffffffff82249142 (cat_wrmsr+0x32/0x60)
Is 0x7ff the default CBM bitmap for this CPU? Or was it written in a step missing from the
reproducer above?
The rest of this splat isn't helpful as its the result of an IPI...
> [ 6180.951983] Call Trace:
> [ 6180.954516] <IRQ>
> [ 6180.956619] __flush_smp_call_function_queue+0x11d/0x170
> [ 6180.962028] __sysvec_call_function+0x24/0xd0
> [ 6180.966485] sysvec_call_function+0x89/0xc0
> [ 6180.970760] </IRQ>
> [ 6180.972947] <TASK>
> [ 6180.975131] asm_sysvec_call_function+0x16/0x20
> [ 6180.979757] RIP: 0010:cpuidle_enter_state+0xcd/0x400
> [ 6180.984821] Code: 49 89 c5 0f 1f 44 00 00 31 ff e8 1e e5 77 ff 45 84
> ff 74 12 9c 58 f6 c4 02 0f 85 13 03 00 00 31 ff e8 67 70 7d ff fb 45 85
> f6 <0f> 88 75 01 00 00 49 63 c6 4c 2b 2c 24 48 8d 14 40 48 8d 14 90 49
> [ 6181.003710] RSP: 0018:ffffffff83a03e48 EFLAGS: 00000202
> [ 6181.009028] RAX: ffff943400800000 RBX: 0000000000000001 RCX: 000000000000001f
> [ 6181.016261] RDX: 0000000000000000 RSI: ffffffff83795059 RDI: ffffffff837c101e
> [ 6181.023490] RBP: ffff9434c9352000 R08: 0000059f1cb1a05e R09: 0000000000000008
> [ 6181.030717] R10: 0000000000000001 R11: 0000000000005c66 R12: ffffffff83bbf3a0
> [ 6181.037944] R13: 0000059f1cb1a05e R14: 0000000000000001 R15: 0000000000000000
> [ 6181.045202] ? cpuidle_enter_state+0xb2/0x400
> [ 6181.049678] cpuidle_enter+0x24/0x40
> [ 6181.053370] do_idle+0x1dd/0x260
> [ 6181.056713] cpu_startup_entry+0x14/0x20
> [ 6181.060753] rest_init+0xbb/0xc0
> [ 6181.064097] arch_call_rest_init+0x5/0xa
> [ 6181.068137] start_kernel+0x668/0x691
> [ 6181.071914] secondary_startup_64_no_verify+0xe0/0xeb
> [ 6181.077086] </TASK>
It would be good to know what triggered this IPI. It may not have been
resctrl_arch_update_domains(). This pattern also happens from reset_all_ctrls() which
happens during umount(). (and that would write the default CBM bitmap)
If you can reproduce this easily, could you add dump_stack() to update_config() to see if
any path is setting have_new_ctrl. You aren't writing to the schemata file in your reproducer.
> We fix this issue by clearing the staged configs when destroying schemata list.
> Signed-off-by: Shawn Wang <shawnwang@...ux.alibaba.com>
> Suggested-by: Xin Hao <xhao@...ux.alibaba.com>
If we can work out why you are seeing this, it would need a Fixes tag.
Otherwise I agree it makes sense to make this more robust, but it would need a different
commit message.
Thanks,
James
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index f276aff521e8..b4a817ae83ab 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -2127,8 +2127,15 @@ static int schemata_list_create(void)
> static void schemata_list_destroy(void)
> {
> struct resctrl_schema *s, *tmp;
> + struct rdt_domain *dom;
>
> list_for_each_entry_safe(s, tmp, &resctrl_schema_all, list) {
> + /*
> + * Clear staged_config on each domain before schemata list is
> + * destroyed.
> + */
> + list_for_each_entry(dom, &s->res->domains, list)
> + memset(dom->staged_config, 0, sizeof(dom->staged_config));
> list_del(&s->list);
> kfree(s);
> }
Powered by blists - more mailing lists