lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <dd0342ae-7474-44c4-ac80-d12b69f093c7@intel.com>
Date: Thu, 22 Aug 2024 09:33:46 -0700
From: Reinette Chatre <reinette.chatre@...el.com>
To: Peter Newman <peternewman@...gle.com>, Fenghua Yu <fenghua.yu@...el.com>
CC: Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
	Borislav Petkov <bp@...en8.de>, Dave Hansen <dave.hansen@...ux.intel.com>,
	<x86@...nel.org>, "H . Peter Anvin" <hpa@...or.com>, Shaopeng Tan
	<tan.shaopeng@...itsu.com>, James Morse <james.morse@....com>, Babu Moger
	<babu.moger@....com>, Tony Luck <tony.luck@...el.com>, Maciej Wieczor-Retman
	<maciej.wieczor-retman@...el.com>, <linux-kernel@...r.kernel.org>,
	<eranian@...gle.com>, <irogers@...gle.com>, <namhyung@...gle.com>
Subject: Re: [PATCH] x86/resctrl: Fix arch_mbm_* array overrun on SNC

Hi Peter,

On 8/1/24 11:16 AM, Reinette Chatre wrote:
> On 7/22/24 1:46 PM, Peter Newman wrote:
>> When using resctrl on systems with Sub-NUMA Clustering enabled,
>> monitoring groups may be allocated RMID values which would overrun the
>> arch_mbm_{local,total} arrays.
>>
>> This is due to inconsistencies in whether the SNC-adjusted num_rmid
>> value or the unadjusted value in resctrl_arch_system_num_rmid_idx() is
>> used. The num_rmid value for the L3 resource is currently:
>>
>>   resctrl_arch_system_num_rmid_idx() / snc_nodes_per_l3_cache
>>
>> As a simple fix, make resctrl_arch_system_num_rmid_idx() return the
>> SNC-adjusted, L3 num_rmid value on x86.
>>
> 
> Thank you very much for finding, root-causing, and providing a fix for
> the issue.
> 
>> Fixes: e13db55b5a0d ("x86/resctrl: Introduce snc_nodes_per_l3_cache")
>> Signed-off-by: Peter Newman <peternewman@...gle.com>
>> ---
>>   arch/x86/include/asm/resctrl.h     | 6 ------
>>   arch/x86/kernel/cpu/resctrl/core.c | 8 ++++++++
>>   include/linux/resctrl.h            | 3 +++
>>   3 files changed, 11 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
>> index 12dbd2588ca7..8b1b6ce1e51b 100644
>> --- a/arch/x86/include/asm/resctrl.h
>> +++ b/arch/x86/include/asm/resctrl.h
>> @@ -156,12 +156,6 @@ static inline void resctrl_sched_in(struct task_struct *tsk)
>>           __resctrl_sched_in(tsk);
>>   }
>> -static inline u32 resctrl_arch_system_num_rmid_idx(void)
>> -{
>> -    /* RMID are independent numbers for x86. num_rmid_idx == num_rmid */
>> -    return boot_cpu_data.x86_cache_max_rmid + 1;
>> -}
>> -
>>   static inline void resctrl_arch_rmid_idx_decode(u32 idx, u32 *closid, u32 *rmid)
>>   {
>>       *rmid = idx;
>> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
>> index 1930fce9dfe9..8591d53c144b 100644
>> --- a/arch/x86/kernel/cpu/resctrl/core.c
>> +++ b/arch/x86/kernel/cpu/resctrl/core.c
>> @@ -119,6 +119,14 @@ struct rdt_hw_resource rdt_resources_all[] = {
>>       },
>>   };
>> +u32 resctrl_arch_system_num_rmid_idx(void)
>> +{
>> +    struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
>> +
>> +    /* RMID are independent numbers for x86. num_rmid_idx == num_rmid */
>> +    return r->num_rmid;
>> +}
>> +
>>   /*
>>    * cache_alloc_hsw_probe() - Have to probe for Intel haswell server CPUs
>>    * as they do not have CPUID enumeration support for Cache allocation.
>> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
>> index b0875b99e811..43ac241471b3 100644
>> --- a/include/linux/resctrl.h
>> +++ b/include/linux/resctrl.h
>> @@ -248,6 +248,9 @@ struct resctrl_schema {
>>   /* The number of closid supported by this resource regardless of CDP */
>>   u32 resctrl_arch_get_num_closid(struct rdt_resource *r);
>> +
>> +u32 resctrl_arch_system_num_rmid_idx(void);
>> +
> 
> nit: the additional empty lines are unnecessary.
> 
>>   int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid);
>>   /*
> 
> | Reviewed-by: Reinette Chatre <reinette.chatre@...el.com>
> 

It would be great if this fix can be included in the kernel release.
Could you please send V2 with nit and tag applied to be ready for
inclusion?

Thank you

Reinette




Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ