lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <04ed0569-5026-9c4f-b09f-3e8798d5b551@huawei.com>
Date: Mon, 19 Aug 2024 15:03:21 +0800
From: Yicong Yang <yangyicong@...wei.com>
To: Dietmar Eggemann <dietmar.eggemann@....com>
CC: <catalin.marinas@....com>, <will@...nel.org>, <sudeep.holla@....com>,
	<tglx@...utronix.de>, <peterz@...radead.org>, <mpe@...erman.id.au>,
	<linux-arm-kernel@...ts.infradead.org>, <mingo@...hat.com>, <bp@...en8.de>,
	<dave.hansen@...ux.intel.com>, <yangyicong@...ilicon.com>,
	<linuxppc-dev@...ts.ozlabs.org>, <x86@...nel.org>,
	<linux-kernel@...r.kernel.org>, <gregkh@...uxfoundation.org>,
	<rafael@...nel.org>, <jonathan.cameron@...wei.com>,
	<prime.zeng@...ilicon.com>, <linuxarm@...wei.com>, <xuwei5@...wei.com>,
	<guohanjun@...wei.com>
Subject: Re: [PATCH v5 3/4] arm64: topology: Support SMT control on ACPI based
 system

On 2024/8/16 23:55, Dietmar Eggemann wrote:
> On 06/08/2024 10:53, Yicong Yang wrote:
>> From: Yicong Yang <yangyicong@...ilicon.com>
>>
>> For ACPI we'll build the topology from PPTT and we cannot directly
>> get the SMT number of each core. Instead using a temporary xarray
>> to record the SMT number of each core when building the topology
>> and we can know the largest SMT number in the system. Then we can
>> enable the support of SMT control.
>>
>> Signed-off-by: Yicong Yang <yangyicong@...ilicon.com>
>> ---
>>  arch/arm64/kernel/topology.c | 24 ++++++++++++++++++++++++
>>  1 file changed, 24 insertions(+)
>>
>> diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
>> index 1a2c72f3e7f8..f72e1e55b05e 100644
>> --- a/arch/arm64/kernel/topology.c
>> +++ b/arch/arm64/kernel/topology.c
>> @@ -15,8 +15,10 @@
>>  #include <linux/arch_topology.h>
>>  #include <linux/cacheinfo.h>
>>  #include <linux/cpufreq.h>
>> +#include <linux/cpu_smt.h>
>>  #include <linux/init.h>
>>  #include <linux/percpu.h>
>> +#include <linux/xarray.h>
>>  
>>  #include <asm/cpu.h>
>>  #include <asm/cputype.h>
>> @@ -43,11 +45,16 @@ static bool __init acpi_cpu_is_threaded(int cpu)
>>   */
>>  int __init parse_acpi_topology(void)
>>  {
>> +	int thread_num, max_smt_thread_num = 1;
>> +	struct xarray core_threads;
>>  	int cpu, topology_id;
>> +	void *entry;
>>  
>>  	if (acpi_disabled)
>>  		return 0;
>>  
>> +	xa_init(&core_threads);
>> +
>>  	for_each_possible_cpu(cpu) {
>>  		topology_id = find_acpi_cpu_topology(cpu, 0);
>>  		if (topology_id < 0)
>> @@ -57,6 +64,20 @@ int __init parse_acpi_topology(void)
>>  			cpu_topology[cpu].thread_id = topology_id;
>>  			topology_id = find_acpi_cpu_topology(cpu, 1);
>>  			cpu_topology[cpu].core_id   = topology_id;
>> +
>> +			entry = xa_load(&core_threads, topology_id);
>> +			if (!entry) {
>> +				xa_store(&core_threads, topology_id,
>> +					 xa_mk_value(1), GFP_KERNEL);
>> +			} else {
>> +				thread_num = xa_to_value(entry);
>> +				thread_num++;
>> +				xa_store(&core_threads, topology_id,
>> +					 xa_mk_value(thread_num), GFP_KERNEL);
>> +
>> +				if (thread_num > max_smt_thread_num)
>> +					max_smt_thread_num = thread_num;
>> +			}
> 
> So the xarray contains one element for each core_id with the information
> how often the core_id occurs? I assume you have to iterate over all
> possible CPUs since you don't know which logical CPUs belong to the same
> core_id.
> 

Each xarray element counts the thread number of a certain core id. so the logic is like below:
1. if the "core id" entry doesn't exists, then we're accessing this core for the 1st time. create
   one and make the thread number to 1
2. otherwise increment the thread number of "core id" this cpu belongs (PPTT already
   told us which core this CPU belongs to). Update the max_smt_thread_num if necessary.

Then we can know max_smt_thread_num by meanwhile iterating the PPTT table and
build the topology for all the possible CPUs.

Otherwise we need to do a second scan for the max thread number after built the
topology. This way is implemented in v1 and it's complained about the overhead on large
scale systems since we need to loop the CPUs twice.

>>  		} else {
>>  			cpu_topology[cpu].thread_id  = -1;
>>  			cpu_topology[cpu].core_id    = topology_id;
>> @@ -67,6 +88,9 @@ int __init parse_acpi_topology(void)
>>  		cpu_topology[cpu].package_id = topology_id;
>>  	}
>>  
>> +	cpu_smt_set_num_threads(max_smt_thread_num, max_smt_thread_num);
>> +
>> +	xa_destroy(&core_threads);
>>  	return 0;
>>  }
>>  #endif
> 
> Tested on ThunderX2:
> 
> $ cat /proc/schedstat | head -6 | tail -4 | awk '{ print $1, $2 }'
> cpu0 0
> domain0 00000000,00000000,00000000,00000000,00000001,00000001,00000001,00000001
>                                                    ^        ^        ^        ^
> domain1 00000000,00000000,00000000,00000000,ffffffff,ffffffff,ffffffff,ffffffff
> domain2 ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff
> 
> detecting 'max_smt_thread_num = 4' correctly.
> 

Thanks for the testing. ok for a tag?

Thanks.


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ