Message-ID: <0bf199a0-251d-323c-974a-bfd4e26f4cce@arm.com>
Date:   Thu, 2 Jun 2022 16:26:00 +0200
From:   Dietmar Eggemann <dietmar.eggemann@....com>
To:     Sudeep Holla <sudeep.holla@....com>, linux-kernel@...r.kernel.org
Cc:     Atish Patra <atishp@...shpatra.org>,
        Atish Patra <atishp@...osinc.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Morten Rasmussen <morten.rasmussen@....com>,
        Qing Wang <wangqing@...o.com>,
        linux-arm-kernel@...ts.infradead.org,
        linux-riscv@...ts.infradead.org, Rob Herring <robh+dt@...nel.org>
Subject: Re: [PATCH v3 07/16] arch_topology: Use the last level cache
 information from the cacheinfo

On 25/05/2022 10:14, Sudeep Holla wrote:
> The cacheinfo is now initialised early along with the CPU topology
> initialisation. Instead of relying on the LLC ID information parsed
> separately only with ACPI PPTT elsewhere, migrate to use the similar
> information from the cacheinfo.
> 
> This is generic for both DT and ACPI systems. The ACPI LLC ID information
> parsed separately can now be removed from arch specific code.
> 
> Signed-off-by: Sudeep Holla <sudeep.holla@....com>
> ---
>  drivers/base/arch_topology.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index 765723448b10..4c486e4e6f2f 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -663,7 +663,8 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
>  		/* not numa in package, lets use the package siblings */
>  		core_mask = &cpu_topology[cpu].core_sibling;
>  	}
> -	if (cpu_topology[cpu].llc_id != -1) {
> +
> +	if (last_level_cache_is_valid(cpu)) {
>  		if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
>  			core_mask = &cpu_topology[cpu].llc_sibling;
>  	}
> @@ -694,7 +695,7 @@ void update_siblings_masks(unsigned int cpuid)
>  	for_each_online_cpu(cpu) {
>  		cpu_topo = &cpu_topology[cpu];
>  
> -		if (cpu_topo->llc_id != -1 && cpuid_topo->llc_id == cpu_topo->llc_id) {
> +		if (last_level_cache_is_shared(cpu, cpuid)) {
>  			cpumask_set_cpu(cpu, &cpuid_topo->llc_sibling);
>  			cpumask_set_cpu(cpuid, &cpu_topo->llc_sibling);
>  		}

I tested v3 on a Kunpeng920 (w/o CONFIG_NUMA) and it looks
like last_level_cache_is_shared() isn't working as expected.

I instrumented cpu_coregroup_mask() like:

const struct cpumask *cpu_coregroup_mask(int cpu)
{
        const cpumask_t *core_mask = cpumask_of_node(cpu_to_node(cpu));

        if (cpumask_subset(&cpu_topology[cpu].core_sibling, core_mask)) {
                core_mask = &cpu_topology[cpu].core_sibling;
                (1)
        }

	(2)

        if (last_level_cache_is_valid(cpu)) {
                if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
                        core_mask = &cpu_topology[cpu].llc_sibling;
                        (3)
        }

        if (IS_ENABLED(CONFIG_SCHED_CLUSTER) &&
            cpumask_subset(core_mask, &cpu_topology[cpu].cluster_sibling))
                core_mask = &cpu_topology[cpu].cluster_sibling;
                (4)

        (5)
        return core_mask;
}

and got:

(A) v3 patch-set:

[   11.561133] (1) cpu_coregroup_mask[0]=0-47
[   11.565670] (2) last_level_cache_is_valid(0)=1
[   11.570587] (3) cpu_coregroup_mask[0]=0    <-- llc_sibling=0 (should be 0-23)
[   11.574833] (4) cpu_coregroup_mask[0]=0-3  <-- Altra hack kicks in!
[   11.579275] (5) cpu_coregroup_mask[0]=0-3

# cat /sys/kernel/debug/sched/domains/cpu0/domain*/name
CLS
DIE

# cat /proc/schedstat | awk '{print $1 " " $2 }' | grep ^[cd] | head -3
cpu0 0
domain0 00000000,00000000,0000000f
domain1 ffffffff,ffffffff,ffffffff

So the MC domain is missing.

(B) mainline as reference (cpu_coregroup_mask() slightly different):

[   11.585008] (1) cpu_coregroup_mask[0]=0-47
[   11.589544] (3) cpu_coregroup_mask[0]=0-23 <-- !!!
[   11.594079] (5) cpu_coregroup_mask[0]=0-23

# cat /sys/kernel/debug/sched/domains/cpu0/domain*/name
CLS
MC                                            <-- !!!
DIE

# cat /proc/schedstat | awk '{print $1 " " $2 }' | grep ^[cd] | head -4
cpu0 0
domain0 00000000,00000000,0000000f
domain1 00000000,00000000,00ffffff            <-- !!!
domain2 ffffffff,ffffffff,ffffffff
