Message-ID: <20230510191207.GA18514@ranerica-svr.sc.intel.com>
Date:   Wed, 10 May 2023 12:12:07 -0700
From:   Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
To:     Radu Rendec <rrendec@redhat.com>
Cc:     linux-kernel@vger.kernel.org,
        Catalin Marinas <catalin.marinas@arm.com>,
        Will Deacon <will@kernel.org>,
        Pierre Gondois <Pierre.Gondois@arm.com>,
        Sudeep Holla <sudeep.holla@arm.com>,
        linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v4 1/3] cacheinfo: Add arch specific early level
 initializer

On Wed, Apr 12, 2023 at 02:57:57PM -0400, Radu Rendec wrote:
> This patch gives architecture specific code the ability to initialize
> the cache level and allocate cacheinfo memory early, when cache level
> initialization runs on the primary CPU for all possible CPUs.
> 
> This is part of a patch series that attempts to further the work in
> commit 5944ce092b97 ("arch_topology: Build cacheinfo from primary CPU").
> Previously, in the absence of any DT/ACPI cache info, architecture
> specific cache detection and info allocation for secondary CPUs would
> happen in non-preemptible context during early CPU initialization and
> trigger a "BUG: sleeping function called from invalid context" splat on
> an RT kernel.
> 
> More specifically, this patch adds the early_cache_level() function,
> which is called by fetch_cache_info() as a fallback when the number of
> cache leaves cannot be extracted from DT/ACPI. In the default generic
> (weak) implementation, this new function returns -ENOENT, which
> preserves the original behavior for architectures that do not implement
> the function.
> 
> Since early detection can get the number of cache leaves wrong in some
> cases*, additional logic is added to still call init_cache_level() later
> on the secondary CPU, therefore giving the architecture specific code an
> opportunity to go back and fix the initial guess. Again, the original
> behavior is preserved for architectures that do not implement the new
> function.
> 
> * For example, on arm64, CLIDR_EL1 detection works only when it runs on
>   the current CPU. In other words, a CPU cannot detect the cache depth
>   of any CPU other than itself.
> 
> Signed-off-by: Radu Rendec <rrendec@redhat.com>
> Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
> ---
>  drivers/base/cacheinfo.c  | 75 +++++++++++++++++++++++++++------------
>  include/linux/cacheinfo.h |  2 ++
>  2 files changed, 55 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
> index f3903d002819..d783896c8a1f 100644
> --- a/drivers/base/cacheinfo.c
> +++ b/drivers/base/cacheinfo.c
> @@ -398,6 +398,11 @@ static void free_cache_attributes(unsigned int cpu)
>  	cache_shared_cpu_map_remove(cpu);
>  }
>  
> +int __weak early_cache_level(unsigned int cpu)
> +{
> +	return -ENOENT;
> +}
> +
>  int __weak init_cache_level(unsigned int cpu)
>  {
>  	return -ENOENT;
> @@ -423,56 +428,82 @@ int allocate_cache_info(int cpu)
>  
>  int fetch_cache_info(unsigned int cpu)
>  {
> -	struct cpu_cacheinfo *this_cpu_ci;
> +	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
>  	unsigned int levels = 0, split_levels = 0;
>  	int ret;
>  
>  	if (acpi_disabled) {
>  		ret = init_of_cache_level(cpu);
> -		if (ret < 0)
> -			return ret;
>  	} else {
>  		ret = acpi_get_cache_info(cpu, &levels, &split_levels);
> -		if (ret < 0)
> +		if (!ret) {
> +			this_cpu_ci->num_levels = levels;
> +			/*
> +			 * This assumes that:
> +			 * - there cannot be any split caches (data/instruction)
> +			 *   above a unified cache
> +			 * - data/instruction caches come by pair
> +			 */
> +			this_cpu_ci->num_leaves = levels + split_levels;
> +		}
> +	}
> +
> +	if (ret || !cache_leaves(cpu)) {
> +		ret = early_cache_level(cpu);
> +		if (ret)
>  			return ret;
>  
> -		this_cpu_ci = get_cpu_cacheinfo(cpu);
> -		this_cpu_ci->num_levels = levels;
> -		/*
> -		 * This assumes that:
> -		 * - there cannot be any split caches (data/instruction)
> -		 *   above a unified cache
> -		 * - data/instruction caches come by pair
> -		 */
> -		this_cpu_ci->num_leaves = levels + split_levels;
> +		if (!cache_leaves(cpu))
> +			return -ENOENT;
> +
> +		this_cpu_ci->early_ci_levels = true;
>  	}
> -	if (!cache_leaves(cpu))
> -		return -ENOENT;
>  
>  	return allocate_cache_info(cpu);
>  }
>  
> -int detect_cache_attributes(unsigned int cpu)
> +static inline int init_level_allocate_ci(unsigned int cpu)
>  {
> -	int ret;
> +	unsigned int early_leaves = cache_leaves(cpu);
>  
>  	/* Since early initialization/allocation of the cacheinfo is allowed
>  	 * via fetch_cache_info() and this also gets called as CPU hotplug
>  	 * callbacks via cacheinfo_cpu_online, the init/alloc can be skipped
>  	 * as it will happen only once (the cacheinfo memory is never freed).
> -	 * Just populate the cacheinfo.
> +	 * Just populate the cacheinfo. However, if the cacheinfo has been
> +	 * allocated early through the arch-specific early_cache_level() call,
> +	 * there is a chance the info is wrong (this can happen on arm64). In
> +	 * that case, call init_cache_level() anyway to give the arch-specific
> +	 * code a chance to make things right.
>  	 */
> -	if (per_cpu_cacheinfo(cpu))
> -		goto populate_leaves;
> +	if (per_cpu_cacheinfo(cpu) && !ci_cacheinfo(cpu)->early_ci_levels)
> +		return 0;
>  
>  	if (init_cache_level(cpu) || !cache_leaves(cpu))
>  		return -ENOENT;
>  
> -	ret = allocate_cache_info(cpu);
> +	/*
> +	 * Now that we have properly initialized the cache level info, make
> +	 * sure we don't try to do that again the next time we are called
> +	 * (e.g. as CPU hotplug callbacks).
> +	 */
> +	ci_cacheinfo(cpu)->early_ci_levels = false;
> +
> +	if (cache_leaves(cpu) <= early_leaves)
> +		return 0;
> +

Hi,

I had posted a patchset[1] for x86 that initializes
ci_cacheinfo(cpu)->num_leaves during SMP boot.

This means that early_leaves and the late cache_leaves() value end up
equal, but per_cpu_cacheinfo(cpu) is never allocated; x86 currently
does not use fetch_cache_info().
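
For reference, this is roughly the state I'd expect on x86 with [1]
applied by the time init_level_allocate_ci() runs (paraphrasing from my
reading of the code, not verbatim):

	/*
	 * x86 + [1], on entry to init_level_allocate_ci() (sketch):
	 *   ci_cacheinfo(cpu)->num_leaves      already set during SMP boot
	 *   per_cpu_cacheinfo(cpu)             still NULL (fetch_cache_info()
	 *                                      was never called)
	 *   ci_cacheinfo(cpu)->early_ci_levels false
	 */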

I think we should also check here that per_cpu_cacheinfo() has been
allocated, to handle the case in which the early and late cache leaf
counts remain the same:

-       if (cache_leaves(cpu) <= early_leaves)
+       if (cache_leaves(cpu) <= early_leaves && per_cpu_cacheinfo(cpu))
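
That way the early return is taken only when the cacheinfo memory
actually exists; when per_cpu_cacheinfo(cpu) is still NULL (as on x86
with [1]), init_level_allocate_ci() falls through to the allocation
path instead. Untested sketch of the resulting check:

	/* return early only if the cacheinfo memory was allocated */
	if (cache_leaves(cpu) <= early_leaves && per_cpu_cacheinfo(cpu))
		return 0;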

Otherwise, in v6.4-rc1 + [1] I observe a NULL pointer dereference from
last_level_cache_is_valid().
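
If I read the crash correctly, the chain is roughly the following
(paraphrased, not a verbatim backtrace):

	detect_cache_attributes(cpu)
	  init_level_allocate_ci(cpu)
	    /* cache_leaves(cpu) <= early_leaves: returns 0 without
	     * ever calling allocate_cache_info() */
	  ...
	  cache_shared_cpu_map_setup(cpu)
	    last_level_cache_is_valid(cpu)
	      per_cpu_cacheinfo_idx(cpu, cache_leaves(cpu) - 1)
	        /* offsets from per_cpu_cacheinfo(cpu), which is still
	         * NULL, so the subsequent access faults */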

I can post a patch with this fix if it makes sense.

[1]. https://lore.kernel.org/all/20230424001956.21434-3-ricardo.neri-calderon@linux.intel.com/
