Message-ID: <d10116ae-fc21-42e3-8ee0-a68d3bb72425@infradead.org>
Date: Wed, 14 Jan 2026 10:25:10 -0800
From: Randy Dunlap <rdunlap@...radead.org>
To: Lucas Wei <lucaswei@...gle.com>, Catalin Marinas
 <catalin.marinas@....com>, Will Deacon <will@...nel.org>,
 Jonathan Corbet <corbet@....net>
Cc: sjadavani@...gle.com, stable@...r.kernel.org, kernel-team@...roid.com,
 linux-arm-kernel@...ts.infradead.org, linux-doc@...r.kernel.org,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3] arm64: errata: Workaround for SI L1 downstream
 coherency issue

Hi,
I have a few comments/questions, please.

On 1/14/26 6:52 AM, Lucas Wei wrote:
> When software issues a Cache Maintenance Operation (CMO) targeting a
> dirty cache line, the CPU and DSU cluster may optimize the operation by
> combining the CopyBack Write and CMO into a single combined CopyBack
> Write plus CMO transaction presented to the interconnect (MCN).
> For these combined transactions, the MCN splits the operation into two
> separate transactions, one Write and one CMO, and then propagates the
> write and optionally the CMO to the downstream memory system or external
> Point of Serialization (PoS).
> However, the MCN may return an early CompCMO response to the DSU cluster
> before the corresponding Write and CMO transactions have completed at
> the external PoS or downstream memory. As a result, stale data may be
> observed by external observers that are directly connected to the
> external PoS or downstream memory.
> 
> This erratum affects any system topology in which the following
> conditions apply:
>  - The Point of Serialization (PoS) is located downstream of the
>    interconnect.
>  - A downstream observer accesses memory directly, bypassing the
>    interconnect.
> 
> Conditions:
> This erratum occurs only when all of the following conditions are met:
>  1. Software executes a data cache maintenance operation, specifically,
>     a clean or clean&invalidate by virtual address (DC CVAC or DC
>     CIVAC), that hits on unique dirty data in the CPU or DSU cache.
>     This results in a combined CopyBack and CMO being issued to the
>     interconnect.
>  2. The interconnect splits the combined transaction into separate Write
>     and CMO transactions and returns an early completion response to the
>     CPU or DSU before the write has completed at the downstream memory
>     or PoS.
>  3. A downstream observer accesses the affected memory address after the
>     early completion response is issued but before the actual memory
>     write has completed. This allows the observer to read stale data
>     that has not yet been updated at the PoS or downstream memory.
> 
> The workaround adds a second loop of CMOs over the same virtual
> addresses whenever the operation meets the erratum conditions, and waits
> until the cached data has been cleaned to the PoC. This mitigates the
> performance penalty compared to simply duplicating the original CMO.
> 
> Cc: stable@...r.kernel.org # 6.12.x
> Signed-off-by: Lucas Wei <lucaswei@...gle.com>
> ---
> 
> Changes in v3:
> 
>  1. Fix typos
>  2. Remove 'lkp@...el.com' from commit message
>  3. Keep ARM within a single section
>  4. Remove workaround of #4311569 from `cache_inval_poc()`
> 
> Changes in v2:
> 
>  1. Fixed warning from kernel test robot by changing
>     arm_si_l1_workaround_4311569 to static
>     [Reported-by: kernel test robot <lkp@...el.com>]
> 
> ---
>  Documentation/arch/arm64/silicon-errata.rst |  1 +
>  arch/arm64/Kconfig                          | 19 +++++++++++++
>  arch/arm64/include/asm/assembler.h          | 10 +++++++
>  arch/arm64/kernel/cpu_errata.c              | 31 +++++++++++++++++++++
>  arch/arm64/tools/cpucaps                    |  1 +
>  5 files changed, 62 insertions(+)
> 

> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 93173f0a09c7..89326bb26f48 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -1155,6 +1155,25 @@ config ARM64_ERRATUM_3194386
>  
>  	  If unsure, say Y.
>  
> +config ARM64_ERRATUM_4311569
> +	bool "SI L1: 4311569: workaround for premature CMO completion erratum"
> +	default y
> +	help
> +	  This option adds the workaround for ARM SI L1 erratum 4311569.
> +
> +	  The erratum of SI L1 can cause an early response to a combined write
> +	  and cache maintenance operation (WR+CMO) before the operation is fully
> +	  completed to the Point of Serialization (POS).
> +	  This can result in a non-I/O coherent agent observing stale data,
> +	  potentially leading to system instability or incorrect behavior.
> +
> +	  Enabling this option implements a software workaround by inserting a
> +	  second loop of Cache Maintenance Operation (CMO) immediately following the
> +	  end of function to do CMOs. This ensures that the data is correctly serialized
> +	  before the buffer is handed off to a non-coherent agent.
> +
> +	  If unsure, say Y.
> +
>  config CAVIUM_ERRATUM_22375
>  	bool "Cavium erratum 22375, 24313"
>  	default y

[snip]

> diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
> index 8cb3b575a031..5c0ab6bfd44a 100644
> --- a/arch/arm64/kernel/cpu_errata.c
> +++ b/arch/arm64/kernel/cpu_errata.c
> @@ -141,6 +141,30 @@ has_mismatched_cache_type(const struct arm64_cpu_capabilities *entry,
>  	return (ctr_real != sys) && (ctr_raw != sys);
>  }
>  
> +#ifdef CONFIG_ARM64_ERRATUM_4311569
> +static DEFINE_STATIC_KEY_FALSE(arm_si_l1_workaround_4311569);
> +static int __init early_arm_si_l1_workaround_4311569_cfg(char *arg)
> +{
> +	static_branch_enable(&arm_si_l1_workaround_4311569);
> +	pr_info("Enabling cache maintenance workaround for ARM SI-L1 erratum 4311569\n");
> +
> +	return 0;
> +}
> +early_param("arm_si_l1_workaround_4311569", early_arm_si_l1_workaround_4311569_cfg);
> +

It looks like none of the other errata use early_param() -- are they all
auto-detected? Could this one be auto-detected as well?
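
For reference, here is a rough userspace sketch of what I mean by
auto-detection: CPU errata are typically matched by comparing fields of
MIDR_EL1 against a table of affected parts and revisions. The struct,
function and macro names below are made up for illustration and are not the
kernel's API. Whether an interconnect like SI L1 can be identified this way
at all is an open question that this thread does not settle.

```c
#include <stdbool.h>
#include <stdint.h>

/* MIDR_EL1 fields: implementer [31:24], variant [23:20],
 * part number [15:4], revision [3:0]. */
#define MIDR_IMPLEMENTER(midr) (((midr) >> 24) & 0xffu)
#define MIDR_VARIANT(midr)     (((midr) >> 20) & 0xfu)
#define MIDR_PARTNUM(midr)     (((midr) >> 4) & 0xfffu)
#define MIDR_REVISION(midr)    ((midr) & 0xfu)

/* Hypothetical table entry: one affected part plus an inclusive
 * range of (variant << 4 | revision) values. */
struct erratum_match {
	uint32_t implementer;
	uint32_t partnum;
	uint32_t rv_min;
	uint32_t rv_max;
};

/* Return true if the CPU described by midr falls in the affected range. */
static bool cpu_is_affected(uint32_t midr, const struct erratum_match *m)
{
	uint32_t rv = (MIDR_VARIANT(midr) << 4) | MIDR_REVISION(midr);

	return MIDR_IMPLEMENTER(midr) == m->implementer &&
	       MIDR_PARTNUM(midr) == m->partnum &&
	       rv >= m->rv_min && rv <= m->rv_max;
}
```

A caller would fill in one erratum_match per affected part and run the check
once per CPU at boot. No real part numbers are given here because the patch
does not list any.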

> +/*
> + * We have some earlier use cases to call cache maintenance operation functions, for example,
> + * dcache_inval_poc() and dcache_clean_poc() in head.S, before making decision to turn on this
> + * workaround. Since the scope of this workaround is limited to non-coherent DMA agents, its
> + * safe to have the workaround off by default.

But it's not off by default...
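
There seem to be two different defaults in the quoted patch: the Kconfig
symbol is "default y" (build time), while the static key is declared with
DEFINE_STATIC_KEY_FALSE and is only enabled from the early_param() handler
(run time). Below is a simplified userspace model of that split, just to
show which "default" is which. MODEL_ERRATUM_4311569, workaround_enabled and
parse_cmdline() are hypothetical stand-ins, not kernel code.

```c
#include <stdbool.h>
#include <string.h>

/* Models the Kconfig symbol with "default y": the code is built in. */
#define MODEL_ERRATUM_4311569 1

/* Models DEFINE_STATIC_KEY_FALSE(): the workaround starts disabled. */
static bool workaround_enabled = false;

/* Models the early_param() handler: only the boot parameter turns the
 * workaround on. strstr() is a crude substring match; the kernel's real
 * parameter parsing is stricter. */
static void parse_cmdline(const char *cmdline)
{
#if MODEL_ERRATUM_4311569
	if (strstr(cmdline, "arm_si_l1_workaround_4311569"))
		workaround_enabled = true;
#else
	(void)cmdline;
#endif
}
```

If that reading is right, the comment and the Kconfig help text should say
which of the two defaults they are talking about.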

[snip]

thanks.
-- 
~Randy

