Message-ID: <3ce6ef93-2f47-eda3-f974-0cfea7f43766@intel.com>
Date: Thu, 27 Oct 2022 09:18:13 -0700
From: Dave Hansen <dave.hansen@...el.com>
To: Dan Williams <dan.j.williams@...el.com>, linux-cxl@...r.kernel.org
Cc: Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
"H. Peter Anvin" <hpa@...or.com>,
Andy Lutomirski <luto@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Davidlohr Bueso <dave@...olabs.net>, nvdimm@...ts.linux.dev,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4] memregion: Add cpu_cache_invalidate_memregion()
interface
On 10/25/22 13:05, Dan Williams wrote:
> Users must first call cpu_cache_has_invalidate_memregion() to know whether
> this functionality is available on the architecture. Only enable it on
> x86-64 via the wbinvd() hammer. Hypervisors are not supported as TDX
> guests may trigger a virtualization exception and may need proper handling
> to recover. See:
That sentence doesn't quite parse correctly. Does it need to be "and
may trigger..."?
> This global cache invalidation facility,
> cpu_cache_invalidate_memregion(), is exported to modules since NVDIMM
> and CXL support can be built as a module. However, the intent is that
> this facility is not available outside of specific "device-memory" use
> cases. To that end the API is scoped to a new "DEVMEM" module namespace
> that only applies to the NVDIMM and CXL subsystems.
Oh, thank $DEITY you're trying to limit the amount of code that has
access to this thing.
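For reference, the namespace scoping means an out-of-scope module fails at
modpost/load time rather than silently gaining access; an in-scope consumer
has to opt in explicitly. A sketch of what a DEVMEM consumer would look like
(illustrative only, not part of this patch; the init function and descriptor
choice are hypothetical):

```c
/* Sketch: only modules that import the DEVMEM namespace may link
 * against cpu_cache_invalidate_memregion(). NVDIMM/CXL would do
 * something along these lines.
 */
#include <linux/module.h>
#include <linux/memregion.h>

MODULE_IMPORT_NS(DEVMEM);

static int __init example_init(void)
{
	/* Bail out on architectures/configs without the facility */
	if (!cpu_cache_has_invalidate_memregion())
		return -EOPNOTSUPP;

	return cpu_cache_invalidate_memregion(IORES_DESC_PERSISTENT_MEMORY);
}
```

Modules that do not carry the MODULE_IMPORT_NS(DEVMEM) tag get a modpost
warning and a load-time failure when they reference the symbol, which is
exactly the containment you want here.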
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 67745ceab0db..b68661d0633b 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -69,6 +69,7 @@ config X86
> select ARCH_ENABLE_THP_MIGRATION if X86_64 && TRANSPARENT_HUGEPAGE
> select ARCH_HAS_ACPI_TABLE_UPGRADE if ACPI
> select ARCH_HAS_CACHE_LINE_SIZE
> + select ARCH_HAS_CPU_CACHE_INVALIDATE_MEMREGION if X86_64
What is 64-bit only about this?
I don't expect to have a lot of NVDIMMs or CXL devices on 32-bit
kernels, but it would be nice to remove this if it's not strictly
needed. Or, to add a changelog nugget that says:
    Restrict this to X86_64 kernels. It would probably work on
    32-bit, but there is no practical reason to use 32-bit kernels
    and no one is testing them.
> select ARCH_HAS_CURRENT_STACK_POINTER
> select ARCH_HAS_DEBUG_VIRTUAL
> select ARCH_HAS_DEBUG_VM_PGTABLE if !X86_PAE
> diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
> index 97342c42dda8..8650bb6481a8 100644
> --- a/arch/x86/mm/pat/set_memory.c
> +++ b/arch/x86/mm/pat/set_memory.c
> @@ -330,6 +330,21 @@ void arch_invalidate_pmem(void *addr, size_t size)
> EXPORT_SYMBOL_GPL(arch_invalidate_pmem);
> #endif
>
> +#ifdef CONFIG_ARCH_HAS_CPU_CACHE_INVALIDATE_MEMREGION
> +bool cpu_cache_has_invalidate_memregion(void)
> +{
> + return !cpu_feature_enabled(X86_FEATURE_HYPERVISOR);
> +}
> +EXPORT_SYMBOL_NS_GPL(cpu_cache_has_invalidate_memregion, DEVMEM);
> +
> +int cpu_cache_invalidate_memregion(int res_desc)
> +{
> + wbinvd_on_all_cpus();
> + return 0;
> +}
Does this maybe also deserve a:
WARN_ON_ONCE(!cpu_cache_has_invalidate_memregion());
in case one of the cpu_cache_invalidate_memregion() paths missed a
cpu_cache_has_invalidate_memregion() check?
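Concretely, something like this (a sketch of the suggestion, not tested):

```c
int cpu_cache_invalidate_memregion(int res_desc)
{
	/* Catch any caller that skipped the capability check; under a
	 * hypervisor WBINVD may not behave as required (e.g. #VE in
	 * TDX guests), so scream once rather than fail silently.
	 */
	WARN_ON_ONCE(!cpu_cache_has_invalidate_memregion());
	wbinvd_on_all_cpus();
	return 0;
}
```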
> +/**
> + * cpu_cache_invalidate_memregion - drop any CPU cached data for
> + * memregions described by @res_desc
> + * @res_desc: one of the IORES_DESC_* types
> + *
> + * Perform cache maintenance after a memory event / operation that
> + * changes the contents of physical memory in a cache-incoherent manner.
> + * For example, device memory technologies like NVDIMM and CXL have
> + * device secure erase, or dynamic region provision features where such
> + * semantics.
s/where/with/ ?
> + * Limit the functionality to architectures that have an efficient way
> + * to writeback and invalidate potentially terabytes of memory at once.
> + * Note that this routine may or may not write back any dirty contents
> + * while performing the invalidation. It is only exported for the
> + * explicit usage of the NVDIMM and CXL modules in the 'DEVMEM' symbol
> + * namespace.
> + *
> + * Returns 0 on success or negative error code on a failure to perform
> + * the cache maintenance.
> + */
WBINVD is a scary beast. But, there's also no better alternative in the
architecture. I don't think any of my comments above are deal breakers,
so from the x86 side:
Acked-by: Dave Hansen <dave.hansen@...ux.intel.com>