Message-Id: <20211020170305.376118-5-ankur.a.arora@oracle.com>
Date: Wed, 20 Oct 2021 10:02:55 -0700
From: Ankur Arora <ankur.a.arora@...cle.com>
To: linux-kernel@...r.kernel.org, linux-mm@...ck.org, x86@...nel.org
Cc: mingo@...nel.org, bp@...en8.de, luto@...nel.org,
akpm@...ux-foundation.org, mike.kravetz@...cle.com,
jon.grimm@....com, kvm@...r.kernel.org, konrad.wilk@...cle.com,
boris.ostrovsky@...cle.com, Ankur Arora <ankur.a.arora@...cle.com>
Subject: [PATCH v2 04/14] x86/asm: add clzero based page clearing
Add clear_page_clzero(), which uses CLZERO as the underlying primitive.
CLZERO bypasses the cache hierarchy, so this provides a non-polluting
implementation of clear_page(). It is available only if
X86_FEATURE_CLZERO is set.
CLZERO, from the AMD architecture guide (Vol 3, Rev 3.30):
"Clears the cache line specified by the logical address in rAX by
writing a zero to every byte in the line. The instruction uses an
implied non temporal memory type, similar to a streaming store, and
uses the write combining protocol to minimize cache pollution.
CLZERO is weakly-ordered with respect to other instructions that
operate on memory. Software should use an SFENCE or stronger to
enforce memory ordering of CLZERO with respect to other store
instructions.
The CLZERO instruction executes at any privilege level. CLZERO
performs all the segmentation and paging checks that a store of
the specified cache line would perform."
The use case is similar to that of clear_page_movnt(), except that
clear_page_clzero() is expected to be faster.
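For readers who want to check the loop arithmetic without AMD hardware, here
is a portable C model of the loop in clear_page_clzero(). It is only a sketch:
clzero_model() and clear_page_clzero_model() are hypothetical names that are
not part of this patch, memset() stands in for the CLZERO instruction, and the
model says nothing about the non-temporal or weakly-ordered behaviour of the
real instruction. It assumes a 4096-byte page and a 64-byte cache line, which
match the constants used in the assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE_BYTES 4096u	/* matches "movl $4096,%ecx" */
#define LINE_SIZE_BYTES 64u	/* matches the 0x40 stride */

/* Hypothetical stand-in for CLZERO: zero one 64-byte line. */
static void clzero_model(unsigned char *line)
{
	memset(line, 0, LINE_SIZE_BYTES);
}

/*
 * Model of the clear_page_clzero() loop (not the kernel code).
 * Returns the number of 64-byte lines that were cleared.
 */
static unsigned int clear_page_clzero_model(void *page)
{
	unsigned char *addr = page;			/* plays the role of %rax */
	unsigned int remaining = PAGE_SIZE_BYTES;	/* plays the role of %ecx */
	unsigned int lines = 0;

	assert(page != NULL);

	do {
		clzero_model(addr);
		addr += LINE_SIZE_BYTES;		/* addq $0x40, %rax */
		remaining -= LINE_SIZE_BYTES;		/* sub $0x40, %ecx */
		lines++;
	} while (remaining > 0);			/* ja .Liter */

	return lines;
}
```

Called on a 4096-byte buffer, the model clears 64 lines and touches nothing
past the end of the page. In the real function the caller still has to issue
an sfence afterwards, which this model does not represent.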
Cc: jon.grimm@....com
Signed-off-by: Ankur Arora <ankur.a.arora@...cle.com>
---
arch/x86/include/asm/page_64.h | 1 +
arch/x86/lib/clear_page_64.S | 19 +++++++++++++++++++
2 files changed, 20 insertions(+)
diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h
index cfb95069cf9e..3c53f8ef8818 100644
--- a/arch/x86/include/asm/page_64.h
+++ b/arch/x86/include/asm/page_64.h
@@ -44,6 +44,7 @@ void clear_page_orig(void *page);
void clear_page_rep(void *page);
void clear_page_erms(void *page);
void clear_page_movnt(void *page);
+void clear_page_clzero(void *page);
static inline void clear_page(void *page)
{
diff --git a/arch/x86/lib/clear_page_64.S b/arch/x86/lib/clear_page_64.S
index 578f40db0716..1cb29a4454e1 100644
--- a/arch/x86/lib/clear_page_64.S
+++ b/arch/x86/lib/clear_page_64.S
@@ -76,3 +76,22 @@ SYM_FUNC_START(clear_page_movnt)
ja .Lstart
ret
SYM_FUNC_END(clear_page_movnt)
+
+/*
+ * Zero a page using clzero (on AMD).
+ * %rdi - page
+ *
+ * Caller needs to issue an sfence at the end.
+ */
+SYM_FUNC_START(clear_page_clzero)
+ movl $4096,%ecx
+ movq %rdi,%rax
+
+ .p2align 4
+.Liter:
+ clzero
+ addq $0x40, %rax
+ sub $0x40, %ecx
+ ja .Liter
+ ret
+SYM_FUNC_END(clear_page_clzero)
--
2.29.2