[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20240429104330.3636113-3-dmitrii.kuvaiskii@intel.com>
Date: Mon, 29 Apr 2024 03:43:30 -0700
From: Dmitrii Kuvaiskii <dmitrii.kuvaiskii@...el.com>
To: dave.hansen@...ux.intel.com,
jarkko@...nel.org,
kai.huang@...el.com,
haitao.huang@...ux.intel.com,
reinette.chatre@...el.com,
linux-sgx@...r.kernel.org,
linux-kernel@...r.kernel.org
Cc: mona.vij@...el.com,
kailun.qin@...el.com,
stable@...r.kernel.org
Subject: [PATCH 2/2] x86/sgx: Resolve EREMOVE page vs EAUG page data race
Two enclave threads may try to add and remove the same enclave page
simultaneously (e.g., if the SGX runtime supports both lazy allocation
and `MADV_DONTNEED` semantics). Consider this race:
1. T1 performs page removal in sgx_encl_remove_pages() and stops right
after removing the page table entry and right before re-acquiring the
enclave lock to EREMOVE and xa_erase(&encl->page_array) the page.
2. T2 tries to access the page, and #PF[not_present] is raised. The
condition to EAUG in sgx_vma_fault() is not satisfied because the
page is still present in encl->page_array, thus the SGX driver
assumes that the fault happened because the page was swapped out. The
driver continues on a code path that installs a page table entry
*without* performing EAUG.
3. The enclave page metadata is in inconsistent state: the PTE is
installed but there was no EAUG. Thus, T2 in userspace infinitely
receives SIGSEGV on this page (and EACCEPT always fails).
Fix this by making sure that T1 (the page-removing thread) always wins
this data race. In particular, the page-being-removed is marked as such,
and T2 retries until the page is fully removed.
Fixes: 9849bb27152c ("x86/sgx: Support complete page removal")
Cc: stable@...r.kernel.org
Signed-off-by: Dmitrii Kuvaiskii <dmitrii.kuvaiskii@...el.com>
---
arch/x86/kernel/cpu/sgx/encl.c | 3 ++-
arch/x86/kernel/cpu/sgx/encl.h | 3 +++
arch/x86/kernel/cpu/sgx/ioctl.c | 1 +
3 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 41f14b1a3025..7ccd8b2fce5f 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -257,7 +257,8 @@ static struct sgx_encl_page *__sgx_encl_load_page(struct sgx_encl *encl,
/* Entry successfully located. */
if (entry->epc_page) {
- if (entry->desc & SGX_ENCL_PAGE_BEING_RECLAIMED)
+ if (entry->desc & (SGX_ENCL_PAGE_BEING_RECLAIMED |
+ SGX_ENCL_PAGE_BEING_REMOVED))
return ERR_PTR(-EBUSY);
return entry;
diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
index f94ff14c9486..fff5f2293ae7 100644
--- a/arch/x86/kernel/cpu/sgx/encl.h
+++ b/arch/x86/kernel/cpu/sgx/encl.h
@@ -25,6 +25,9 @@
/* 'desc' bit marking that the page is being reclaimed. */
#define SGX_ENCL_PAGE_BEING_RECLAIMED BIT(3)
+/* 'desc' bit marking that the page is being removed. */
+#define SGX_ENCL_PAGE_BEING_REMOVED BIT(2)
+
struct sgx_encl_page {
unsigned long desc;
unsigned long vm_max_prot_bits:8;
diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
index b65ab214bdf5..c542d4dd3e64 100644
--- a/arch/x86/kernel/cpu/sgx/ioctl.c
+++ b/arch/x86/kernel/cpu/sgx/ioctl.c
@@ -1142,6 +1142,7 @@ static long sgx_encl_remove_pages(struct sgx_encl *encl,
* Do not keep encl->lock because of dependency on
* mmap_lock acquired in sgx_zap_enclave_ptes().
*/
+ entry->desc |= SGX_ENCL_PAGE_BEING_REMOVED;
mutex_unlock(&encl->lock);
sgx_zap_enclave_ptes(encl, addr);
--
2.34.1
Powered by blists - more mailing lists