Message-ID: <YQHhd0qKZqMCWqks@google.com>
Date: Wed, 28 Jul 2021 23:00:07 +0000
From: Sean Christopherson <seanjc@...gle.com>
To: Dave Hansen <dave.hansen@...el.com>
Cc: Tony Luck <tony.luck@...el.com>,
Jarkko Sakkinen <jarkko@...nel.org>, x86@...nel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 4/7] x86/sgx: Add SGX infrastructure to recover from poison
On Wed, Jul 28, 2021, Dave Hansen wrote:
> On 7/28/21 1:46 PM, Tony Luck wrote:
> > +int sgx_memory_failure(unsigned long pfn, int flags)
> > +{
> ...
> > + page->flags |= SGX_EPC_PAGE_POISON;
>
> Is this safe outside of any locks?
It's safe outside of sgx_reclaimer_lock iff this can guarantee nothing else can
reach the page. I'm pretty sure that doesn't hold true here.
> I see the reclaimer doing things like:
>
> epc_page->flags &= ~SGX_EPC_PAGE_RECLAIMER_TRACKED;
>
> I'd worry that this code and other non-atomic epc_page->flags
> manipulation could trample on each other.
>
> This might need to use some atomic bit manipulation *and* convert all
> the other epc_page->flags users.
I don't think atomics would be sufficient, as that would still leave all sorts of
possible races open. E.g. this new code in __sgx_sanitize_pages()
page = list_first_entry(dirty_page_list, struct sgx_epc_page, list);
+ if (page->flags & SGX_EPC_PAGE_POISON) {
+ list_del(&page->list);
+ continue;
+ }
+
*** HERE ***
ret = __eremove(sgx_get_epc_virt_addr(page));
could attempt EREMOVE on a freshly POISONed page. That appears to be "benign"
since ENCLS is wrapped with _ASM_EXTABLE_FAULT, but it feels wrong to add a check
that we know can race.
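To illustrate why atomic flag updates alone don't close that window, here is a
minimal user-space sketch, not kernel code. The names PAGE_POISON, struct page,
poison_page(), use_page_checked() and the window_hook callback are all
hypothetical stand-ins; the hook simply simulates another CPU running between
the check and the use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGE_POISON 0x1u /* hypothetical stand-in for SGX_EPC_PAGE_POISON */

struct page {
	atomic_uint flags;
};

/* Hypothetical hook: runs in the window between the check and the use. */
typedef void (*window_hook)(struct page *);

/* Stand-in for the poison path: an atomic update, but no lock held. */
static void poison_page(struct page *p)
{
	atomic_fetch_or(&p->flags, PAGE_POISON);
}

/*
 * Check-then-act using only atomic flag accesses. Returns true if the
 * "use" step ran on a page that was already poisoned when it ran.
 */
static bool use_page_checked(struct page *p, window_hook hook)
{
	if (atomic_load(&p->flags) & PAGE_POISON)
		return false; /* poison seen in time, page skipped */

	if (hook)
		hook(p); /* another CPU may set POISON right here */

	/* The "use" (EREMOVE in the hunk above) would happen at this point. */
	return (atomic_load(&p->flags) & PAGE_POISON) != 0;
}
```

Calling use_page_checked(&p, poison_page) on a clean page returns true: each
flag access is atomic, yet the page is still used after being poisoned, because
nothing ties the check and the use together.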
And similar races for allocation/free could hand out a poisoned page or add one
to the free list.
@@ -585,6 +600,10 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim)
for ( ; ; ) {
page = __sgx_alloc_epc_page();
+
+ if (page->flags & SGX_EPC_PAGE_POISON)
+ continue;
*** HERE ***
+
@@ -630,7 +651,8 @@ void sgx_free_epc_page(struct sgx_epc_page *page)
spin_lock(&node->lock);
page->owner = NULL;
- list_add_tail(&page->list, &node->free_page_list);
+ if (!(page->flags & SGX_EPC_PAGE_POISON))
*** HERE ***
+ list_add_tail(&page->list, &node->free_page_list);
Setting POISON and hoping we eventually notice doesn't sound robust. Maybe some
of these races are unavoidable due to the nature of #MC delivery, but I would hope
the kernel can at least avoid handing out a poisoned page to a different enclave.
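One possible direction, sketched in user space and under the assumption that
the poison path is allowed to take the same lock as alloc/free (I haven't
verified that holds for the real handler): set the flag and fix up the free
list in a single critical section. Everything here is hypothetical, including
struct node, PAGE_FREE, the atomic_flag spinlock, and the node_*() helpers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define PAGE_POISON 0x1u /* hypothetical stand-in for SGX_EPC_PAGE_POISON */
#define PAGE_FREE   0x2u /* hypothetical: page currently on the free list */

struct page {
	unsigned int flags; /* only touched with node->lock held */
	struct page *next;
};

struct node {
	atomic_flag lock; /* toy spinlock standing in for node->lock */
	struct page *free_list;
};

static void node_lock(struct node *n)
{
	while (atomic_flag_test_and_set_explicit(&n->lock, memory_order_acquire))
		; /* spin */
}

static void node_unlock(struct node *n)
{
	atomic_flag_clear_explicit(&n->lock, memory_order_release);
}

/* Free path: a poisoned page is never put back on the list. */
static void node_free_page(struct node *n, struct page *p)
{
	node_lock(n);
	if (!(p->flags & PAGE_POISON)) {
		p->next = n->free_list;
		n->free_list = p;
		p->flags |= PAGE_FREE;
	}
	node_unlock(n);
}

/* Alloc path: no POISON check needed, poisoned pages are never listed. */
static struct page *node_alloc_page(struct node *n)
{
	struct page *p;

	node_lock(n);
	p = n->free_list;
	if (p) {
		n->free_list = p->next;
		p->next = NULL;
		p->flags &= ~PAGE_FREE;
	}
	node_unlock(n);
	return p;
}

/* Poison path: mark the page and unlink it if it is currently free. */
static void node_poison_page(struct node *n, struct page *p)
{
	node_lock(n);
	p->flags |= PAGE_POISON;
	if (p->flags & PAGE_FREE) {
		for (struct page **pp = &n->free_list; *pp; pp = &(*pp)->next) {
			if (*pp == p) {
				*pp = p->next;
				break;
			}
		}
		p->next = NULL;
		p->flags &= ~PAGE_FREE;
	}
	node_unlock(n);
}
```

With this shape a page poisoned while free is pulled off the list immediately,
and a page poisoned while in use is dropped when it is freed. It does nothing
for an enclave that already owns the page when the poison arrives.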