Message-ID: <20181217180957.GC12491@linux.intel.com>
Date: Mon, 17 Dec 2018 10:09:57 -0800
From: Sean Christopherson <sean.j.christopherson@...el.com>
To: Jarkko Sakkinen <jarkko.sakkinen@...ux.intel.com>
Cc: "Dr. Greg" <greg@...ellic.com>,
Andy Lutomirski <luto@...capital.net>,
Andy Lutomirski <luto@...nel.org>, X86 ML <x86@...nel.org>,
Platform Driver <platform-driver-x86@...r.kernel.org>,
linux-sgx@...r.kernel.org, Dave Hansen <dave.hansen@...el.com>,
nhorman@...hat.com, npmccallum@...hat.com,
"Ayoun, Serge" <serge.ayoun@...el.com>, shay.katz-zamir@...el.com,
haitao.huang@...ux.intel.com,
Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
Thomas Gleixner <tglx@...utronix.de>,
"Svahn, Kai" <kai.svahn@...el.com>, mark.shanahan@...el.com,
Suresh Siddha <suresh.b.siddha@...el.com>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
"H. Peter Anvin" <hpa@...or.com>,
Darren Hart <dvhart@...radead.org>,
Andy Shevchenko <andy@...radead.org>,
LKML <linux-kernel@...r.kernel.org>, jethro@...tanix.com
Subject: Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

On Mon, Dec 17, 2018 at 07:49:35PM +0200, Jarkko Sakkinen wrote:
> On Mon, Dec 17, 2018 at 09:31:06AM -0800, Sean Christopherson wrote:
> > This doesn't work as-is. sgx_encl_release() needs to use sgx_free_page()
> > and not __sgx_free_page() so that we get a WARN() if the page can't be
> > freed. sgx_invalidate() needs to use __sgx_free_page() as freeing a page
> > can fail due to running concurrently with reclaim. I'll play around with
> > the code a bit, there's probably a fairly clean way to share code between
> > the two flows.
>
> Hmm... but why issue a warning in that case? It should be legit
> behaviour.

No, EREMOVE should never fail if the enclave is being released, i.e. all
references to the enclave are gone. And failure during sgx_encl_release()
means we leaked an EPC page, which warrants a WARN.

The only legitimate reason __sgx_free_page() can fail in sgx_invalidate()
is that a page might be in the process of being reclaimed. We could
theoretically WARN on EREMOVE failure in that case, but it'd make the code
a little fragile and it's not "fatal" in the sense that we get a second
chance to free the page during sgx_encl_release().

And unless I missed something, using sgx_invalidate() means we're leaking
all sgx_encl_page structs as well as the radix tree entries.
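
As a rough illustration of the split described above, here is a userspace
mock, not the driver code. Every name in it (mock_epc_page,
mock_try_free_page(), mock_free_page(), mock_invalidate(), mock_release())
is hypothetical, and the -1 return value is an assumption. The idea is
only that the invalidate flow tolerates a failed free while the release
flow treats it as a leak and warns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for an EPC page; not the driver's real struct. */
struct mock_epc_page {
	bool being_reclaimed;	/* models a concurrent reclaim holding the page */
	bool freed;
};

static int mock_warn_count;	/* counts how often the WARN path fired */

/*
 * Models the role of __sgx_free_page(): returns 0 on success, nonzero if
 * the page cannot be freed right now (here: reclaim in progress).
 */
static int mock_try_free_page(struct mock_epc_page *page)
{
	assert(page != NULL);
	if (page->being_reclaimed)
		return -1;	/* assumed error value, for illustration only */
	page->freed = true;
	return 0;
}

/* Models the role of sgx_free_page(): a failed free is a bug, so warn. */
static void mock_free_page(struct mock_epc_page *page)
{
	if (mock_try_free_page(page)) {
		mock_warn_count++;
		fprintf(stderr, "WARN: leaked EPC page %p\n", (void *)page);
	}
}

/*
 * Invalidate flow: failure is tolerated because release gets a second
 * chance. Returns the number of pages left for the release flow.
 */
static int mock_invalidate(struct mock_epc_page *pages, int n)
{
	int deferred = 0;

	for (int i = 0; i < n; i++)
		if (!pages[i].freed && mock_try_free_page(&pages[i]))
			deferred++;
	return deferred;
}

/* Release flow: all references are gone, so every remaining page must free. */
static void mock_release(struct mock_epc_page *pages, int n)
{
	for (int i = 0; i < n; i++)
		if (!pages[i].freed)
			mock_free_page(&pages[i]);
}
```

In this mock a page skipped by mock_invalidate() is simply picked up again
by mock_release(), and the warning counter only moves if the page still
cannot be freed at that point.
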
> > sgx_encl_release_worker() calls do_unmap() without checking the validity
> > of the page tables[1]. As is, the code doesn't even guarantee mm_struct
> > itself is valid.
> >
> > The easiest fix I can think of is to add a SGX_ENCL_MM_RELEASED flag
> > that is set along with SGX_ENCL_DEAD in sgx_mmu_notifier_release(), and
> > only call do_unmap() if SGX_ENCL_MM_RELEASED is false. Note that this
> > means we can't unregister the mmu_notifier until after do_unmap(), but
> > that's true no matter what since we're relying on the mmu_notifier to
> > hold a reference to mm_struct. Patch attached.
>
> OK, the fix makes sense, but would it be a better idea to just set mm
> to NULL and check that instead?

That makes sense.
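
A minimal sketch of the NULL-mm variant, again as a userspace mock rather
than the actual patch. The names mock_mm, mock_encl,
mock_mmu_notifier_release() and mock_release_worker() are hypothetical,
and the sketch ignores the locking and ordering between the notifier and
the worker that real code would have to handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for mm_struct; only counts unmap attempts. */
struct mock_mm {
	int unmap_calls;
};

/* Hypothetical stand-in for the enclave struct. */
struct mock_encl {
	bool dead;		/* models SGX_ENCL_DEAD */
	struct mock_mm *mm;	/* NULL once the mm has been released */
};

/*
 * Models the role of sgx_mmu_notifier_release(): mark the enclave dead
 * and clear the mm pointer instead of setting a separate flag.
 */
static void mock_mmu_notifier_release(struct mock_encl *encl)
{
	assert(encl != NULL);
	encl->dead = true;
	encl->mm = NULL;
}

/*
 * Models the do_unmap() step of the release worker: the page tables are
 * only touched if the mm is still valid. Returns true if an unmap was
 * attempted.
 */
static bool mock_release_worker(struct mock_encl *encl)
{
	assert(encl != NULL);
	if (!encl->mm)
		return false;
	encl->mm->unmap_calls++;	/* stands in for do_unmap() */
	return true;
}
```

The NULL check carries the same information as a dedicated
"mm released" flag would, with one less piece of state to keep in sync.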