Message-Id: <1B23E216-0229-4BDD-8B09-807256A54AF5@amacapital.net>
Date: Fri, 18 Sep 2020 17:15:32 -0700
From: Andy Lutomirski <luto@...capital.net>
To: Sean Christopherson <sean.j.christopherson@...el.com>
Cc: Andy Lutomirski <luto@...nel.org>,
Jarkko Sakkinen <jarkko.sakkinen@...ux.intel.com>,
X86 ML <x86@...nel.org>, linux-sgx@...r.kernel.org,
LKML <linux-kernel@...r.kernel.org>,
Linux-MM <linux-mm@...ck.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Matthew Wilcox <willy@...radead.org>,
Jethro Beekman <jethro@...tanix.com>,
Darren Kenny <darren.kenny@...cle.com>,
Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
asapek@...gle.com, Borislav Petkov <bp@...en8.de>,
"Xing, Cedric" <cedric.xing@...el.com>, chenalexchen@...gle.com,
Conrad Parker <conradparker@...gle.com>, cyhanish@...gle.com,
Dave Hansen <dave.hansen@...el.com>,
"Huang, Haitao" <haitao.huang@...el.com>,
Josh Triplett <josh@...htriplett.org>,
"Huang, Kai" <kai.huang@...el.com>,
"Svahn, Kai" <kai.svahn@...el.com>, Keith Moyer <kmoy@...gle.com>,
Christian Ludloff <ludloff@...gle.com>,
Neil Horman <nhorman@...hat.com>,
Nathaniel McCallum <npmccallum@...hat.com>,
Patrick Uiterwijk <puiterwijk@...hat.com>,
David Rientjes <rientjes@...gle.com>,
Thomas Gleixner <tglx@...utronix.de>, yaozhangx@...gle.com
Subject: Re: [PATCH v38 10/24] mm: Add vm_ops->mprotect()
> On Sep 18, 2020, at 4:53 PM, Sean Christopherson <sean.j.christopherson@...el.com> wrote:
>
> On Fri, Sep 18, 2020 at 08:09:04AM -0700, Andy Lutomirski wrote:
>>> On Tue, Sep 15, 2020 at 4:28 AM Jarkko Sakkinen
>>> <jarkko.sakkinen@...ux.intel.com> wrote:
>>>
>>> From: Sean Christopherson <sean.j.christopherson@...el.com>
>>>
>>> Add vm_ops()->mprotect() for additional constraints for a VMA.
>>>
>>> Intel Software Guard eXtensions (SGX) will use this callback to add two
>>> constraints:
>>>
>>> 1. Verify that the address range does not have holes: each page address
>>> must be filled with an enclave page.
>>> 2. Verify that VMA permissions won't surpass the permissions of any enclave
>>> page within the address range. Enclaves have cryptographically sealed
>>> permissions for each page address that set the upper limit for possible
>>> VMA permissions. Not respecting this can cause #GP's to be emitted.
>
> Side note, #GP is wrong. EPCM violations are #PFs. Skylake CPUs #GP, but
> that's technically an erratum. But this isn't the real motivation, e.g.
> userspace can already trigger #GP/#PF by reading/writing a bad address, SGX
> simply adds another flavor.
>
>> It's been a while since I looked at this. Can you remind us: is this
>> just preventing userspace from shooting itself in the foot or is this
>> something more important?
>
> Something more important, it's used to prevent userspace from circumventing
> a noexec filesystem by loading code into an enclave, and to give the kernel the
> option of adding enclave specific LSM policies in the future.
>
> The source file (if one exists) for the enclave is long gone when the enclave
> is actually mmap()'d and mprotect()'d. To enforce noexec, the requested
> permissions for a given page are snapshotted when the page is added to the
> enclave, i.e. when the enclave is built. Enclave pages that will be executable
> must originate from a MAYEXEC VMA, e.g. the source page can't come from a
> noexec file system.
>
> The ->mprotect() hook allows SGX to reject mprotect() if userspace is declaring
> permissions beyond what is allowed, e.g. trying to map an enclave page with
> EXEC permissions when the page was added to the enclave without EXEC.
>
> Future LSM policies have a similar need due to vm_file always pointing at
> /dev/sgx/enclave, e.g. policies couldn't be attached to a specific enclave.
> ->mprotect() again allows enforcing permissions at map time that were checked
> at enclave build time, e.g. via an LSM hook.
>
> Deferring ->mprotect() until LSM support is added (if it ever is) would be
> problematic due to SGX2. With SGX2, userspace can extend permissions of an
> enclave page (for the CPU's EPC Map entry, not the kernel's page tables)
> without bouncing through the kernel. Without ->mprotect() enforcement,
> userspace could do EADD(RW) -> mprotect(RWX) -> EMODPE(X) to gain W+X. We
> want to disallow such a flow now, i.e. force userspace to do EADD(RW,X), so
> that the hypothetical LSM hook would have all information at EADD(), i.e.
> would be aware of the EXEC permission, without creating divergent behavior
> based on whether or not an LSM is active.
That’s what I thought. Can we get this in the changelog?
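
For the changelog, the two checks described above (no holes in the range, and requested permissions never exceeding what was snapshotted when the page was added) could be illustrated with a rough userspace model like the one below. This is only a sketch under stated assumptions: every name here (model_enclave, model_add_page, model_may_mprotect, the MODEL_* bits) is hypothetical, the fixed-size array stands in for whatever the driver really uses to track pages, and the choice of -EACCES for a rejection is an assumption of the model, not a statement about the actual SGX code.

```c
#include <stddef.h>
#include <errno.h>

/* Hypothetical permission bits, standing in for PROT_READ/WRITE/EXEC. */
#define MODEL_READ  0x1u
#define MODEL_WRITE 0x2u
#define MODEL_EXEC  0x4u

#define MODEL_NPAGES 8

/* One slot per page address; present == 0 models a hole in the enclave. */
struct model_page {
	int present;
	unsigned int max_prot;	/* permissions snapshotted at build time */
};

struct model_enclave {
	struct model_page pages[MODEL_NPAGES];
};

/* Model of adding a page: record it and the permissions requested then. */
static int model_add_page(struct model_enclave *e, size_t idx,
			  unsigned int prot)
{
	if (idx >= MODEL_NPAGES || e->pages[idx].present)
		return -EINVAL;

	e->pages[idx].present = 1;
	e->pages[idx].max_prot = prot;
	return 0;
}

/*
 * Model of the ->mprotect() check over page indices [start, end):
 * reject if any page is missing, or if the requested permissions
 * contain a bit that was not granted when the page was added.
 */
static int model_may_mprotect(const struct model_enclave *e, size_t start,
			      size_t end, unsigned int prot)
{
	size_t i;

	if (start >= end || end > MODEL_NPAGES)
		return -EINVAL;

	for (i = start; i < end; i++) {
		if (!e->pages[i].present)
			return -EACCES;	/* hole in the range */
		if (prot & ~e->pages[i].max_prot)
			return -EACCES;	/* exceeds build-time permissions */
	}

	return 0;
}
```

In this model, a page added as RW can be mapped RW but a later request for RWX on it is refused, which is the EADD(RW) -> mprotect(RWX) step of the flow Sean describes. A page added as RW,X passes the same request.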