[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <op.1il4zxeqwjvjmi@hhuan26-mobl1.mshome.net>
Date: Sun, 06 Mar 2022 08:24:59 -0600
From: "Haitao Huang" <haitao.huang@...ux.intel.com>
To: "Jarkko Sakkinen" <jarkko@...nel.org>
Cc: "Reinette Chatre" <reinette.chatre@...el.com>,
"Dave Hansen" <dave.hansen@...el.com>,
"Dhanraj, Vijay" <vijay.dhanraj@...el.com>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"bp@...en8.de" <bp@...en8.de>,
"Lutomirski, Andy" <luto@...nel.org>,
"mingo@...hat.com" <mingo@...hat.com>,
"linux-sgx@...r.kernel.org" <linux-sgx@...r.kernel.org>,
"x86@...nel.org" <x86@...nel.org>,
"Christopherson,, Sean" <seanjc@...gle.com>,
"Huang, Kai" <kai.huang@...el.com>,
"Zhang, Cathy" <cathy.zhang@...el.com>,
"Xing, Cedric" <cedric.xing@...el.com>,
"Huang, Haitao" <haitao.huang@...el.com>,
"Shanahan, Mark" <mark.shanahan@...el.com>,
"hpa@...or.com" <hpa@...or.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH V2 16/32] x86/sgx: Support restricting of enclave page
permissions
On Fri, 04 Mar 2022 19:02:28 -0600, Jarkko Sakkinen <jarkko@...nel.org>
wrote:
> On Fri, Mar 04, 2022 at 09:51:22AM -0600, Haitao Huang wrote:
>> Hi Jarkko
>>
>> On Fri, 04 Mar 2022 02:30:22 -0600, Jarkko Sakkinen <jarkko@...nel.org>
>> wrote:
>>
>> > On Thu, Mar 03, 2022 at 10:03:30PM -0600, Haitao Huang wrote:
>> > >
>> > > On Thu, 03 Mar 2022 17:18:33 -0600, Jarkko Sakkinen
>> <jarkko@...nel.org>
>> > > wrote:
>> > >
>> > > > On Thu, Mar 03, 2022 at 10:08:14AM -0600, Haitao Huang wrote:
>> > > > > Hi all,
>> > > > >
>> > > > > On Wed, 02 Mar 2022 16:57:45 -0600, Reinette Chatre
>> > > > > <reinette.chatre@...el.com> wrote:
>> > > > >
>> > > > > > Hi Jarkko,
>> > > > > >
>> > > > > > On 3/1/2022 6:05 PM, Jarkko Sakkinen wrote:
>> > > > > > > On Tue, Mar 01, 2022 at 09:48:48AM -0800, Reinette Chatre
>> wrote:
>> > > > > > > > Hi Jarkko,
>> > > > > > > >
>> > > > > > > > On 3/1/2022 5:42 AM, Jarkko Sakkinen wrote:
>> > > > > > > > > > With EACCEPTCOPY (kudos to Mark S. for reminding me of
>> > > > > > > > > > this version of
>> > > > > > > > > > EACCEPT @ chat.enarx.dev) it is possible to make R
>> and RX
>> > > > > pages but
>> > > > > > > > > > obviously new RX pages are now out of the picture:
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > /*
>> > > > > > > > > > * Adding a regular page that is architecturally
>> allowed
>> > > > > to only
>> > > > > > > > > > * be created with RW permissions.
>> > > > > > > > > > * TBD: Interface with user space policy to support
>> max
>> > > > > permissions
>> > > > > > > > > > * of RWX.
>> > > > > > > > > > */
>> > > > > > > > > > prot = PROT_READ | PROT_WRITE;
>> > > > > > > > > > encl_page->vm_run_prot_bits =
>> calc_vm_prot_bits(prot, 0);
>> > > > > > > > > > encl_page->vm_max_prot_bits =
>> > > encl_page->vm_run_prot_bits;
>> > > > > > > > > >
>> > > > > > > > > > If that TBD is left out to the final version the page
>> > > > > > > > > > augmentation has a
>> > > > > > > > > > risk of a API bottleneck, and that risk can realize
>> then
>> > > > > > > > > > also in the page
>> > > > > > > > > > permission ioctls.
>> > > > > > > > > >
>> > > > > > > > > > I.e. now any review comment is based on not fully
>> known
>> > > > > > > > > > territory, we have
>> > > > > > > > > > one known unknown, and some unknown unknowns from
>> > > > > > > > > > unpredictable effect to
>> > > > > > > > > > future API changes.
>> > > > > > > >
>> > > > > > > > The plan to complete the "TBD" in the above snippet was to
>> > > > > > > > follow this work
>> > > > > > > > with user policy integration at this location. On a high
>> level
>> > > > > > > > the plan was
>> > > > > > > > for this to look something like:
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > /*
>> > > > > > > > * Adding a regular page that is architecturally allowed
>> > > to only
>> > > > > > > > * be created with RW permissions.
>> > > > > > > > * Interface with user space policy to support max
>> > > permissions
>> > > > > > > > * of RWX.
>> > > > > > > > */
>> > > > > > > > prot = PROT_READ | PROT_WRITE;
>> > > > > > > > encl_page->vm_run_prot_bits = calc_vm_prot_bits(prot,
>> 0);
>> > > > > > > >
>> > > > > > > > if (user space policy allows RWX on dynamically
>> added
>> > > > > pages)
>> > > > > > > > encl_page->vm_max_prot_bits =
>> calc_vm_prot_bits(PROT_READ |
>> > > > > > > > PROT_WRITE | PROT_EXEC, 0);
>> > > > > > > > else
>> > > > > > > > encl_page->vm_max_prot_bits =
>> calc_vm_prot_bits(PROT_READ |
>> > > > > > > > PROT_WRITE, 0);
>> > > > > > > >
>> > > > > > > > The work that follows this series aimed to do the
>> integration
>> > > > > with user
>> > > > > > > > space policy.
>> > > > > > >
>> > > > > > > What do you mean by "user space policy" anyway exactly? I'm
>> > > > > sorry but I
>> > > > > > > just don't fully understand this.
>> > > > > >
>> > > > > > My apologies - I just assumed that you would need no reminder
>> > > > > about this
>> > > > > > contentious
>> > > > > > part of SGX history. Essentially it means that, yes, the
>> > > kernel could
>> > > > > > theoretically
>> > > > > > permit any kind of access to any file/page, but some accesses
>> are
>> > > > > known
>> > > > > > to generally
>> > > > > > be a bad idea - like making memory executable as well as
>> writable
>> > > > > - and
>> > > > > > thus there
>> > > > > > are additional checks based on what user space permits before
>> the
>> > > > > kernel
>> > > > > > allows
>> > > > > > such accesses.
>> > > > > >
>> > > > > > For example,
>> > > > > > mm/mprotect.c:do_mprotect_pkey()->security_file_mprotect()
>> > > > > >
>> > > > > > User policy and SGX has seen significant discussion. Some
>> notable
>> > > > > > threads:
>> > > > > >
>> https://lore.kernel.org/linux-security-module/CALCETrXf8mSK45h7sTK5Wf+pXLVn=Bjsc_RLpgO-h-qdzBRo5Q@mail.gmail.com/
>> > > > > >
>> https://lore.kernel.org/linux-security-module/20190619222401.14942-1-sean.j.christopherson@intel.com/
>> > > > > >
>> > > > > > > It's too big of a risk to accept this series without X taken
>> > > care
>> > > > > > > of. Patch
>> > > > > > > series should neither have TODO nor TBD comments IMHO. I
>> > > don't want
>> > > > > > > to ack
>> > > > > > > a series based on speculation what might happen in the
>> future.
>> > > > > >
>> > > > > > ok
>> > > > > >
>> > > > > > >
>> > > > > > > > > I think the best way to move forward would be to do
>> EAUG's
>> > > > > > > > > explicitly with
>> > > > > > > > > an ioctl that could also include secinfo for
>> permissions.
>> > > > > Then you can
>> > > > > > > > > easily do the rest with EACCEPTCOPY inside the enclave.
>> > > > > > > >
>> > > > > > > > SGX_IOC_ENCLAVE_ADD_PAGES already exists and could
>> possibly be
>> > > > > used for
>> > > > > > > > this purpose. It already includes SECINFO which may also
>> be
>> > > > > useful if
>> > > > > > > > needing to later support EAUG of PT_SS* pages.
>> > > > > > >
>> > > > > > > You could also simply add SGX_IOC_ENCLAVE_AUGMENT_PAGES and
>> > > call it
>> > > > > > > a day.
>> > > > > >
>> > > > > > I could, yes.
>> > > > > >
>> > > > > > > And if there is plan to extend SGX_IOC_ENCLAVE_ADD_PAGES
>> what is
>> > > > > > > this weird
>> > > > > > > thing added to the #PF handler? Why is it added at all then?
>> > > > > >
>> > > > > > I was just speculating in my response, there is no plan to
>> extend
>> > > > > > SGX_IOC_ENCLAVE_ADD_PAGES (that I am aware of).
>> > > > > >
>> > > > > > > > How this could work is user space calls
>> > > SGX_IOC_ENCLAVE_ADD_PAGES
>> > > > > > > > after enclave initialization on any memory region within
>> the
>> > > > > > > > enclave where
>> > > > > > > > pages are planned to be added dynamically. This ioctl()
>> calls
>> > > > > > > > EAUG to add the
>> > > > > > > > new pages with RW permissions and their vm_max_prot_bits
>> > > can be
>> > > > > > > > set to the
>> > > > > > > > permissions found in the included SECINFO. This will
>> support
>> > > > > > > > later EACCEPTCOPY
>> > > > > > > > as well as SGX_IOC_ENCLAVE_RELAX_PERMISSIONS
>> > > > > > >
>> > > > > > > I don't like this type of re-use of the existing API.
>> > > > > >
>> > > > > > I could proceed with SGX_IOC_ENCLAVE_AUGMENT_PAGES if there is
>> > > > > consensus
>> > > > > > after
>> > > > > > considering the user policy question (above) and performance
>> > > trade-off
>> > > > > > (more below).
>> > > > > >
>> > > > > > >
>> > > > > > > > The big question is whether communicating user policy
>> after
>> > > > > > > > enclave initialization
>> > > > > > > > via the SECINFO within SGX_IOC_ENCLAVE_ADD_PAGES is
>> acceptable
>> > > > > > > > to all? I would
>> > > > > > > > appreciate a confirmation on this direction considering
>> the
>> > > > > > > > significant history
>> > > > > > > > behind this topic.
>> > > > > > >
>> > > > > > > I have no idea because I don't know what is user space
>> policy.
>> > > > > >
>> > > > > > This discussion is about some enclave usages needing RWX
>> > > permissions
>> > > > > > on dynamically added enclave pages. RWX permissions on
>> dynamically
>> > > > > added
>> > > > > > pages is
>> > > > > > not something that should blindly be allowed for all SGX
>> > > enclaves but
>> > > > > > instead the user
>> > > > > > needs to explicitly allow specific enclaves to have such
>> > > ability. This
>> > > > > > is equivalent
>> > > > > > to (but not the same as) what exists in Linux today with LSM.
>> As
>> > > > > seen in
>> > > > > > mm/mprotect.c:do_mprotect_pkey()->security_file_mprotect()
>> Linux
>> > > > > is able
>> > > > > > to make
>> > > > > > files and memory be both writable and executable, but it would
>> > > only do
>> > > > > > so for those
>> > > > > > files and memory that the LSM (which is how user policy is
>> > > > > communicated,
>> > > > > > like SELinux)
>> > > > > > indicates it is allowed, not blindly do so for all files and
>> all
>> > > > > memory.
>> > > > > >
>> > > > > > > > > Putting EAUG to the #PF handler and implicitly call it
>> just
>> > > > > > > > > too flakky and
>> > > > > > > > > hard to make deterministic for e.g. JIT compiler in our
>> use
>> > > > > > > > > case (not to
>> > > > > > > > > mention that JIT is not possible at all because
>> inability to
>> > > > > > > > > do RX pages).
>> > > > > >
>> > > > > > I understand how SGX_IOC_ENCLAVE_AUGMENT_PAGES can be more
>> > > > > deterministic
>> > > > > > but from
>> > > > > > what I understand it would have a performance impact since it
>> > > would
>> > > > > > require all memory
>> > > > > > that may be needed by the enclave be pre-allocated from
>> > > outside the
>> > > > > > enclave and not
>> > > > > > just dynamically allocated from within the enclave at the time
>> > > it is
>> > > > > > needed.
>> > > > > >
>> > > > > > Would such a performance impact be acceptable?
>> > > > > >
>> > > > >
>> > > > > User space won't always have enough info to decide whether the
>> pages
>> > > > > to be
>> > > > > EAUG'd immediately. In some cases (shared libraries, JVM for
>> > > > > example) lots
>> > > > > of code/data pages can be mapped but never actually touched. One
>> > > > > enclave/process does not know if any other more important
>> > > > > enclave/process
>> > > > > would need the EPC.
>> > > > >
>> > > > > It should be for kernel to make the final decision as it has
>> overall
>> > > > > picture
>> > > > > of the system EPC usage and availability.
>> > > >
>> > > > EAUG ioctl does not give better capabilities for user space to
>> waste
>> > > > EPC given that EADD ioctl already exists, i.e. your argument is
>> > > logically
>> > > > incorrect.
>> > >
>> > > The point of adding EAUG is to allow more efficient use of EPC
>> pages.
>> > > Without EAUG, enclaves have to EADD everything upfront into EPC,
>> > > consuming
>> > > predetermined number of EPC pages, some of which may not be used at
>> all.
>> > > With EAUG, enclaves should be able to load minimal pages to get
>> started,
>> > > pages added on #PF as they are actually accessed.
>> > >
>> > > Obviously as you pointed out, some usages make more sense to
>> > > pre-EAUG (EAUG
>> > > before #PF). But your proposal of supporting only pre-EAUG here
>> > > essentially
>> > > makes EAUG behave almost the same as EADD. If the current
>> > > implementation
>> > > with EAUG on #PF can also use MAP_POPULATE for pre-EAUG (seems
>> possible
>> > > based on Dave's comments), then it is flxible to cover all cases and
>> > > allow
>> > > kernel to optimize allocation of EPC pages.
>> >
>> > There is no even a working #PF based implementation in existance, and
>> > your
>> > argument has too many if's for my taste.
>>
>> 1) if you mean no user space is implementing this kind of solution, read
>> this section, otherwise, skip to 2) below which is only couple of
>> sentences.
>>
>> If you are willing to look, there is already implementation in our SDK
>> to do
>> heap and stack expansion on demand on #PF. Enclaves may not know
>> heap/stack
>> size up front, we have implemented these features to make EPC usage more
>> efficient. I don't know why normal processes can add RAM on #PF, but
>> enclaves adding EPC on #PF becomes so unacceptable concept to you. And
>> the
>> kernel does that for EPC swapping already when #PF happens on a swapped
>> out
>> EPC page.
>
> In adds O(n) round-trips for a mmap() emulation, which can be done in
> O(1)
> round-trips with a ioctl.
>
>> Our implementation has gone through several rounds, the latest is
>> here:https://github.com/intel/linux-sgx/tree/edmm_v2/sdk/emm. It was
>> also
>> implemented in original OOT driver based SDK implementation. Customers
>> are
>> using it and found them useful. I think this is a critical feature that
>> many
>> other runtimes will also need.
>
> I'm not sure what the common sense argument here is.
>
My (wrong) assumption was that you are disabling EAUG on #PF totally, and
all I was saying EAUG on #PF is critical for many usages and disabling it
requires good justification.
But you are expecting an ioctl call for each #PF for those usages:
https://lore.kernel.org/linux-sgx/YiK8NEnvgPerEdFB@iki.fi/#t. IIUC, that's
better than total disabling but less optimal. (I have not checked all call
sequences in detail to be sure it would work for all our cases)
>> 2)
>> It's OK for you to request additional support for your usage and I
>> agree it
>> is needed. But IMHO, totally getting rid of EAUG on #PF is bad and
>> unnecessary. Current implementation can be extended to support your
>> usage.
>> What's the reason you think MAP_POPULATE won't work for you?
>
> I do not recall taking stand on MAP_POPULATE.
Thanks for looking into that. Like I said, that should cover all usages.
Powered by blists - more mailing lists