linux-kernel - Re: [PATCH v17 18/23] platform/x86: Intel SGX driver


Message-ID: <20181218154417.GC28326@linux.intel.com>
Date:   Tue, 18 Dec 2018 07:44:18 -0800
From:   Sean Christopherson <sean.j.christopherson@...el.com>
To:     Andy Lutomirski <luto@...nel.org>
Cc:     Dave Hansen <dave.hansen@...el.com>,
        Jarkko Sakkinen <jarkko.sakkinen@...ux.intel.com>,
        X86 ML <x86@...nel.org>,
        Platform Driver <platform-driver-x86@...r.kernel.org>,
        linux-sgx@...r.kernel.org, nhorman@...hat.com,
        npmccallum@...hat.com, "Ayoun, Serge" <serge.ayoun@...el.com>,
        shay.katz-zamir@...el.com,
        Haitao Huang <haitao.huang@...ux.intel.com>,
        Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        "Svahn, Kai" <kai.svahn@...el.com>, mark.shanahan@...el.com,
        Suresh Siddha <suresh.b.siddha@...el.com>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        "H. Peter Anvin" <hpa@...or.com>,
        Darren Hart <dvhart@...radead.org>,
        Andy Shevchenko <andy@...radead.org>,
        "open list:X86 ARCHITECTURE (32-BIT AND 64-BIT)" 
        <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

On Mon, Dec 17, 2018 at 08:59:54PM -0800, Andy Lutomirski wrote:
> On Mon, Dec 17, 2018 at 2:20 PM Sean Christopherson
> <sean.j.christopherson@...el.com> wrote:
> >
> 
> > My brain is still sorting out the details, but I generally like the idea
> > of allocating an anon inode when creating an enclave, and exposing the
> > other ioctls() via the returned fd.  This is essentially the approach
> > used by KVM to manage multiple "layers" of ioctls across KVM itself, VMs
> > and vCPUs.  There are even similarities to accessing physical memory via
> > multiple disparate domains, e.g. host kernel, host userspace and guest.
> >
> 
> In my mind, opening /dev/sgx would give you the requisite inode.  I'm
> not 100% sure that the chardev infrastructure allows this, but I think
> it does.

My fd/inode knowledge is lacking, to say the least.  Whatever works, so
long as we have a way to uniquely identify enclaves.
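
For reference, the KVM layering mentioned above looks roughly like this
from userspace.  This is only a sketch of the existing KVM pattern, not a
proposal for the SGX ABI: the function name kvm_layered_fds() is made up
for illustration, and the code assumes a Linux host with the kernel uapi
headers installed.  Each ioctl() returns a new fd that scopes the next
layer of ioctls.

```c
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/kvm.h>

/*
 * Walk KVM's three ioctl layers: system fd -> VM fd -> vCPU fd.
 * Returns 0 on success, -1 if /dev/kvm is unavailable or a step fails.
 * (kvm_layered_fds is an illustrative name, not an existing helper.)
 */
int kvm_layered_fds(void)
{
	int ret = -1;
	int kvm_fd, vm_fd, vcpu_fd;

	/* Layer 1: the character device exposes system-wide ioctls. */
	kvm_fd = open("/dev/kvm", O_RDWR);
	if (kvm_fd < 0)
		return -1;

	if (ioctl(kvm_fd, KVM_GET_API_VERSION, 0) != KVM_API_VERSION)
		goto out_kvm;

	/* Layer 2: creating a VM returns a new fd that identifies it. */
	vm_fd = ioctl(kvm_fd, KVM_CREATE_VM, 0);
	if (vm_fd < 0)
		goto out_kvm;

	/* Layer 3: per-vCPU ioctls are issued on yet another fd. */
	vcpu_fd = ioctl(vm_fd, KVM_CREATE_VCPU, 0);
	if (vcpu_fd < 0)
		goto out_vm;

	ret = 0;
	close(vcpu_fd);
out_vm:
	close(vm_fd);
out_kvm:
	close(kvm_fd);
	return ret;
}
```

An enclave fd would play the role of vm_fd here: it uniquely identifies
the enclave, and the remaining ioctls would be issued against it.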

> > The only potential hiccup I can see is the build flow.  Currently,
> > EADD+EEXTEND is done via a work queue to avoid major performance issues
> > (10x regression) when userspace is building multiple enclaves in parallel
> > using goroutines to wrap Cgo (the issue might apply to any M:N scheduler,
> > but I've only confirmed the Golang case).  The issue is that allocating
> > an EPC page acts like a blocking syscall when the EPC is under pressure,
> > i.e. an EPC page isn't immediately available.  This causes Go's scheduler
> > to thrash and tank performance[1].
> 
> What's the issue, and how does a workqueue help?  I'm wondering if a
> nicer solution would be an ioctl to add lots of pages in a single
> call.

Adding pages via workqueue makes the ioctl itself fast enough to avoid
triggering Go's rescheduling.  A batched EADD flow would likely help,
I just haven't had the time to rework the userspace side to be able to
test the performance.
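
To make the batched idea concrete, here is a rough userspace-side sketch
of how a build could be split into a small number of requests.
Everything in it is hypothetical: struct add_pages_req, build_batches()
and the 4096-byte PAGE_SZ are made up for illustration and are not part
of the driver's ABI, and no ioctl is actually issued.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SZ 4096u	/* assumed page size for this sketch */

/* Hypothetical request layout, NOT the driver's ABI. */
struct add_pages_req {
	uint64_t enclave_addr;	/* first target address inside the enclave */
	uint64_t src_addr;	/* first source page in userspace */
	uint64_t nr_pages;	/* number of contiguous pages to add */
};

/*
 * Split total_pages into requests of at most max_batch pages each.
 * Returns the number of requests written to out[], or 0 if there is
 * nothing to do, max_batch is 0, or out[] (cap entries) is too small.
 */
size_t build_batches(uint64_t enclave_base, uint64_t src_base,
		     uint64_t total_pages, uint64_t max_batch,
		     struct add_pages_req *out, size_t cap)
{
	size_t n = 0;
	uint64_t done = 0;

	if (max_batch == 0)
		return 0;

	while (done < total_pages) {
		uint64_t chunk = total_pages - done;

		if (chunk > max_batch)
			chunk = max_batch;
		if (n == cap)
			return 0;

		out[n].enclave_addr = enclave_base + done * PAGE_SZ;
		out[n].src_addr = src_base + done * PAGE_SZ;
		out[n].nr_pages = chunk;
		n++;
		done += chunk;
	}
	return n;
}
```

With 1000 pages and a batch limit of 256, this yields four requests
instead of 1000 per-page calls.  Whether that is enough to stay under
Go's rescheduling threshold is exactly what still needs to be measured.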

> >
> > Alternatively, we could change the EADD+EEXTEND flow to not insert the
> > added page's PFN into the owner's process space, i.e. force userspace to
> > fault when it runs the enclave.  But that only delays the issue because
> > eventually we'll want to account EPC pages, i.e. add a cgroup, at which
> > point we'll likely need current->mm anyways.
> 
> You should be able to account the backing pages to a cgroup without
> actually sticking them into the EPC, no?  Or am I misunderstanding?  I
> guess we'll eventually want a cgroup to limit use of the limited EPC
> resources.

It's the latter, a cgroup to limit EPC.  The mm is used to retrieve the
cgroup without having to track e.g. the task_struct.