Message-ID: <7727faf5-1c13-f7f1-ede3-64cf131c7dc7@amazon.com>
Date: Mon, 31 Aug 2020 11:19:19 +0300
From: "Paraschiv, Andra-Irina" <andraprs@...zon.com>
To: Greg KH <gregkh@...uxfoundation.org>
CC: linux-kernel <linux-kernel@...r.kernel.org>,
Anthony Liguori <aliguori@...zon.com>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Colm MacCarthaigh <colmmacc@...zon.com>,
David Duncan <davdunc@...zon.com>,
Bjoern Doebel <doebel@...zon.de>,
David Woodhouse <dwmw@...zon.co.uk>,
"Frank van der Linden" <fllinden@...zon.com>,
Karen Noel <knoel@...hat.com>,
"Martin Pohlack" <mpohlack@...zon.de>,
Matt Wilson <msw@...zon.com>,
Paolo Bonzini <pbonzini@...hat.com>,
Balbir Singh <sblbir@...zon.com>,
Stefano Garzarella <sgarzare@...hat.com>,
Stefan Hajnoczi <stefanha@...hat.com>,
Stewart Smith <trawets@...zon.com>,
Uwe Dannowski <uwed@...zon.de>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
kvm <kvm@...r.kernel.org>,
ne-devel-upstream <ne-devel-upstream@...zon.com>,
Alexander Graf <graf@...zon.de>
Subject: Re: [PATCH v7 00/18] Add support for Nitro Enclaves
On 19/08/2020 14:26, Greg KH wrote:
>
> On Wed, Aug 19, 2020 at 01:15:59PM +0200, Alexander Graf wrote:
>>
>> On 17.08.20 15:09, Andra Paraschiv wrote:
>>> Nitro Enclaves (NE) is a new Amazon Elastic Compute Cloud (EC2) capability
>>> that allows customers to carve out isolated compute environments within EC2
>>> instances [1].
>>>
>>> For example, an application that processes sensitive data and runs in a VM
>>> can be separated from other applications running in the same VM. This
>>> application then runs in a VM separate from the primary VM, namely an enclave.
>>>
>>> An enclave runs alongside the VM that spawned it. This setup matches the needs
>>> of low-latency applications. The resources that are allocated for the enclave,
>>> such as memory and CPUs, are carved out of the primary VM. Each enclave is mapped
>>> to a process running in the primary VM that communicates with the NE driver via
>>> an ioctl interface.
>>>
>>> In this sense, there are two components:
>>>
>>> 1. An enclave abstraction process - a user space process running in the primary
>>> VM guest that uses the provided ioctl interface of the NE driver to spawn an
>>> enclave VM (that's 2 below).
>>>
>>> There is an NE emulated PCI device exposed to the primary VM. The driver for this
>>> new PCI device is included in the NE driver.
>>>
>>> The ioctl logic is mapped to PCI device commands e.g. the NE_START_ENCLAVE ioctl
>>> maps to an enclave start PCI command. The PCI device commands are then
>>> translated into actions taken on the hypervisor side; that's the Nitro
>>> hypervisor running on the host where the primary VM is running. The Nitro
>>> hypervisor is based on core KVM technology.
>>>
>>> 2. The enclave itself - a VM running on the same host as the primary VM that
>>> spawned it. Memory and CPUs are carved out of the primary VM and are dedicated
>>> for the enclave VM. An enclave does not have persistent storage attached.
>>>
>>> The memory regions carved out of the primary VM and given to an enclave need to
>>> be aligned, physically contiguous regions of 2 MiB or 1 GiB (or a multiple of
>>> that size, e.g. 8 MiB). The memory can be allocated e.g. by using hugetlbfs from
>>> user space [2][3]. The memory size for an enclave needs to be at least 64 MiB.
>>> The enclave memory and CPUs need to be from the same NUMA node.
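
As a rough illustration of the size and alignment rules quoted above, the following sketch checks a list of candidate regions. It is not taken from the patch series: the function names and the (address, size) tuple layout are hypothetical, and the check deliberately ignores physical contiguity and NUMA placement, which cannot be derived from these numbers alone.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB
MIN_ENCLAVE_MEM = 64 * MIB  # minimum total enclave memory stated in the cover letter


def region_is_valid(addr, size, page=2 * MIB):
    """Hypothetical helper: start aligned to `page`, size a non-zero multiple of it."""
    return size > 0 and addr % page == 0 and size % page == 0


def regions_are_valid(regions, page=2 * MIB):
    """Check every (addr, size) pair and the 64 MiB minimum for the total.

    Physical contiguity and NUMA node membership are NOT checked here;
    that is an intentional simplification of this sketch.
    """
    if not all(region_is_valid(addr, size, page) for addr, size in regions):
        return False
    return sum(size for _, size in regions) >= MIN_ENCLAVE_MEM
```

For 1 GiB pages the same helpers can be called with `page=GIB`.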
>>>
>>> An enclave runs on dedicated cores. CPU 0 and its CPU siblings need to remain
>>> available for the primary VM. A CPU pool has to be set for NE purposes by a
>>> user with admin capability. See the cpu list section of the kernel
>>> documentation [4] for the CPU pool format.
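
To make the CPU pool rule concrete, here is a minimal sketch that parses a simple cpu list string (comma-separated CPU numbers and inclusive ranges such as "2-3,6-7") and rejects pools containing CPU 0 or its siblings. It handles only this subset of the kernel cpu list syntax. The function names are hypothetical, and the sibling set is passed in by the caller as an assumption rather than read from the system.

```python
def parse_cpu_list(text):
    """Parse a subset of the cpu list format: "N" and "N-M" items separated by commas."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = (int(x) for x in part.split("-", 1))
            if first > last:
                raise ValueError("descending range: " + part)
            cpus.update(range(first, last + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def validate_ne_pool(text, cpu0_siblings=frozenset({0})):
    """Return the parsed pool, or raise if it overlaps CPU 0 and its siblings.

    `cpu0_siblings` is an assumed input; discovering the real sibling
    topology is outside the scope of this sketch.
    """
    pool = parse_cpu_list(text)
    clash = sorted(set(pool) & set(cpu0_siblings))
    if clash:
        raise ValueError("CPUs reserved for the primary VM: %s" % clash)
    return pool
```

For example, on a hypothetical machine where CPU 4 is the sibling of CPU 0, `validate_ne_pool("2-3,6-7", {0, 4})` accepts the pool, while `"0-1"` is rejected.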
>>>
>>> An enclave communicates with the primary VM via a local communication channel,
>>> using virtio-vsock [5]. The primary VM has a virtio-pci vsock emulated device,
>>> while the enclave VM has a virtio-mmio vsock emulated device. The vsock device
>>> uses eventfd for signaling. The enclave VM sees the usual interfaces - local
>>> APIC and IOAPIC - to get interrupts from virtio-vsock device. The virtio-mmio
>>> device is placed in memory below the typical 4 GiB.
>>>
>>> The application that runs in the enclave needs to be packaged in an enclave
>>> image together with the OS ( e.g. kernel, ramdisk, init ) that will run in the
>>> enclave VM. The enclave VM has its own kernel and follows the standard Linux
>>> boot protocol.
>>>
>>> The kernel bzImage, the kernel command line, the ramdisk(s) are part of the
>>> Enclave Image Format (EIF); plus an EIF header including metadata such as magic
>>> number, eif version, image size and CRC.
>>>
>>> Hash values are computed for the entire enclave image (EIF), the kernel and
>>> ramdisk(s). That's used, for example, to check that the enclave image that is
>>> loaded in the enclave VM is the one that was intended to be run.
>>>
>>> These crypto measurements are included in a signed attestation document
>>> generated by the Nitro Hypervisor and further used to prove the identity of the
>>> enclave; KMS is an example of a service that NE is integrated with and that checks
>>> the attestation doc.
>>>
>>> The enclave image (EIF) is loaded in the enclave memory at offset 8 MiB. The
>>> init process in the enclave connects to the vsock CID of the primary VM and a
>>> predefined port - 9000 - to send a heartbeat value - 0xb7. This mechanism is
>>> used to check in the primary VM that the enclave has booted.
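
The heartbeat check described above could look roughly like this on the primary VM side. This is a sketch under assumptions, not the code from the series: it assumes the heartbeat is a single byte, the function names and timeout are hypothetical, and only port 9000 and the value 0xb7 come from the text. It relies on Python's `socket.AF_VSOCK` support, which is available on Linux.

```python
import socket

HEARTBEAT_PORT = 9000    # predefined port from the cover letter
HEARTBEAT_VALUE = 0xB7   # heartbeat value from the cover letter


def heartbeat_ok(conn):
    """Read one byte from a connected socket and compare it to the heartbeat.

    Assumption of this sketch: the heartbeat is sent as a single byte.
    """
    data = conn.recv(1)
    return len(data) == 1 and data[0] == HEARTBEAT_VALUE


def wait_for_enclave_boot(timeout_s=30.0):
    """Listen on the vsock heartbeat port; True if a valid heartbeat arrives in time."""
    with socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM) as srv:
        srv.settimeout(timeout_s)
        srv.bind((socket.VMADDR_CID_ANY, HEARTBEAT_PORT))
        srv.listen(1)
        try:
            conn, _peer = srv.accept()
        except socket.timeout:
            return False
        with conn:
            conn.settimeout(timeout_s)
            try:
                return heartbeat_ok(conn)
            except socket.timeout:
                return False
```

Keeping the byte comparison in its own function means it can be exercised with any connected stream socket, without a vsock device.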
>>>
>>> If the enclave VM crashes or gracefully exits, an interrupt event is received by
>>> the NE driver. This event is sent further to the user space enclave process
>>> running in the primary VM via a poll notification mechanism. Then the user space
>>> enclave process can exit.
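
A user space process waiting for such a notification might use a plain poll loop along these lines. This is only a sketch: the function name is hypothetical, `fd` stands in for whatever file descriptor the enclave process holds, and the cover letter does not say which event bits the NE driver raises, so the sketch registers for several and returns the raw mask.

```python
import os
import select


def wait_for_event(fd, timeout_ms=None):
    """Block until `fd` reports an event; return the event mask, or 0 on timeout.

    Which bits signal an enclave exit is an assumption left to the caller.
    """
    poller = select.poll()
    poller.register(fd, select.POLLIN | select.POLLHUP | select.POLLERR)
    events = poller.poll(timeout_ms)
    return events[0][1] if events else 0
```

With `timeout_ms=None` the call blocks until an event arrives, which matches a process that has nothing else to do until its enclave goes away.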
>>>
>>> Thank you.
>>>
>> This version reads very well, thanks a lot Andra!
>>
>> Greg, would you mind having another look at it?
> Will do, it's in my to-review queue, behind lots of other patches...
>
I have a set of updates that could go into a new revision, v8: new NE
custom error codes for invalid flags and an invalid enclave CID, a
"shutdown" function for the NE PCI device driver, a couple more checks for
invalid flags and the enclave vsock CID, and documentation and sample
updates. Alternatively, these updates could be sent as follow-up patches.
Greg, let me know which option works better for you with regard to the
review of the patch series.
Thanks,
Andra