Message-ID: <29ebdc29-2930-51af-8a54-279c1e449a48@amazon.de>
Date: Tue, 26 May 2020 15:44:30 +0200
From: Alexander Graf <graf@...zon.de>
To: Greg KH <gregkh@...uxfoundation.org>
CC: Andra Paraschiv <andraprs@...zon.com>,
<linux-kernel@...r.kernel.org>,
Anthony Liguori <aliguori@...zon.com>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Colm MacCarthaigh <colmmacc@...zon.com>,
"Bjoern Doebel" <doebel@...zon.de>,
David Woodhouse <dwmw@...zon.co.uk>,
"Frank van der Linden" <fllinden@...zon.com>,
Martin Pohlack <mpohlack@...zon.de>,
"Matt Wilson" <msw@...zon.com>,
Paolo Bonzini <pbonzini@...hat.com>,
Balbir Singh <sblbir@...zon.com>,
Stefano Garzarella <sgarzare@...hat.com>,
"Stefan Hajnoczi" <stefanha@...hat.com>,
Stewart Smith <trawets@...zon.com>,
"Uwe Dannowski" <uwed@...zon.de>, <kvm@...r.kernel.org>,
<ne-devel-upstream@...zon.com>
Subject: Re: [PATCH v3 07/18] nitro_enclaves: Init misc device providing the
ioctl interface
On 26.05.20 15:17, Greg KH wrote:
>
> On Tue, May 26, 2020 at 02:44:18PM +0200, Alexander Graf wrote:
>>
>>
>> On 26.05.20 14:33, Greg KH wrote:
>>>
>>> On Tue, May 26, 2020 at 01:42:41PM +0200, Alexander Graf wrote:
>>>>
>>>>
>>>> On 26.05.20 08:51, Greg KH wrote:
>>>>>
>>>>> On Tue, May 26, 2020 at 01:13:23AM +0300, Andra Paraschiv wrote:
>>>>>> +#define NE "nitro_enclaves: "
>>>>>
>>>>> Again, no need for this.
>>>>>
>>>>>> +#define NE_DEV_NAME "nitro_enclaves"
>>>>>
>>>>> KBUILD_MODNAME?
>>>>>
>>>>>> +#define NE_IMAGE_LOAD_OFFSET (8 * 1024UL * 1024UL)
>>>>>> +
>>>>>> +static char *ne_cpus;
>>>>>> +module_param(ne_cpus, charp, 0644);
>>>>>> +MODULE_PARM_DESC(ne_cpus, "<cpu-list> - CPU pool used for Nitro Enclaves");
>>>>>
>>>>> Again, please do not do this.
>>>>
>>>> I actually asked her to put this one in specifically.
>>>>
>>>> The concept of this parameter is very similar to isolcpus= and maxcpus= in
>>>> that it takes CPUs away from Linux and instead donates them to the
>>>> underlying hypervisor, so that it can spawn enclaves using them.
>>>>
>>>> From an admin's point of view, this is a setting I would like to keep
>>>> persisted across reboots. How would this work with sysfs?
>>>
>>> How about just as the "initial" ioctl command to set things up? Don't
>>> grab any cpu pools until asked to. Otherwise, what happens when you
>>> load this module on a system that can't support it?
>>
>> That would give any user with access to the enclave device the ability to
>> remove CPUs from the system. That's clearly a CAP_SYS_ADMIN task in my book.
>
> Ok, what's wrong with that?
Would you want random users to get the ability to hot unplug CPUs from
your system? At unlimited quantity? I don't :).
>
>> Hence this whole split: The admin defines the CPU Pool, users can safely
>> consume this pool to spawn enclaves from it.
>
> But having the admin define that at module load / boot time, is a major
> pain. What tools do they have that allow them to do that easily?
The normal toolbox: editing /etc/default/grub, adding an
/etc/modprobe.d/ file.
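(For completeness, a persisted setup could look like the fragment below.
The option name matches the patch; the CPU range itself is just an
example, and the grub line assumes a built-in driver:)

```conf
# /etc/modprobe.d/nitro_enclaves.conf -- module loaded at runtime
options nitro_enclaves ne_cpus=2-7

# or, if the driver is built in, via /etc/default/grub:
GRUB_CMDLINE_LINUX="... nitro_enclaves.ne_cpus=2-7"
```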
When, if not at module load / boot time, would you define it? I really
don't want to have a device node that in theory "the world" can use and
which then allows any user on the system to hot unplug every CPU but 0
from my system.
>
>> So I really don't think an ioctl would be a great user experience. Same for
>> a sysfs file - although that's probably slightly better than the ioctl.
>
> You already are using ioctls to control this thing, right? What's wrong
> with "one more"? :)
So what we *could* do is add an ioctl to set the pool size which then
does a CAP_SYS_ADMIN check. That however puts you in dependency hell: a
user that wants to spawn an enclave as part of an nginx service would
need to create another service to set the pool size and declare the
dependency in systemd unit files.
Is that really better than a module parameter?
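Concretely, the wiring would have to look something like the units below
(unit name and the ne-set-pool helper are invented for illustration, not
part of the patch series):

```ini
# /etc/systemd/system/ne-pool.service
# Privileged one-shot service that sets the CPU pool via the ioctl.
[Unit]
Description=Set up Nitro Enclaves CPU pool

[Service]
Type=oneshot
RemainAfterExit=yes
# hypothetical helper wrapping the pool-setup ioctl
ExecStart=/usr/local/bin/ne-set-pool 2-7

# /etc/systemd/system/nginx.service.d/ne-pool.conf (drop-in)
# Every enclave-spawning service must now declare the dependency.
[Unit]
Requires=ne-pool.service
After=ne-pool.service
```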
>
>> Other options I can think of:
>>
>> * sysctl (for modules?)
>
> Ick.
>
>> * module parameter (as implemented here)
>
> Ick.
>
>> * proc file (deprecated FWIW)
>
> Ick.
>
>> The key is the tenant split: Admin sets the pool up, user consumes. This
>> setup should happen (early) on boot, so that system services can spawn
>> enclaves.
>
> But it takes more than just this initial "split up" to set the pool up,
I don't quite follow. The initial "split up" is all it takes. From the
hypervisor's point of view, the physical underlying cores will not be
used to schedule the parent as soon as an enclave is running on them.
Which CPUs are available for enclaves, however, is purely a parent OS
construct, hence the module parameter.
> right? Why not make this part of that initial process? What makes this
> so special you have to do this at module load time only?
What is the "initial process"? It's really 2 stages: one stage to
create a pool (CAP_SYS_ADMIN) which makes sure that some cores become
invisible to the Linux scheduler, and one stage to spawn an enclave
(normal user permission) on that pool's CPUs.
Alex
Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879