Date: Mon, 27 May 2024 09:54:17 +0200
From: Alexander Graf <graf@...zon.com>
To: Stefano Garzarella <sgarzare@...hat.com>, Alexander Graf <agraf@...raf.de>
CC: Dorjoy Chowdhury <dorjoychy111@...il.com>,
	<virtualization@...ts.linux.dev>, <kvm@...r.kernel.org>,
	<netdev@...r.kernel.org>, <stefanha@...hat.com>
Subject: Re: How to implement message forwarding from one CID to another in
 vhost driver


On 27.05.24 09:08, Alexander Graf wrote:
> Hey Stefano,
>
> On 23.05.24 10:45, Stefano Garzarella wrote:
>> On Tue, May 21, 2024 at 08:50:22AM GMT, Alexander Graf wrote:
>>> Howdy,
>>>
>>> On 20.05.24 14:44, Dorjoy Chowdhury wrote:
>>>> Hey Stefano,
>>>>
>>>> Thanks for the reply.
>>>>
>>>>
>>>> On Mon, May 20, 2024, 2:55 PM Stefano Garzarella <sgarzare@...hat.com> wrote:
>>>>> Hi Dorjoy,
>>>>>
>>>>> On Sat, May 18, 2024 at 04:17:38PM GMT, Dorjoy Chowdhury wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Hope you are doing well. I am working on adding AWS Nitro Enclave[1]
>>>>>> emulation support in QEMU. Alexander Graf is mentoring me on this
>>>>>> work. A v1 patch series has already been posted to the qemu-devel
>>>>>> mailing list[2].
>>>>>>
>>>>>> AWS nitro enclaves is an Amazon EC2[3] feature that allows creating
>>>>>> isolated execution environments, called enclaves, from Amazon EC2
>>>>>> instances, which are used for processing highly sensitive data.
>>>>>> Enclaves have no persistent storage and no external networking. The
>>>>>> enclave VMs are based on Firecracker microvm and have a vhost-vsock
>>>>>> device for communication with the parent EC2 instance that spawned
>>>>>> it and a Nitro Secure Module (NSM) device for cryptographic
>>>>>> attestation. The parent instance VM always has CID 3 while the
>>>>>> enclave VM gets a dynamic CID. The enclave VMs can communicate with
>>>>>> the parent instance over various ports to CID 3, for example, the
>>>>>> init process inside an enclave sends a heartbeat to port 9000 upon
>>>>>> boot, expecting a heartbeat reply, letting the parent instance know
>>>>>> that the enclave VM has successfully booted.
>>>>>>
>>>>>> The plan is to eventually make the nitro enclave emulation in QEMU
>>>>>> standalone i.e., without needing to run another VM with CID 3 with
>>>>>> proper vsock
>>>>> If you don't have to launch another VM, maybe we can avoid
>>>>> vhost-vsock and emulate virtio-vsock in user-space, having complete
>>>>> control over the behavior.
>>>>>
>>>>> So we could use this opportunity to implement virtio-vsock in QEMU
>>>>> [4] or use vhost-user-vsock [5] and customize it somehow.
>>>>> (Note: vhost-user-vsock already supports sibling communication, so
>>>>> maybe with a few modifications it fits your case perfectly)
>>>>>
>>>>> [4] https://gitlab.com/qemu-project/qemu/-/issues/2095
>>>>> [5] https://github.com/rust-vmm/vhost-device/tree/main/vhost-device-vsock
>>>>
>>>>
>>>> Thanks for letting me know. Right now I don't have a complete picture
>>>> but I will look into them. Thank you.
>>>>>
>>>>>
>>>>>> communication support. For this to work, one approach could be to
>>>>>> teach the vhost driver in kernel to forward CID 3 messages to
>>>>>> another CID N
>>>>> So in this case both CID 3 and N would be assigned to the same QEMU
>>>>> process?
>>>>
>>>>
>>>> CID N is assigned to the enclave VM. CID 3 was supposed to be the
>>>> parent VM that spawns the enclave VM (this is how it is in AWS, where
>>>> an EC2 instance VM spawns the enclave VM from inside it and that
>>>> parent EC2 instance always has CID 3). But in the QEMU case as we
>>>> don't want a parent VM (we want to run enclave VMs standalone) we
>>>> would need to forward the CID 3 messages to host CID. I don't know if
>>>> it means CID 3 and CID N are assigned to the same QEMU process. Sorry.
>>>
>>>
>>> There are 2 use cases here:
>>>
>>> 1) Enclave wants to treat host as parent (default). In this scenario,
>>> the "parent instance" that shows up as CID 3 in the Enclave doesn't
>>> really exist. Instead, when the Enclave attempts to talk to CID 3, it
>>> should really land on CID 0 (hypervisor). When the hypervisor tries to
>>> connect to the Enclave on port X, it should look as if it originates
>>> from CID 3, not CID 0.
>>>
>>> 2) Multiple parent VMs. Think of an actual cloud hosting scenario.
>>> Here, we have multiple "parent instances". Each of them thinks it's
>>> CID 3. Each can spawn an Enclave that talks to CID 3 and reach the
>>> parent. For this case, I think implementing all of virtio-vsock in
>>> user space is the best path forward. But in theory, you could also
>>> swizzle CIDs to make random "real" CIDs appear as CID 3.
>>>
>>
>> Thank you for clarifying the use cases!
>>
>> Also for case 1, vhost-vsock doesn't support CID 0, so in my opinion
>> it's easier to go into user-space with vhost-user-vsock or the built-in
>> device.
>
>
> Sorry, I believe I meant CID 2. Effectively for case 1, when a process 
> on the hypervisor listens on port 1234, it should be visible as 3:1234 
> from the VM and when the hypervisor process connects to <VM CID>:1234, 
> it should look as if that connection came from CID 3.


Now that I'm thinking about my message again: What if we just introduce 
a sysfs/sysctl file for vsock that indicates the "host CID" (default: 
2)? Users that want vhost-vsock to behave as if the host is CID 3 can 
just write 3 to it.

It means we'd need to change all references to VMADDR_CID_HOST to 
instead refer to a global variable that indicates the new "host CID". 
It'd need some more careful massaging to not break number namespace 
assumptions (<= CID_HOST no longer works), but the idea should fly.
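
A rough sketch of what that could look like, in Python rather than kernel 
C just to keep it short. The VsockCids class and its method names are made 
up for illustration; only the VMADDR_CID_* values come from 
<linux/vm_sockets.h>. It also assumes that CIDs 0..2 stay reserved even 
when the host CID moves, which is an open question:

```python
# Well-known CIDs as defined in <linux/vm_sockets.h>.
VMADDR_CID_HYPERVISOR = 0
VMADDR_CID_LOCAL = 1
VMADDR_CID_HOST = 2          # today's fixed host CID
VMADDR_CID_ANY = 0xFFFFFFFF  # -1U, wildcard


class VsockCids:
    """Hypothetical model of a configurable host CID (not kernel code)."""

    def __init__(self, host_cid=VMADDR_CID_HOST):
        self.host_cid = host_cid

    def is_host(self, cid):
        # Explicit equality replaces any comparison against the constant.
        return cid == self.host_cid

    def guest_cid_allowed(self, cid):
        # A range check like `cid <= VMADDR_CID_HOST` only rejects 0..2.
        # With host_cid = 3 it would wrongly accept 3 as a guest CID,
        # so the configured host CID has to be excluded explicitly.
        # Assumption: 0..2 remain reserved regardless of host_cid.
        if cid <= VMADDR_CID_HOST or cid >= VMADDR_CID_ANY:
            return False
        return cid != self.host_cid
```

With VsockCids(host_cid=3), is_host(3) is true and guest_cid_allowed(3) is 
false, while the default VsockCids() still lets a guest take CID 3.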

That would give us all 3 options:

1) User sets vsock.host_cid = 3 to simulate that the host is in reality 
an enclave parent
2) User spawns VM with CID = 3 to run parent payload inside
3) User spawns parent and enclave VMs with vhost-user-vsock, which 
creates its own CID namespace


Stefano, WDYT?


Alex




Amazon Web Services Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
Sitz: Berlin
Ust-ID: DE 365 538 597
