Message-ID: <bab464f5-a660-4122-886a-c348be3d95fa@oracle.com>
Date: Wed, 17 Jul 2024 14:29:22 -0400
From: Steven Sistare <steven.sistare@...cle.com>
To: Jason Wang <jasowang@...hat.com>
Cc: virtualization@...ts.linux-foundation.org, linux-kernel@...r.kernel.org,
"Michael S. Tsirkin" <mst@...hat.com>,
Si-Wei Liu <si-wei.liu@...cle.com>,
Eugenio Perez Martin <eperezma@...hat.com>,
Xuan Zhuo <xuanzhuo@...ux.alibaba.com>,
Dragos Tatulea <dtatulea@...dia.com>
Subject: Re: [PATCH V2 0/7] vdpa live update
On 7/16/2024 1:30 AM, Jason Wang wrote:
> On Mon, Jul 15, 2024 at 10:29 PM Steven Sistare
> <steven.sistare@...cle.com> wrote:
>>
>> On 7/14/2024 10:14 PM, Jason Wang wrote:
>>> On Fri, Jul 12, 2024 at 9:19 PM Steve Sistare <steven.sistare@...cle.com> wrote:
>>>>
>>>> Live update is a technique wherein an application saves its state, exec's
>>>> to an updated version of itself, and restores its state. Clients of the
>>>> application experience a brief suspension of service, on the order of
>>>> 100's of milliseconds, but are otherwise unaffected.
>>>>
>>>> Define and implement interfaces that allow vdpa devices to be preserved
>>>> across fork or exec, to support live update for applications such as QEMU.
>>>> The device must be suspended during the update, but its DMA mappings are
>>>> preserved, so the suspension is brief.
>>>>
>>>> The VHOST_NEW_OWNER ioctl transfers device ownership and pinned memory
>>>> accounting from one process to another.
>>>>
>>>> The VHOST_BACKEND_F_NEW_OWNER backend capability indicates that
>>>> VHOST_NEW_OWNER is supported.
>>>>
>>>> The VHOST_IOTLB_REMAP message type updates a DMA mapping with its userland
>>>> address in the new process.
>>>>
>>>> The VHOST_BACKEND_F_IOTLB_REMAP backend capability indicates that
>>>> VHOST_IOTLB_REMAP is supported and required. Some devices do not
>>>> require it, because the userland address of each DMA mapping is discarded
>>>> after being translated to a physical address.
>>>>
>>>> Here is a pseudo-code sequence for performing live update, based on
>>>> suspend + reset because resume is not yet widely available. The vdpa device
>>>> descriptor, fd, remains open across the exec.
>>>>
>>>> ioctl(fd, VHOST_VDPA_SUSPEND)
>>>> ioctl(fd, VHOST_VDPA_SET_STATUS, 0)
>>>
>>> I don't understand why we need a reset after suspend, it looks to me
>>> the previous suspend became meaningless.
>>
>> The suspend guarantees completion of in-progress DMA. At least, that is
>> my interpretation of why that is done for live migration in QEMU, which
>> also does suspend + reset + re-create. I am following the live migration
>> model.
>
> Yes, but any reason we need a reset after the suspension?
Probably not. I found it cleanest to call reset and let the new QEMU configure
the device as it always does during startup, rather than altering those code
paths to skip the kernel calls. So consider this just one of several possible
userland algorithms.
- Steve
>>>> exec
>>>>
>>>> ioctl(fd, VHOST_NEW_OWNER)
>>>>
>>>> issue ioctls to re-create vrings
>>>>
>>>> if VHOST_BACKEND_F_IOTLB_REMAP
>>>
>>> So the idea is for a device that is using a virtual address, it
>>> doesn't need VHOST_BACKEND_F_IOTLB_REMAP at all?
>>
>> Actually the reverse: if the device translates virtual to physical when
>> the mappings are created, and discards the virtual, then VHOST_IOTLB_REMAP
>> is not needed.
>
> Ok.
>
>>
>>>> foreach dma mapping
>>>> write(fd, {VHOST_IOTLB_REMAP, new_addr})
>>>>
>>>> ioctl(fd, VHOST_VDPA_SET_STATUS,
>>>> ACKNOWLEDGE | DRIVER | FEATURES_OK | DRIVER_OK)
>>>
>>> From API level, this seems to be asymmetric as we have suspending but
>>> not resuming?
>>
>> Again, I am just following the path taken by live migration.
>> I will be happy to use resume when the devices and QEMU support it.
>> The decision to use reset vs resume should not affect the definition
>> and use of VHOST_NEW_OWNER and VHOST_IOTLB_REMAP.
>>
>> - Steve
>
> Thanks
>
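
For reference, the ordering of the suspend + reset sequence quoted above can be
sketched as a toy model. This is only an illustration and makes no kernel
calls: VdpaModel, its methods, and live_update() are hypothetical names, and
the NEW_OWNER and IOTLB_REMAP steps merely mirror the interfaces proposed in
this series. The status bit values are the ones in linux/virtio_config.h.
Vring re-creation is omitted.

```python
# Toy model of the suspend + reset live-update ordering; no kernel calls are made.
from dataclasses import dataclass, field

# Virtio device status bits (values as in linux/virtio_config.h).
ACKNOWLEDGE = 0x01
DRIVER = 0x02
DRIVER_OK = 0x04
FEATURES_OK = 0x08


@dataclass
class VdpaModel:
    """Hypothetical stand-in for the state behind an open vdpa device fd."""
    owner: str
    status: int = ACKNOWLEDGE | DRIVER | FEATURES_OK | DRIVER_OK
    needs_remap: bool = True  # models VHOST_BACKEND_F_IOTLB_REMAP being set
    mappings: dict = field(default_factory=dict)  # iova -> userland address
    log: list = field(default_factory=list)       # order of operations seen

    def suspend(self):
        # Stands in for ioctl(fd, VHOST_VDPA_SUSPEND).
        self.log.append("SUSPEND")

    def set_status(self, status):
        # Stands in for ioctl(fd, VHOST_VDPA_SET_STATUS, status); 0 means reset.
        self.status = status
        self.log.append(f"SET_STATUS {status:#x}")

    def new_owner(self, owner):
        # Stands in for the proposed ioctl(fd, VHOST_NEW_OWNER).
        self.owner = owner
        self.log.append("NEW_OWNER")

    def remap(self, iova, new_addr):
        # Stands in for the proposed VHOST_IOTLB_REMAP message.
        if iova not in self.mappings:
            raise KeyError(f"no mapping at iova {iova:#x}")
        self.mappings[iova] = new_addr
        self.log.append("IOTLB_REMAP")


def live_update(dev, new_owner, new_addrs):
    """Apply the quoted sequence; the exec happens between the two halves."""
    # Old process: let in-progress DMA complete, then reset the device.
    dev.suspend()
    dev.set_status(0)
    # New process: take ownership, then remap only if the backend requires it.
    dev.new_owner(new_owner)
    if dev.needs_remap:
        for iova, addr in new_addrs.items():
            dev.remap(iova, addr)
    dev.set_status(ACKNOWLEDGE | DRIVER | FEATURES_OK | DRIVER_OK)


if __name__ == "__main__":
    dev = VdpaModel(owner="old-qemu", mappings={0x1000: 0x7F0000000000})
    live_update(dev, "new-qemu", {0x1000: 0x7F1000000000})
    print(dev.log)
```

Replacing the set_status(0) call with a resume step would change only the two
halves around the exec; the new_owner() and remap() steps stay where they are.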