Message-ID: <eae10132-a02d-b341-bdf5-be1b2096be4c@redhat.com>
Date: Fri, 25 Mar 2022 14:59:52 +0800
From: Gavin Shan <gshan@...hat.com>
To: Oliver Upton <oupton@...gle.com>
Cc: kvmarm@...ts.cs.columbia.edu, maz@...nel.org,
linux-kernel@...r.kernel.org, eauger@...hat.com,
shan.gavin@...il.com, Jonathan.Cameron@...wei.com,
pbonzini@...hat.com, vkuznets@...hat.com, will@...nel.org
Subject: Re: [PATCH v5 18/22] KVM: arm64: Support SDEI ioctl commands on VM
Hi Oliver,
On 3/24/22 1:28 AM, Oliver Upton wrote:
> On Tue, Mar 22, 2022 at 04:07:06PM +0800, Gavin Shan wrote:
>> This supports ioctl commands on VM to manage the various objects.
>> It's primarily used by VMM to accomplish migration. The ioctl
>> commands introduced by this are highlighted as below:
>>
>> * KVM_SDEI_CMD_GET_VERSION
>> Retrieve the version of current implementation. It's different
>> from the version of the followed SDEI specification. This version
>> is used to indicates what functionalities documented in the SDEI
>> specification have been supported or not supported.
>
> Don't we need a way to set the version as well? KVM is very much
> responsible for upholding ABI of older specs. So, if a VMM and guest
> expect SDEI v1.1, we can't just forcibly raise it to something else
> during a migration.
>
> The PSCI implementation is a great example of how KVM has grown its
> implementation in line with a specification, all the while preserving
> backwards compatibility.
>
The only information fed by the VMM is the set of exposed events. An
event can't be registered by the guest kernel, or raised from the host
to the guest kernel, until it has been exposed by the VMM. Besides, the
exposed events will be defined statically in host/KVM, as we discussed
on PATCH[02/22]. We also discussed eliminating those ioctl commands, so
I don't think we need to add KVM_SDEI_CMD_SET_VERSION. Furthermore, if
the migration can be done through the firmware pseudo system registers,
the version is only a concern to the host itself, since migration
compatibility is the only concern for the VMM (QEMU).

Yes. Currently, PSCI versions 0.1, 0.2 and 1.0 are supported. 0.1 is
picked unless the VMM explicitly asks for 0.2 or 1.0. However, it seems
QEMU isn't using PSCI 1.0 yet, and more patches may be needed to enable
it.
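
To make the PSCI comparison concrete, here is a minimal sketch of that
kind of version selection. It is a standalone model under my own
assumptions, not the KVM code: every sketch_ name is made up for
illustration, and only the packing (major in the upper 16 bits, minor
in the lower 16 bits) follows the PSCI version encoding.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this is a model, not kernel code. */
#define SKETCH_VERSION(maj, min) (((uint32_t)(maj) << 16) | (uint32_t)(min))

/* Versions the model accepts, mirroring the 0.1/0.2/1.0 set above. */
static const uint32_t sketch_supported[] = {
	SKETCH_VERSION(0, 1),
	SKETCH_VERSION(0, 2),
	SKETCH_VERSION(1, 0),
};

/* Version a new VM starts with when the VMM asks for nothing. */
uint32_t sketch_default_version(void)
{
	return SKETCH_VERSION(0, 1);
}

/*
 * Apply a version requested by the VMM. Returns 0 and updates *current
 * when the version is supported. Returns -1 and leaves *current alone
 * otherwise, so the version is never changed behind the guest's back.
 */
int sketch_set_version(uint32_t *current, uint32_t requested)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_supported) / sizeof(sketch_supported[0]); i++) {
		if (sketch_supported[i] == requested) {
			*current = requested;
			return 0;
		}
	}

	return -1;
}
```

In this model the destination of a migration would call
sketch_set_version() with the value saved on the source and fail the
restore if it is rejected, rather than falling back to another version.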
>> * KVM_SDEI_CMD_GET_EXPOSED_EVENT_COUNT
>> Return the total count of exposed events.
>>
>> * KVM_SDEI_CMD_GET_EXPOSED_EVENT
>> * KVM_SDEI_CMD_SET_EXPOSED_EVENT
>> Get or set exposed event
>>
>> * KVM_SDEI_CMD_GET_REGISTERED_EVENT_COUNT
>> Return the total count of registered events.
>>
>> * KVM_SDEI_CMD_GET_REGISTERED_EVENT
>> * KVM_SDEI_CMD_SET_REGISTERED_EVENT
>> Get or set registered event.
>
> Any new UAPI needs to be documented in Documentation/virt/kvm/api.rst
>
> Additionally, we desperately need a better, generic way to save/restore
> VM scoped state. IMO, we should only be adding ioctls if we are
> affording userspace a meaningful interface. Every save/restore pair of
> ioctls winds up wasting precious ioctl numbers and requires userspace
> take a change to read/write an otherwise opaque value.
>
> Marc had made some suggestions in this area already that Raghavendra
> experimented with [1], and I think its time to meaningfully consider
> our options. Basically, KVM_GET_REG_LIST needs to convey whether a
> particular register is VM or vCPU state. We only need to save/restore a
> VM state register once. That way, userspace doesn't have to care about
> the underlying data and the next piece of VM state that comes along
> doesn't require an ioctl nr nor VMM participation.
>
> [1]: http://lore.kernel.org/r/20220224172559.4170192-3-rananta@google.com
>
Thanks for the pointer to Raghavendra's series. In that series, the
firmware pseudo system registers are classified as VM-scoped or
vCPU-scoped. I think it fits the SDEI migration requirements very well.
Even the shared events can be migrated through the VM-scoped firmware
pseudo system registers. However, I don't plan to support that in the
next revision (v6), as the events currently needed are all private. I
may spend more time going through Raghavendra's series later.
Thanks,
Gavin