Message-ID: <384b2622-8d7f-ce02-1452-84a86e3a5697@linux.ibm.com>
Date: Mon, 24 Oct 2022 11:09:22 +0200
From: Christian Borntraeger <borntraeger@...ux.ibm.com>
To: Emanuele Giuseppe Esposito <eesposit@...hat.com>,
kvm@...r.kernel.org
Cc: Paolo Bonzini <pbonzini@...hat.com>,
Jonathan Corbet <corbet@....net>,
Maxim Levitsky <mlevitsk@...hat.com>,
Sean Christopherson <seanjc@...gle.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>,
David Hildenbrand <david@...hat.com>, x86@...nel.org,
"H. Peter Anvin" <hpa@...or.com>, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/4] KVM: API to block and resume all running vcpus in a vm
On 24.10.22 at 10:33, Emanuele Giuseppe Esposito wrote:
>
>
> On 24/10/2022 at 09:56, Christian Borntraeger wrote:
>> On 22.10.22 at 17:48, Emanuele Giuseppe Esposito wrote:
>>> This new API allows userspace to stop all running
>>> vcpus using the KVM_KICK_ALL_RUNNING_VCPUS ioctl, and resume them with
>>> KVM_RESUME_ALL_KICKED_VCPUS.
>>> A "running" vcpu is a vcpu that is executing the KVM_RUN ioctl.
>>>
>>> This series is especially helpful to userspace hypervisors like
>>> QEMU when they need to perform operations on memslots without the
>>> risk of having a vcpu reading them in the meantime.
>>> By "memslot operations" we mean growing, shrinking, merging and
>>> splitting memslots, which are not "atomic" because there is a time
>>> window between the DELETE memslot operation and the CREATE one.
>>> Currently, each memslot operation is performed with one or more
>>> ioctls.
>>> For example, merging two memslots into one would imply:
>>> DELETE(m1)
>>> DELETE(m2)
>>> CREATE(m1+m2)
>>>
>>> And a vcpu could attempt to read m2 right after it is deleted, but
>>> before the new one is created.
>>>
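To make the window described above concrete, here is a minimal toy
model in C. It is only a sketch: struct slot, struct slot_table and the
slot_*() helpers are hypothetical names invented for illustration and
are not KVM's data structures or ioctls. It replays DELETE(m1),
DELETE(m2), CREATE(m1+m2) and checks whether an address inside m2 is
unbacked in between.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: hypothetical types and helpers, not KVM's. */
#define MAX_SLOTS 8

struct slot {
    bool     used;
    uint64_t gpa;   /* guest physical start address */
    uint64_t size;  /* length in bytes */
};

struct slot_table {
    struct slot slots[MAX_SLOTS];
};

/* CREATE: install a slot at index idx. Returns 0 on success, -1 on error. */
static int slot_create(struct slot_table *t, size_t idx,
                       uint64_t gpa, uint64_t size)
{
    if (idx >= MAX_SLOTS || t->slots[idx].used || size == 0)
        return -1;
    t->slots[idx] = (struct slot){ .used = true, .gpa = gpa, .size = size };
    return 0;
}

/* DELETE: remove the slot at index idx. Returns 0 on success, -1 on error. */
static int slot_delete(struct slot_table *t, size_t idx)
{
    if (idx >= MAX_SLOTS || !t->slots[idx].used)
        return -1;
    t->slots[idx].used = false;
    return 0;
}

/* What a vcpu access depends on: is gpa backed by any slot right now? */
static bool gpa_is_mapped(const struct slot_table *t, uint64_t gpa)
{
    for (size_t i = 0; i < MAX_SLOTS; i++) {
        const struct slot *s = &t->slots[i];
        if (s->used && gpa >= s->gpa && gpa - s->gpa < s->size)
            return true;
    }
    return false;
}

/*
 * Replays DELETE(m1), DELETE(m2), CREATE(m1+m2). Returns true if an
 * address inside m2 was unmapped between the deletes and the create,
 * and mapped again once the merged slot exists.
 */
static bool merge_exposes_gap(void)
{
    struct slot_table t = {0};
    const uint64_t probe = 0x1800;            /* lies inside m2 */
    int rc;

    rc = slot_create(&t, 0, 0x0000, 0x1000);  /* m1 */
    assert(rc == 0);
    rc = slot_create(&t, 1, 0x1000, 0x1000);  /* m2 */
    assert(rc == 0);

    rc = slot_delete(&t, 0);                  /* DELETE(m1) */
    assert(rc == 0);
    rc = slot_delete(&t, 1);                  /* DELETE(m2) */
    assert(rc == 0);

    /* A vcpu touching probe at this point finds no backing slot. */
    bool gap = !gpa_is_mapped(&t, probe);

    rc = slot_create(&t, 0, 0x0000, 0x2000);  /* CREATE(m1+m2) */
    assert(rc == 0);
    (void)rc;

    return gap && gpa_is_mapped(&t, probe);
}
```

In this model merge_exposes_gap() reports the gap because nothing stops
a lookup between the last delete and the create. It is single-threaded
and only shows the ordering problem, not the actual race in the kernel.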
>>> Therefore the simplest solution is to pause all vcpus on the KVM
>>> side, so that:
>>> - userspace just needs to call the new API before making memslot
>>> changes, keeping modifications to a minimum
>>> - dirty page updates are also performed when vcpus are blocked, so
>>> there is no time window between the dirty page ioctl and memslots
>>> modifications, since vcpus are all stopped.
>>> - no need to modify the existing memslots API
>> Isn't QEMU able to achieve the same goal today by forcing all vCPUs
>> into userspace with a signal? Can you provide some rationale why this
>> is better in the cover letter or patch description?
>>
> David Hildenbrand tried to propose something similar here:
> https://github.com/davidhildenbrand/qemu/commit/86b1bf546a8d00908e33f7362b0b61e2be8dbb7a
>
> While it is not optimized, I think it's more complex than the current
> series, since QEMU would also have to make sure all running ioctls
> finish and prevent new ones from being executed.
>
> Also we can't use pause_all_vcpus()/resume_all_vcpus() because they drop
> the BQL.
>
> Would that be ok as rationale?
Yes, that helps, and it should be part of the cover letter for the next iterations.