Message-ID: <3dc66987-49c7-abda-eb70-1898181ef3fe@redhat.com>
Date: Sat, 23 Sep 2023 11:24:44 +0200
From: Paolo Bonzini <pbonzini@...hat.com>
To: David Woodhouse <dwmw2@...radead.org>
Cc: kvm@...r.kernel.org, Peter Zijlstra <peterz@...radead.org>,
Sean Christopherson <seanjc@...gle.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
"H. Peter Anvin" <hpa@...or.com>, linux-kernel@...r.kernel.org,
graf@...zon.de, Nicolas Saenz Julienne <nsaenz@...zon.es>,
"Griffoul, Fred" <fgriffo@...zon.com>
Subject: Re: [RFC] KVM: x86: Allow userspace exit on HLT and MWAIT, else yield on MWAIT
On 9/23/23 09:22, David Woodhouse wrote:
> On Fri, 2023-09-22 at 14:00 +0200, Paolo Bonzini wrote:
>> To avoid races you need two flags though; there needs to be also a
>> kernel->userspace communication of whether the vCPU is currently in
>> HLT or MWAIT, using the "flags" field for example. If it was HLT only,
>> moving the mp_state in kvm_run would seem like a good idea; but not if
>> MWAIT or PAUSE are also included.
>
> Right. When work is added to an empty workqueue, the VMM will want to
> hunt for a vCPU which is currently idle and then signal it to exit.
>
> As you say, for HLT it's simple enough to look at the mp_state, and we
> can move that into kvm_run so it doesn't need an ioctl...
Looking at it again: not so easy, because mp_state is changed in the
vCPU thread by vcpu_block() itself.
> although it
> would also be nice to get an *event* on an eventfd when the vCPU
> becomes runnable (as noted, we want that for VSM anyway). Or perhaps
> even to be able to poll() on the vCPU fd.
Why do you need it? You can just use KVM_RUN to go to sleep, and if you
get another job you kick out the vCPU with pthread_kill. (I also didn't
get the VSM reference).
An interesting quirk is that kvm_run->immediate_exit is processed before
kvm_vcpu_block(), but TIF_SIGPENDING is processed afterwards. This
means that you can force an mp_state update with pthread_kill + KVM_RUN.
It's not going to be a speed demon, but it's worth writing a selftest
for it.
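A minimal userspace sketch of the pthread_kill kick mentioned above. It is
not KVM code and not the proposed selftest: a blocking read() on an empty
pipe stands in for ioctl(vcpu_fd, KVM_RUN), and the names fake_vcpu,
fake_vcpu_thread and kick_blocked_thread are hypothetical. It only shows
that a signal whose handler is installed without SA_RESTART makes the
blocked call return with EINTR, which is how a VMM gets control back from
a sleeping vCPU thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical stand-in for a vCPU: read() plays the role of KVM_RUN. */
struct fake_vcpu {
    int pipe_rd;       /* read end of a pipe that is never written to */
    int saved_errno;   /* errno observed when the blocking call returned */
    atomic_int done;   /* set once the "vCPU" has left the blocking call */
};

/* Empty handler: its only job is to interrupt the blocking syscall. */
static void kick_handler(int sig) { (void)sig; }

static void *fake_vcpu_thread(void *arg)
{
    struct fake_vcpu *v = arg;
    char c;

    /* A real VMM would sit in ioctl(vcpu_fd, KVM_RUN, 0) here. */
    ssize_t n = read(v->pipe_rd, &c, 1);
    v->saved_errno = (n < 0) ? errno : 0;
    atomic_store(&v->done, 1);
    return NULL;
}

/* Returns the errno seen by the kicked thread, or -1 on setup failure. */
int kick_blocked_thread(void)
{
    struct fake_vcpu v = { .pipe_rd = -1, .saved_errno = 0, .done = 0 };
    struct timespec ts = { .tv_sec = 0, .tv_nsec = 1000000 }; /* 1 ms */
    struct sigaction sa;
    pthread_t tid;
    int fds[2];

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = kick_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;   /* no SA_RESTART, so read() fails with EINTR */
    if (sigaction(SIGUSR1, &sa, NULL) < 0 || pipe(fds) < 0)
        return -1;
    v.pipe_rd = fds[0];

    if (pthread_create(&tid, NULL, fake_vcpu_thread, &v) != 0)
        return -1;

    /*
     * A signal that lands before the thread enters read() would be lost,
     * so this sketch simply re-sends it until the thread reports it is out.
     */
    while (!atomic_load(&v.done)) {
        pthread_kill(tid, SIGUSR1);
        nanosleep(&ts, NULL);
    }

    pthread_join(tid, NULL);
    close(fds[0]);
    close(fds[1]);
    return v.saved_errno;
}
```

On Linux, kick_blocked_thread() should return EINTR. The re-send loop is a
shortcut for the sketch; a real VMM has to close the same lost-kick window
some other way, and how it does so is outside what this example covers.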
> But MWAIT (as currently not-really-emulated) and PAUSE are both just
> transient states with nothing you can really *wait* for, which is why
> they're such fun to deal with.
PAUSE is easier because it is just momentary and you stick it inside
what's already a busy wait. MWAIT is less fun because you don't really
want to busy wait.
Paolo