Message-ID: <87fsw82frw.wl-maz@kernel.org>
Date: Wed, 21 Jul 2021 10:10:59 +0100
From: Marc Zyngier <maz@...nel.org>
To: Sergey Senozhatsky <senozhatsky@...omium.org>
Cc: Will Deacon <will@...nel.org>,
Suleiman Souhlal <suleiman@...gle.com>,
Joel Fernandes <joelaf@...gle.com>,
linux-arm-kernel@...ts.infradead.org, kvmarm@...ts.cs.columbia.edu,
linux-kernel@...r.kernel.org,
virtualization@...ts.linux-foundation.org
Subject: Re: [PATCHv2 4/4] arm64: add host pv-vcpu-state support
On Wed, 21 Jul 2021 02:15:47 +0100,
Sergey Senozhatsky <senozhatsky@...omium.org> wrote:
>
> On (21/07/12 17:24), Marc Zyngier wrote:
> > >
> > > void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
> > > {
> > > + kvm_update_vcpu_preempted(vcpu, true);
> >
> > This doesn't look right. With this, you are now telling the guest that
> > a vcpu that is blocked on WFI is preempted. This really isn't the
> > case, as it has voluntarily entered a low-power mode while waiting for
> > an interrupt. Indeed, the vcpu isn't running. A physical CPU wouldn't
> > be running either.
>
> I suppose you are talking about kvm_vcpu_block().
kvm_vcpu_block() is how things are implemented. WFI is the instruction
I'm concerned about.
> Well, it checks kvm_vcpu_check_block() but then it simply schedule()
> out the vcpu process, which does look like "the vcpu is
> preempted". Once we sched_in() that vcpu process again we mark it as
> non-preempted, even though it remains in kvm wfx handler. Why isn't
> it right?
Because the vcpu hasn't been "preempted". It has *voluntarily* gone
into a low-power mode, and how KVM implements this "low-power mode" is
none of the guest's business. This is exactly the same behaviour that
you will have on bare metal. From a Linux guest perspective, the vcpu
is *idle*, not doing anything, and only waiting for an interrupt to
start executing again.
This is a fundamentally different concept from preempting a vcpu
because its time-slice is up. In this second case, you can indeed
mitigate things by exposing steal time and preemption status as you
break the illusion of a machine that is completely controlled by the
guest.
If the "resched on interrupt delivery while blocked on WFI" is too slow
for you, then *that* is the thing that needs addressing. Feeding extra
state to the guest doesn't help.
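
To make the distinction above concrete, here is a minimal sketch of how a
host could tell the two cases apart when a vcpu thread is scheduled out.
It is not KVM code: the toy_vcpu type, its state enum and both helpers are
hypothetical names made up for illustration. The only assumption is that
the host knows whether the vcpu was executing guest code or was blocked on
WFI at the moment it lost the CPU.

```c
#include <stdbool.h>

/* Hypothetical model of what the vcpu was doing when it lost the CPU. */
enum toy_vcpu_state {
	TOY_VCPU_RUNNING,	/* executing guest code */
	TOY_VCPU_BLOCKED_WFI,	/* voluntarily idle, waiting for an interrupt */
};

/* Hypothetical per-vcpu bookkeeping; "preempted" is what the guest sees. */
struct toy_vcpu {
	enum toy_vcpu_state state;
	bool preempted;
};

/*
 * Called when the vcpu thread is scheduled out. Only a vcpu that was
 * still running guest code is reported as preempted; a vcpu blocked on
 * WFI is idle from the guest's point of view and is left alone.
 */
static void toy_vcpu_sched_out(struct toy_vcpu *vcpu)
{
	vcpu->preempted = (vcpu->state == TOY_VCPU_RUNNING);
}

/* Called when the vcpu thread gets the CPU back. */
static void toy_vcpu_sched_in(struct toy_vcpu *vcpu)
{
	vcpu->preempted = false;
}
```

With this split, a vcpu that went idle through WFI is never advertised as
preempted, which matches what the guest would observe on bare metal.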
> Another call path is iret:
>
> <iret>
> __schedule()
> context_switch()
> prepare_task_switch()
> fire_sched_in_preempt_notifiers()
> kvm_sched_out()
> kvm_arch_vcpu_put()
I'm not sure how an x86 concept is relevant here.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.