linux-kernel - Re: [PATCHv2 4/4] arm64: add host pv-vcpu-state support

Message-ID: <87fsw82frw.wl-maz@kernel.org>
Date:   Wed, 21 Jul 2021 10:10:59 +0100
From:   Marc Zyngier <maz@...nel.org>
To:     Sergey Senozhatsky <senozhatsky@...omium.org>
Cc:     Will Deacon <will@...nel.org>,
        Suleiman Souhlal <suleiman@...gle.com>,
        Joel Fernandes <joelaf@...gle.com>,
        linux-arm-kernel@...ts.infradead.org, kvmarm@...ts.cs.columbia.edu,
        linux-kernel@...r.kernel.org,
        virtualization@...ts.linux-foundation.org
Subject: Re: [PATCHv2 4/4] arm64: add host pv-vcpu-state support

On Wed, 21 Jul 2021 02:15:47 +0100,
Sergey Senozhatsky <senozhatsky@...omium.org> wrote:
> 
> On (21/07/12 17:24), Marc Zyngier wrote:
> > >  
> > >  void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
> > >  {
> > > +	kvm_update_vcpu_preempted(vcpu, true);
> > 
> > This doesn't look right. With this, you are now telling the guest that
> > a vcpu that is blocked on WFI is preempted. This really isn't the
> > case, as it has voluntarily entered a low-power mode while waiting for
> > an interrupt. Indeed, the vcpu isn't running. A physical CPU wouldn't
> > be running either.
> 
> I suppose you are talking about kvm_vcpu_block().

kvm_vcpu_block() is how things are implemented. WFI is the instruction
I'm concerned about.

> Well, it checks kvm_vcpu_check_block() but then it simply schedule()
> out the vcpu process, which does look like "the vcpu is
> preempted". Once we sched_in() that vcpu process again we mark it as
> non-preempted, even though it remains in kvm wfx handler. Why isn't
> it right?

Because the vcpu hasn't been "preempted". It has *voluntarily* gone
into a low-power mode, and how KVM implements this "low-power mode" is
none of the guest's business. This is exactly the same behaviour that
you will have on bare metal. From a Linux guest perspective, the vcpu
is *idle*, not doing anything, and only waiting for an interrupt to
start executing again.

This is a fundamentally different concept from preempting a vcpu
because its time-slice is up. In this second case, you can indeed
mitigate things by exposing steal time and preemption status as you
break the illusion of a machine that is completely controlled by the
guest.

If the "resched on interrupt delivery while blocked on WFI" is too slow
for you, then *that* is the thing that needs addressing. Feeding extra
state to the guest doesn't help.
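
To make the distinction above concrete, here is a minimal user-space
sketch, not kernel code. Every name in it (model_vcpu, model_vcpu_put,
the MODEL_* reasons) is hypothetical and exists only for illustration;
it assumes the host can tell why the vcpu thread is being scheduled
out. The only point it makes is that a "preempted" flag would be
raised when a runnable vcpu loses the CPU, and left clear when the
vcpu blocked voluntarily on WFI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: none of these names are real kernel APIs. */
enum model_sched_out_reason {
	MODEL_BLOCKED_ON_WFI,    /* guest executed WFI, vcpu went idle voluntarily */
	MODEL_TIMESLICE_EXPIRED  /* host took the CPU away from a runnable vcpu */
};

struct model_vcpu {
	bool preempted;  /* what the guest would read from the shared state */
};

/* Called when the vcpu thread is scheduled out on the host. */
static void model_vcpu_put(struct model_vcpu *vcpu,
			   enum model_sched_out_reason reason)
{
	assert(vcpu != NULL);
	/* Only an involuntary switch counts as preemption; WFI is idle. */
	vcpu->preempted = (reason == MODEL_TIMESLICE_EXPIRED);
}

/* Called when the vcpu thread is scheduled back in. */
static void model_vcpu_load(struct model_vcpu *vcpu)
{
	assert(vcpu != NULL);
	vcpu->preempted = false;
}
```

In this model a vcpu that sits in WFI never shows up as preempted, which
matches what the guest would see on bare metal. How the real code would
detect the reason for the switch is deliberately left out of the sketch.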

> Another call path is iret:
> 
> <iret>
> __schedule()
>  context_switch()
>   prepare_task_switch()
>    fire_sched_in_preempt_notifiers()
>     kvm_sched_out()
>      kvm_arch_vcpu_put()

I'm not sure how an x86 concept is relevant here.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.