Message-ID: <c09dd91f-c280-85a6-c2a2-d44a0d378bbc@redhat.com>
Date: Thu, 9 Apr 2020 16:32:37 +0200
From: Paolo Bonzini <pbonzini@...hat.com>
To: Andrew Cooper <andrew.cooper3@...rix.com>,
Andy Lutomirski <luto@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>
Cc: Sean Christopherson <sean.j.christopherson@...el.com>,
Vivek Goyal <vgoyal@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
LKML <linux-kernel@...r.kernel.org>, X86 ML <x86@...nel.org>,
kvm list <kvm@...r.kernel.org>, stable <stable@...r.kernel.org>
Subject: Re: [PATCH v2] x86/kvm: Disable KVM_ASYNC_PF_SEND_ALWAYS

On 09/04/20 16:13, Andrew Cooper wrote:
> On 09/04/2020 13:47, Paolo Bonzini wrote:
>> On 09/04/20 06:50, Andy Lutomirski wrote:
>>> The small
>>> (or maybe small) one is that any fancy protocol where the guest
>>> returns from an exception by doing, logically:
>>>
>>> Hey I'm done; /* MOV somewhere, hypercall, MOV to CR4, whatever */
>>> IRET;
>>>
>>> is fundamentally racy. After we say we're done and before IRET, we
>>> can be recursively reentered. Hi, NMI!
>> That's possible in theory. In practice there would be only two levels
>> of nesting, one for the original page being loaded and one for the tail
>> of the #VE handler. The nested #VE would see IF=0, resolve the EPT
>> violation synchronously and both handlers would finish. For the tail
>> page to be swapped out again, leading to more nesting, the host's LRU
>> must be seriously messed up.
>>
>> With IST it would be much messier, and I haven't quite understood why
>> you believe the #VE handler should have an IST.
>
> Any interrupt/exception which can possibly occur between a SYSCALL and
> re-establishing a kernel stack (several instructions), must be IST to
> avoid taking said exception on a user stack and being a trivial
> privilege escalation.

Doh, of course. I always confuse SYSCALL and SYSENTER.

> Therefore, it doesn't really matter if KVM's paravirt use of #VE does
> respect the interrupt flag. It is not sensible to build a paravirt
> interface using #VE whose safety depends on never turning on
> hardware-induced #VE's.

No, I think we wouldn't use a paravirt #VE at this point, we would use
the real thing if available.

It would still be possible to switch from the IST to the main kernel
stack before writing 0 to the reentrancy word.

Paolo