linux-kernel - Re: [PATCH v2] x86/kvm: Disable KVM_ASYNC_PF_SEND_ALWAYS

Message-ID: <0b632fb1-b662-89bf-2b95-6888bd64b3a9@citrix.com>
Date:   Thu, 9 Apr 2020 15:13:41 +0100
From:   Andrew Cooper <andrew.cooper3@...rix.com>
To:     Paolo Bonzini <pbonzini@...hat.com>,
        Andy Lutomirski <luto@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>
CC:     Sean Christopherson <sean.j.christopherson@...el.com>,
        Vivek Goyal <vgoyal@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        LKML <linux-kernel@...r.kernel.org>, X86 ML <x86@...nel.org>,
        kvm list <kvm@...r.kernel.org>, stable <stable@...r.kernel.org>
Subject: Re: [PATCH v2] x86/kvm: Disable KVM_ASYNC_PF_SEND_ALWAYS

On 09/04/2020 13:47, Paolo Bonzini wrote:
> On 09/04/20 06:50, Andy Lutomirski wrote:
>> The small
>> (or maybe small) one is that any fancy protocol where the guest
>> returns from an exception by doing, logically:
>>
>> Hey I'm done;  /* MOV somewhere, hypercall, MOV to CR4, whatever */
>> IRET;
>>
>> is fundamentally racy.  After we say we're done and before IRET, we
>> can be recursively reentered.  Hi, NMI!
> That's possible in theory.  In practice there would be only two levels
> of nesting, one for the original page being loaded and one for the tail
> of the #VE handler.  The nested #VE would see IF=0, resolve the EPT
> violation synchronously and both handlers would finish.  For the tail
> page to be swapped out again, leading to more nesting, the host's LRU
> must be seriously messed up.
>
> With IST it would be much messier, and I haven't quite understood why
> you believe the #VE handler should have an IST.

Any interrupt/exception which can possibly occur between a SYSCALL and
re-establishing a kernel stack (several instructions) must be IST, to
avoid taking said exception on a user stack, which would be a trivial
privilege escalation.
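
As a rough illustration of the window described above, here is a toy
model of which stack an exception is delivered on. It is a sketch, not
kernel code: the names (struct cpu, delivery_stack, the enum values) are
hypothetical, and it only encodes the x86-64 rule that an exception uses
its IST stack if one is configured, switches to the kernel stack on a
CPL3 -> CPL0 transition, and otherwise stays on whatever stack RSP
currently points at.

```c
#include <stdbool.h>

/* Hypothetical labels for where RSP can point in this toy model. */
enum stack { USER_STACK, KERNEL_STACK, IST_STACK };

/* Hypothetical snapshot of the CPU state when the exception hits. */
struct cpu {
    enum stack current;  /* stack RSP currently points into */
    bool in_kernel;      /* true when running at CPL 0 */
};

/*
 * Return the stack on which the exception frame gets pushed.
 *  - An exception with an IST entry always switches to its IST stack.
 *  - Otherwise, a privilege change (CPL3 -> CPL0) loads the kernel stack.
 *  - Otherwise (already CPL 0, no IST) the current RSP is reused.
 */
enum stack delivery_stack(const struct cpu *c, bool uses_ist)
{
    if (uses_ist)
        return IST_STACK;
    if (!c->in_kernel)
        return KERNEL_STACK;
    return c->current;
}
```

Immediately after SYSCALL the CPU is at CPL 0 but RSP still holds the
user value, i.e. { USER_STACK, true } in this model. There,
delivery_stack() returns USER_STACK for a non-IST exception and
IST_STACK for an IST one, which is the distinction being made here.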

When #VE is used in its architecturally-expected way, it can in general
occur before the kernel stack is established, so it must be IST for
safety.

Therefore, it doesn't really matter whether KVM's paravirt use of #VE
respects the interrupt flag.  It is not sensible to build a paravirt
interface using #VE whose safety depends on never turning on
hardware-induced #VEs.

~Andrew