linux-kernel - Re: [RFC KVM 18/27] kvm/isolation: function to copy page table entries for percpu buffer


Message-Id: <A1EB80C0-2D88-4DC0-A898-3BED50A4F5A8@amacapital.net>
Date:   Tue, 14 May 2019 14:55:18 -0700
From:   Andy Lutomirski <luto@...capital.net>
To:     Sean Christopherson <sean.j.christopherson@...el.com>
Cc:     Andy Lutomirski <luto@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Alexandre Chartre <alexandre.chartre@...cle.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Radim Krcmar <rkrcmar@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        "H. Peter Anvin" <hpa@...or.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        kvm list <kvm@...r.kernel.org>, X86 ML <x86@...nel.org>,
        Linux-MM <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
        jan.setjeeilers@...cle.com, Liran Alon <liran.alon@...cle.com>,
        Jonathan Adams <jwadams@...gle.com>
Subject: Re: [RFC KVM 18/27] kvm/isolation: function to copy page table entries for percpu buffer



> On May 14, 2019, at 2:06 PM, Sean Christopherson <sean.j.christopherson@...el.com> wrote:
> 
>> On Tue, May 14, 2019 at 01:33:21PM -0700, Andy Lutomirski wrote:
>> On Tue, May 14, 2019 at 11:09 AM Sean Christopherson
>> <sean.j.christopherson@...el.com> wrote:
>>> For IRQs it's somewhat feasible, but not for NMIs since NMIs are unblocked
>>> on VMX immediately after VM-Exit, i.e. there's no way to prevent an NMI
>>> from occurring while KVM's page tables are loaded.
>>> 
>>> Back to Andy's question about enabling IRQs, the answer is "it depends".
>>> Exits due to INTR, NMI and #MC are considered high priority and are
>>> serviced before re-enabling IRQs and preemption[1].  All other exits are
>>> handled after IRQs and preemption are re-enabled.
>>> 
>>> A decent number of exit handlers are quite short, e.g. CPUID, most RDMSR
>>> and WRMSR, any event-related exit, etc...  But many exit handlers require
>>> significantly longer flows, e.g. EPT violations (page faults) and anything
>>> that requires extensive emulation, e.g. nested VMX.  In short, leaving
>>> IRQs disabled across all exits is not practical.
>>> 
>>> Before going down the path of figuring out how to handle the corner cases
>>> regarding kvm_mm, I think it makes sense to pinpoint exactly what exits
>>> are a) in the hot path for the use case (configuration) and b) can be
>>> handled fast enough that they can run with IRQs disabled.  Generating that
>>> list might allow us to tightly bound the contents of kvm_mm and sidestep
>>> many of the corner cases, i.e. select VM-Exits are handled with IRQs
>>> disabled using KVM's mm, while "slow" VM-Exits go through the full context
>>> switch.
>> 
>> I suspect that the context switch is a bit of a red herring.  A
>> PCID-don't-flush CR3 write is IIRC under 300 cycles.  Sure, it's slow,
>> but it's probably minor compared to the full cost of the vm exit.  The
>> pain point is kicking the sibling thread.
> 
> Speaking of PCIDs, a separate mm for KVM would mean consuming another
> ASID, which isn't good.

I’m not sure we care. We have many logical address spaces (two per mm plus a few more).  We have 4096 PCIDs, but we only use ten or so.  And we have some undocumented number of *physical* ASIDs with some undocumented mechanism by which PCID maps to a physical ASID.

I don’t suppose you know how many physical ASIDs we have?  And how it interacts with the VPID stuff?
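
For reference, the 4096 figure and the "don't-flush" CR3 write mentioned above both come from the architectural CR3 layout when CR4.PCIDE=1: bits 11:0 carry the PCID, and bit 63 of the value written asks the CPU not to flush that PCID's TLB entries. A minimal sketch in plain userspace C follows. The helper names (make_cr3, cr3_pcid) and macro names are made up for illustration and are not the kernel's own; the sketch only does the bit arithmetic and says nothing about how PCIDs map to physical ASIDs or interact with VPID.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural layout with CR4.PCIDE=1: CR3 bits 11:0 hold the PCID. */
#define CR3_PCID_BITS 12
#define CR3_PCID_MASK ((UINT64_C(1) << CR3_PCID_BITS) - 1)
/* Bit 63 of the value written by MOV to CR3 requests "no flush". */
#define CR3_NOFLUSH   (UINT64_C(1) << 63)
/* 2^12 = 4096 distinct PCIDs are encodable. */
#define NUM_PCIDS     (1u << CR3_PCID_BITS)

/*
 * Hypothetical helper: compose the value that would be written to CR3
 * from a page-aligned page-table root, a PCID, and a no-flush request.
 */
static uint64_t make_cr3(uint64_t pgd_phys, uint16_t pcid, int noflush)
{
    assert((pgd_phys & CR3_PCID_MASK) == 0); /* root must be 4 KiB aligned */
    assert(pcid < NUM_PCIDS);                /* PCID must fit in 12 bits */

    uint64_t cr3 = pgd_phys | pcid;
    if (noflush)
        cr3 |= CR3_NOFLUSH;
    return cr3;
}

/* Hypothetical helper: extract the PCID field from a CR3 value. */
static uint16_t cr3_pcid(uint64_t cr3)
{
    return (uint16_t)(cr3 & CR3_PCID_MASK);
}
```

With this layout, a separate KVM address space would take one more of the 4096 encodable PCID values; what that costs in hardware depends on the undocumented physical-ASID mapping asked about above.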