Message-Id: <A1EB80C0-2D88-4DC0-A898-3BED50A4F5A8@amacapital.net>
Date: Tue, 14 May 2019 14:55:18 -0700
From: Andy Lutomirski <luto@...capital.net>
To: Sean Christopherson <sean.j.christopherson@...el.com>
Cc: Andy Lutomirski <luto@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Alexandre Chartre <alexandre.chartre@...cle.com>,
Paolo Bonzini <pbonzini@...hat.com>,
Radim Krcmar <rkrcmar@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
"H. Peter Anvin" <hpa@...or.com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
kvm list <kvm@...r.kernel.org>, X86 ML <x86@...nel.org>,
Linux-MM <linux-mm@...ck.org>,
LKML <linux-kernel@...r.kernel.org>,
Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
jan.setjeeilers@...cle.com, Liran Alon <liran.alon@...cle.com>,
Jonathan Adams <jwadams@...gle.com>
Subject: Re: [RFC KVM 18/27] kvm/isolation: function to copy page table entries for percpu buffer
> On May 14, 2019, at 2:06 PM, Sean Christopherson <sean.j.christopherson@...el.com> wrote:
>
>> On Tue, May 14, 2019 at 01:33:21PM -0700, Andy Lutomirski wrote:
>> On Tue, May 14, 2019 at 11:09 AM Sean Christopherson
>> <sean.j.christopherson@...el.com> wrote:
>>> For IRQs it's somewhat feasible, but not for NMIs since NMIs are unblocked
>>> on VMX immediately after VM-Exit, i.e. there's no way to prevent an NMI
>>> from occurring while KVM's page tables are loaded.
>>>
>>> Back to Andy's question about enabling IRQs, the answer is "it depends".
>>> Exits due to INTR, NMI and #MC are considered high priority and are
>>> serviced before re-enabling IRQs and preemption[1]. All other exits are
>>> handled after IRQs and preemption are re-enabled.
>>>
>>> A decent number of exit handlers are quite short, e.g. CPUID, most RDMSR
>>> and WRMSR, any event-related exit, etc... But many exit handlers require
>>> significantly longer flows, e.g. EPT violations (page faults) and anything
>>> that requires extensive emulation, e.g. nested VMX. In short, leaving
>>> IRQs disabled across all exits is not practical.
>>>
>>> Before going down the path of figuring out how to handle the corner cases
>>> regarding kvm_mm, I think it makes sense to pinpoint exactly what exits
>>> are a) in the hot path for the use case (configuration) and b) can be
>>> handled fast enough that they can run with IRQs disabled. Generating that
>>> list might allow us to tightly bound the contents of kvm_mm and sidestep
>>> many of the corner cases, i.e. select VM-Exits are handled with IRQs
>>> disabled using KVM's mm, while "slow" VM-Exits go through the full context
>>> switch.
>>
>> I suspect that the context switch is a bit of a red herring. A
>> PCID-don't-flush CR3 write is IIRC under 300 cycles. Sure, it's slow,
>> but it's probably minor compared to the full cost of the vm exit. The
>> pain point is kicking the sibling thread.
>
> Speaking of PCIDs, a separate mm for KVM would mean consuming another
> ASID, which isn't good.
I’m not sure we care. We have many logical address spaces (two per mm plus a few more). We have 4096 PCIDs, but we only use ten or so. And we have some undocumented number of *physical* ASIDs with some undocumented mechanism by which PCID maps to a physical ASID.
I don’t suppose you know how many physical ASIDs we have? And how it interacts with the VPID stuff?
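
For reference, the 4096 figure is just the width of the PCID field: with CR4.PCIDE set, CR3 bits 11:0 carry the PCID, and setting bit 63 of the value written to CR3 asks the CPU not to flush the TLB entries tagged with that PCID. Below is a minimal user-space sketch of how such a value is composed. The names (make_cr3 and the CR3_* macros) are made up for illustration and are not the kernel's helpers; it only builds the value and does not touch CR3.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative names only; these are not the kernel's definitions. */
#define CR3_PCID_BITS  12
#define CR3_PCID_MASK  ((UINT64_C(1) << CR3_PCID_BITS) - 1)  /* 0xfff: 4096 PCIDs */
#define CR3_NOFLUSH    (UINT64_C(1) << 63)                   /* keep TLB entries for this PCID */

/*
 * Compose the value that would be written to CR3, assuming CR4.PCIDE=1.
 * pgd_phys must be page aligned so that the low 12 bits are free for the PCID.
 */
static inline uint64_t make_cr3(uint64_t pgd_phys, uint16_t pcid, int noflush)
{
	uint64_t cr3;

	assert((pgd_phys & CR3_PCID_MASK) == 0);
	assert(pcid <= CR3_PCID_MASK);

	cr3 = pgd_phys | pcid;
	if (noflush)
		cr3 |= CR3_NOFLUSH;
	return cr3;
}
```

This says nothing about how many physical ASIDs sit behind those 4096 architectural values; that part is still the open question.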