[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200227205153.GC17014@linux.intel.com>
Date: Thu, 27 Feb 2020 12:51:53 -0800
From: Sean Christopherson <sean.j.christopherson@...el.com>
To: Krish Sadhukhan <krish.sadhukhan@...cle.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Wanpeng Li <wanpengli@...cent.com>,
Jim Mattson <jmattson@...gle.com>,
Joerg Roedel <joro@...tes.org>, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org, Xiaoyao Li <xiaoyao.li@...el.com>
Subject: Re: [PATCH] KVM: nVMX: Consult only the "basic" exit reason when
routing nested exit
On Thu, Feb 27, 2020 at 12:08:55PM -0800, Krish Sadhukhan wrote:
>
> On 2/27/20 9:44 AM, Sean Christopherson wrote:
> >Consult only the basic exit reason, i.e. bits 15:0 of vmcs.EXIT_REASON,
> >when determining whether a nested VM-Exit should be reflected into L1 or
> >handled by KVM in L0.
> >
> >For better or worse, the switch statement in nested_vmx_exit_reflected()
> >currently defaults to "true", i.e. reflects any nested VM-Exit without
> >dedicated logic. Because the case statements only contain the basic
> >exit reason, any VM-Exit with modifier bits set will be reflected to L1,
> >even if KVM intended to handle it in L0.
> >
> >Practically speaking, this only affects EXIT_REASON_MCE_DURING_VMENTRY,
> >i.e. a #MC that occurs on nested VM-Enter would be incorrectly routed to
> >L1, as "failed VM-Entry" is the only modifier that KVM can currently
> >encounter. The SMM modifiers will never be generated as KVM doesn't
> >support/employ a SMI Transfer Monitor. Ditto for "exit from enclave",
> >as KVM doesn't yet support virtualizing SGX, i.e. it's impossible to
> >enter an enclave in a KVM guest (L1 or L2).
>
>
> It seems nested_vmx_exit_reflected() deals only with the basic exit reason.
> If it doesn't need anything beyond bits 15:0, may be vmx_handle_exit() can
> pass just the base exit reason ?
Argh. I was going to simply respond with "It traces exit_reason via
trace_kvm_nested_vmexit().", but then I looked at the tracing code :-(
The tracepoints that print the names of the VM-Exit are flawed in the sense
that they'll always print the raw value for VM-Exits with modifiers. E.g.
a consistency check VM-Exit on invalid guest state will print 0x80000021
instead of INVALID_STATE.
Stripping bits 31:16 when invoking the tracepoint would fix the immediate
issue, but I'm not sure I like that approach because doing so drops
information that could potentially be quite helpful, e.g. if nested VM-Exit
injection injected EXIT_REASON_MSR_LOAD_FAIL without also setting
VMX_EXIT_REASONS_FAILED_VMENTRY, which could break/confuse the L1 VMM.
I'm also not remotely confident that we won't screw this up again in the
future :-)
So part of me thinks the best way to resolve the printing would be to
modify VMX_EXIT_REASONS to do "| VMX_EXIT_REASONS_FAILED_VMENTRY" where
appropriate, i.e. on INVALID_STATE, MSR_LOAD_FAIL and MCE_DURING_VMENTRY.
The downside of that approach is it breaks again when new modifiers come
along, e.g. SGX's ENCLAVE_EXIT. But again, the modifier is likely useful
information.
I think the most foolproof and informative way to handle this would be to
add a macro and/or helper function, e.g. kvm_print_vmx_exit_reason(), to
wrap __print_symbolic(__entry->exit_code, VMX_EXIT_REASONS) so that it
prints both the name of the basic exit reason as well as the names for
any modifiers.
TL;DR: I still like this patch as is, especially since it'll be easy to
backport. I'll send a separate patch for the tracepoint issue.
Powered by blists - more mailing lists