lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Thu, 8 Feb 2024 14:18:06 +0100
From: "Wilczynski, Michal" <michal.wilczynski@...el.com>
To: Paolo Bonzini <pbonzini@...hat.com>, Yunhong Jiang
	<yunhong.jiang@...ux.intel.com>
CC: <seanjc@...gle.com>, <mlevitsk@...hat.com>, <tglx@...utronix.de>,
	<mingo@...hat.com>, <bp@...en8.de>, <dave.hansen@...ux.intel.com>,
	<x86@...nel.org>, <hpa@...or.com>, <kvm@...r.kernel.org>,
	<linux-kernel@...r.kernel.org>, <dedekind1@...il.com>, <yuan.yao@...el.com>,
	Zheyu Ma <zheyuma97@...il.com>
Subject: Re: [PATCH v2] KVM: x86: nSVM/nVMX: Fix handling triple fault on RSM
 instruction



On 2/8/2024 11:31 AM, Paolo Bonzini wrote:
> On Thu, Jan 25, 2024 at 1:59 AM Yunhong Jiang
> <yunhong.jiang@...ux.intel.com> wrote:
>> Would it be ok to move the followed emulator_leave_smm() code into
>> vmx_leave_smm, before setting nested_run_pending bit? It avoids changing
>> the generic emulator code.
>>
>> #ifdef CONFIG_X86_64
>>         if (guest_cpuid_has(vcpu, X86_FEATURE_LM))
>>                 return rsm_load_state_64(ctxt, &smram.smram64);
>>         else
>> #endif
>>                 return rsm_load_state_32(ctxt, &smram.smram32);
> 
> I agree with Michal that vendor code should not be in charge of
> calling rsm_load_state_*.
> 
> However, I don't understand the problem or the fix.
> 
> The possible causes are two, and in neither case the fix is to clear
> nested_run_pending:
> 
> 1) if the problem is that setting nested_run_pending was premature,
> the correct fix in my opinion is to split the leave_smm vendor
> callback in "prepare" and "commit" parts; not to clear it later.  See
> the attached, untested, patch.

Hi, I've tested the patch and it seems to work, both on Intel and AMD.
There was a problem with applying this chunk though:

diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index ac8b7614e79d..3d18fa7db353 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -119,7 +119,8 @@ KVM_X86_OP(setup_mce)
 #ifdef CONFIG_KVM_SMM
 KVM_X86_OP(smi_allowed)
 KVM_X86_OP()                 // <- This shouldn't be there I guess ?
-KVM_X86_OP(leave_smm)
+KVM_X86_OP(leave_smm_prepare)
+KVM_X86_OP(leave_smm_commit)
 KVM_X86_OP(enable_smi_window)
 #endif
 KVM_X86_OP_OPTIONAL(dev_get_attr)

Anyway I was a bit averse to this approach as I noticed in the git log
that callbacks like e.g post_leave_smm() used to exist, but they were later
removed, so I though the maintainers don't like introducing extra
callbacks. 

> 
> 2) otherwise, if the problem is that we have not gone through the
> vmenter yet, then KVM needs to do that and _then_ inject the triple
> fault. The fix is to merge the .triple_fault and .check_nested_events
> callbacks, with something like the second attached patch - which
> probably has so many problems that I haven't even tried to compile it.

Well, in this case if we know that RSM will fail it doesn't seem to me
like it make sense to run vmenter just do kill the VM anyway, this would
be more confusing.

I've made the fix this way based on our discussion with Sean in v1, and
tried to mark the RSM instruction with a flag, as a one that needs
actual HW VMenter to complete succesfully, and based on that information
manipulate nested_run_pending.

> 
> The first patch should be equivalent to yours, and I guess it is
> okay-ish with a few more comments that explain what's going on.

Sure, I'm fine with this. Thanks !

> 
> Paolo

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ