lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240904210830.GA1229985@thelio-3990X>
Date: Wed, 4 Sep 2024 14:08:30 -0700
From: Nathan Chancellor <nathan@...nel.org>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>, kvm@...r.kernel.org,
	linux-kernel@...r.kernel.org, Chao Gao <chao.gao@...el.com>,
	Zeng Guang <guang.zeng@...el.com>
Subject: Re: [PATCH 1/6] KVM: nVMX: Get to-be-acknowledge IRQ for nested
 VM-Exit at injection site

Hi Sean,

On Fri, Jul 19, 2024 at 05:01:33PM -0700, Sean Christopherson wrote:
> Move the logic to get the to-be-acknowledge IRQ for a nested VM-Exit from
> nested_vmx_vmexit() to vmx_check_nested_events(), which is subtly the one
> and only path where KVM invokes nested_vmx_vmexit() with
> EXIT_REASON_EXTERNAL_INTERRUPT.  A future fix will perform a last-minute
> check on L2's nested posted interrupt notification vector, just before
> injecting a nested VM-Exit.  To handle that scenario correctly, KVM needs
> to get the interrupt _before_ injecting VM-Exit, as simply querying the
> highest priority interrupt, via kvm_cpu_has_interrupt(), would result in
> TOCTOU bug, as a new, higher priority interrupt could arrive between
> kvm_cpu_has_interrupt() and kvm_cpu_get_interrupt().
> 
> Opportunistically convert the WARN_ON() to a WARN_ON_ONCE().  If KVM has
> a bug that results in a false positive from kvm_cpu_has_interrupt(),
> spamming dmesg won't help the situation.
> 
> Note, nested_vmx_reflect_vmexit() can never reflect external interrupts as
> they are always "wanted" by L0.
> 
> Cc: stable@...r.kernel.org
> Signed-off-by: Sean Christopherson <seanjc@...gle.com>
> ---
>  arch/x86/kvm/vmx/nested.c | 25 ++++++++++++++++---------
>  1 file changed, 16 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 2392a7ef254d..b3e17635f7e3 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -4284,11 +4284,26 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu)
>  	}
>  
>  	if (kvm_cpu_has_interrupt(vcpu) && !vmx_interrupt_blocked(vcpu)) {
> +		u32 exit_intr_info;
> +
>  		if (block_nested_events)
>  			return -EBUSY;
>  		if (!nested_exit_on_intr(vcpu))
>  			goto no_vmexit;
> -		nested_vmx_vmexit(vcpu, EXIT_REASON_EXTERNAL_INTERRUPT, 0, 0);
> +
> +		if (nested_exit_intr_ack_set(vcpu)) {
> +			int irq;
> +
> +			irq = kvm_cpu_get_interrupt(vcpu);
> +			WARN_ON_ONCE(irq < 0);
> +
> +			exit_intr_info = INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR | irq;
> +		} else {
> +			exit_intr_info = 0;
> +		}
> +
> +		nested_vmx_vmexit(vcpu, EXIT_REASON_EXTERNAL_INTERRUPT,
> +				  exit_intr_info, 0);
>  		return 0;
>  	}
>  
> @@ -4969,14 +4984,6 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason,
>  	vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
>  
>  	if (likely(!vmx->fail)) {
> -		if ((u16)vm_exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT &&
> -		    nested_exit_intr_ack_set(vcpu)) {
> -			int irq = kvm_cpu_get_interrupt(vcpu);
> -			WARN_ON(irq < 0);
> -			vmcs12->vm_exit_intr_info = irq |
> -				INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR;
> -		}
> -
>  		if (vm_exit_reason != -1)
>  			trace_kvm_nested_vmexit_inject(vmcs12->vm_exit_reason,
>  						       vmcs12->exit_qualification,
> -- 
> 2.45.2.1089.g2a221341d9-goog
> 

I bisected (log below) an issue with starting a nested guest that
appears on two of my newer Intel test machines (but not a somewhat old
laptop) when this change as commit 6f373f4d941b ("KVM: nVMX: Get
to-be-acknowledge IRQ for nested VM-Exit at injection site") in -next is
present in the host kernel.

I start a virtual machine with a full distribution using QEMU then start
a nested virtual machine using QEMU with the same kernel and a much
simpler Buildroot initrd, just to test the ability to run a nested
guest. After this change, starting a nested guest results in no output
from the nested guest and eventually the first guest restarts, sometimes
printing a lockup message that appears to be caused from qemu-system-x86

My command for the first guest on the host (in case it matters):

  $ qemu-system-x86_64 \
      -display none \
      -serial mon:stdio \
      -nic user,model=virtio-net-pci,hostfwd=tcp::8022-:22 \
      -object rng-random,filename=/dev/urandom,id=rng0 \
      -device virtio-rng-pci \
      -chardev socket,id=char0,path=$VM_FOLDER/x86_64/arch/vfsd.sock \
      -device vhost-user-fs-pci,queue-size=1024,chardev=char0,tag=host \
      -object memory-backend-memfd,id=mem,share=on,size=16G \
      -numa node,memdev=mem \
      -m 16G \
      -device virtio-balloon \
      -smp 8 \
      -drive if=pflash,format=raw,file=$VM_FOLDER/x86_64/arch/efi.img,readonly=on \
      -drive if=pflash,format=raw,file=$VM_FOLDER/x86_64/arch/efi_vars.img \
      -cpu host \
      -enable-kvm \
      -M q35 \
      -drive if=virtio,format=qcow2,file=$VM_FOLDER/x86_64/arch/disk.img

My commands in the first guest to spawn the nested guest:

  $ cd $(mktemp -d)

  $ curl -LSs https://archive.archlinux.org/packages/l/linux/linux-6.10.8.arch1-1-x86_64.pkg.tar.zst | tar --zstd -xf -

  $ curl -LSs https://github.com/ClangBuiltLinux/boot-utils/releases/download/20230707-182910/x86_64-rootfs.cpio.zst | zstd -d >rootfs.cpio

  $ qemu-system-x86_64 \
      -display none \
      -nodefaults \
      -M q35 \
      -d unimp,guest_errors \
      -append 'console=ttyS0 earlycon=uart8250,io,0x3f8 loglevel=7' \
      -kernel usr/lib/modules/6.10.8-arch1-1/vmlinuz \
      -initrd rootfs.cpio \
      -cpu host \
      -enable-kvm \
      -m 512m \
      -smp 8 \
      -serial mon:stdio

If there is any additional information I can provide or patches I can
test, I am more than happy to do so.

Cheers,
Nathan

# bad: [6804f0edbe7747774e6ae60f20cec4ee3ad7c187] Add linux-next specific files for 20240903
# good: [67784a74e258a467225f0e68335df77acd67b7ab] Merge tag 'ata-6.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux
git bisect start '6804f0edbe7747774e6ae60f20cec4ee3ad7c187' '67784a74e258a467225f0e68335df77acd67b7ab'
# good: [6b63f16410fa86f6a2e9f52c9cb52ba94c363f3e] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next.git
git bisect good 6b63f16410fa86f6a2e9f52c9cb52ba94c363f3e
# good: [194eaf7dd63eef7cee974daeb4df01a3e6b144fe] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply.git
git bisect good 194eaf7dd63eef7cee974daeb4df01a3e6b144fe
# bad: [a8f65643f59dac67451d09ff298fa7f6e7917794] Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt.git
git bisect bad a8f65643f59dac67451d09ff298fa7f6e7917794
# good: [f80eff5b9f33c4f8d86ba046f3ee54c4f2224454] Merge branch 'timers/drivers/next' of https://git.linaro.org/people/daniel.lezcano/linux.git
git bisect good f80eff5b9f33c4f8d86ba046f3ee54c4f2224454
# bad: [a93e40d038ccd17e6cf691e1b8fec8821998baf2] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu.git
git bisect bad a93e40d038ccd17e6cf691e1b8fec8821998baf2
# good: [500b6c92524183f97e3a3c8e6c56f8af69429ba4] Merge branch 'non-rcu/next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
git bisect good 500b6c92524183f97e3a3c8e6c56f8af69429ba4
# bad: [642613182efa0927c8bd4d4ef2c6b8350554b6ad] Merge branches 'fixes', 'generic', 'misc', 'mmu', 'pat_vmx_msrs', 'selftests', 'svm' and 'vmx'
git bisect bad 642613182efa0927c8bd4d4ef2c6b8350554b6ad
# good: [1876dd69dfe8c29e249070b0e4dc941fc15ac1e4] KVM: x86: Add fastpath handling of HLT VM-Exits
git bisect good 1876dd69dfe8c29e249070b0e4dc941fc15ac1e4
# bad: [44518120c4ca569cfb9c6199e94c312458dc1c07] KVM: nVMX: Detect nested posted interrupt NV at nested VM-Exit injection
git bisect bad 44518120c4ca569cfb9c6199e94c312458dc1c07
# good: [2ab637df5f68d4e0cfa9becd10f43400aee785b3] KVM: VMX: hyper-v: Prevent impossible NULL pointer dereference in evmcs_load()
git bisect good 2ab637df5f68d4e0cfa9becd10f43400aee785b3
# bad: [f729851189d5741e7d1059e250422611028657f8] KVM: x86: Don't move VMX's nested PI notification vector from IRR to ISR
git bisect bad f729851189d5741e7d1059e250422611028657f8
# bad: [cb14e454add0efc9bc461c1ad30d9c950c406fab] KVM: nVMX: Suppress external interrupt VM-Exit injection if there's no IRQ
git bisect bad cb14e454add0efc9bc461c1ad30d9c950c406fab
# bad: [6f373f4d941bf79f06e9abd4a827288ad1de6399] KVM: nVMX: Get to-be-acknowledge IRQ for nested VM-Exit at injection site
git bisect bad 6f373f4d941bf79f06e9abd4a827288ad1de6399
# first bad commit: [6f373f4d941bf79f06e9abd4a827288ad1de6399] KVM: nVMX: Get to-be-acknowledge IRQ for nested VM-Exit at injection site

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ