Message-ID: <msxw2bnkbbhnk6cpzs36vs4gww6r2on25twxpridybcqiyb4b5@5i2mblpty3fa>
Date: Mon, 5 Jan 2026 19:42:24 +0000
From: Yosry Ahmed <yosry.ahmed@...ux.dev>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>, 
	Andrew Jones <andrew.jones@...ux.dev>, kvm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [kvm-unit-tests PATCH] x86: Increase the timeout for
 vmx_pf_{vpid/no_vpid/invvpid}_test

On Mon, Jan 05, 2026 at 11:19:21AM -0800, Sean Christopherson wrote:
> On Mon, Jan 05, 2026, Yosry Ahmed wrote:
> > On Mon, Jan 05, 2026 at 09:54:13AM -0800, Sean Christopherson wrote:
> > > On Fri, Jan 02, 2026, Yosry Ahmed wrote:
> > > > When running the tests on some older CPUs (e.g. Skylake) on a kernel
> > > > with some debug config options enabled (e.g. CONFIG_DEBUG_VM,
> > > > CONFIG_PROVE_LOCKING, ...), the tests time out. In this specific setup,
> > > > the tests take between 4 and 5 minutes, so bump the timeout from 4 to 6
> > > > minutes.
> > > 
> > > Ugh.  Can anyone think of a not-insane way to skip these tests when running in
> > > an environment that is going to be sloooooow?  Because (a) a 6 minute timeout
> > > could very well hide _real_ KVM bugs, e.g. if KVM is being too aggressive
> > > with TLB flushes (speaking from experience), and (b) running a 5+ minute
> > > test is likely a waste of time/resources.
> > 
> > The definition of a slow environment is also very dynamic; I don't think
> > we want to play whack-a-mole with config options or runtime knobs that
> > would make the tests slow.
> > 
> > I don't like just increasing the timeout either, but the tests are slow
> > even without these specific config options. These options only make the tests
> > a little bit slower, just enough to consistently reproduce the timeout.
> 
> Heh, "little bit" is also subjective.  The tests _can_ run in less than 10
> seconds:
> 
> $ time qemu --no-reboot -nodefaults -global kvm-pit.lost_tick_policy=discard
>   -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -display none
>   -serial stdio -device pci-testdev -machine accel=kvm,kernel_irqchip=split
>   -kernel x86/vmx.flat -smp 1 -append vmx_pf_invvpid_test -cpu max,+vmx
> 
> 933897 tests, 0 failures
> PASS: 4-level paging tests
> filter = vmx_pf_invvpid_test, test = vmx_pf_vpid_test
> filter = vmx_pf_invvpid_test, test = vmx_exception_test
> filter = vmx_pf_invvpid_test, test = vmx_canonical_test
> filter = vmx_pf_invvpid_test, test = vmx_cet_test
> SUMMARY: 1867887 tests
> Command exited with non-zero status 1
> 3.69user 3.19system 0:06.90elapsed 99%CPU
> 
> > This is also acknowledged by commit ca785dae0dd3 ("vmx: separate VPID
> > tests"), which introduced the separate targets to increase the timeout.
> > It mentions the 3 tests taking 12m (so roughly 4m each). 
> 
> Because of debug kernels.  With a fully capable host+KVM and non-debug kernel,
> the tests take ~50 seconds each.
> 
> Looking at why the tests can run in ~7 seconds, the key difference is that the
> above run was done with ept=0, which culls the Protection Keys tests (KVM doesn't
> support PKU when using shadow paging because it'd be insane to emulate correctly).
> The PKU testcases increase the total number of testcases by 10x, which leads to
> timeouts with debug kernels.
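
Ah, that explains it. For anyone trying to reproduce the fast run, I'm assuming
that means EPT was disabled on the host via the kvm_intel module parameter,
i.e. roughly:

  # modprobe -r kvm_intel
  # modprobe kvm_intel ept=0

which forces shadow paging and therefore culls the PKU testcases.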
> 
> Rather than run with a rather absurd timeout, what if we disable PKU in the guest
> for the tests?  Running all four tests completes in <20 seconds:

This looks good. On the Icelake machine they took around 1m 24s, and I
suspect they will take a bit longer with all the debug options, so we'll
still need a longer timeout than the default 90s (maybe 120s or 180s).
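
For example, roughly this (just a sketch on top of your patch below, with a
hypothetical bumped timeout value):

[vmx_pf_exception_test_emulated]
file = vmx.flat
test_args = "vmx_pf_exception_forced_emulation_test vmx_pf_vpid_test vmx_pf_invvpid_test vmx_pf_no_vpid_test"
qemu_params = -cpu max,+vmx,-pku
arch = x86_64
groups = vmx nested_exception
timeout = 180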

Alternatively, we can keep the targets separate if we want to keep the
default timeout.

> 
> $ time qemu --no-reboot -nodefaults -global kvm-pit.lost_tick_policy=discard
>   -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -display none
>   -serial stdio -device pci-testdev -machine accel=kvm,kernel_irqchip=split
>   -kernel x86/vmx.flat -smp 1 -append "vmx_pf_exception_forced_emulation_test
>   vmx_pf_vpid_test vmx_pf_invvpid_test vmx_pf_no_vpid_test" -cpu max,+vmx,-pku
> 
> 10.40user 7.28system 0:17.76elapsed 99%CPU (0avgtext+0avgdata 79788maxresident)
> 
> That way we can probably/hopefully bundle the configs together, and enable it by
> default:

If you post the patch below feel free to add:

Tested-by: Yosry Ahmed <yosry.ahmed@...ux.dev>

Small comment below:

> 
> diff --git a/x86/unittests.cfg b/x86/unittests.cfg
> index 522318d3..45f25f51 100644
> --- a/x86/unittests.cfg
> +++ b/x86/unittests.cfg
> @@ -413,37 +413,16 @@ qemu_params = -cpu max,+vmx
>  arch = x86_64
>  groups = vmx nested_exception
>  
> -[vmx_pf_exception_test_fep]
> +[vmx_pf_exception_test_emulated]

The name is a bit confusing because vmx_pf_vpid_test,
vmx_pf_invvpid_test, and vmx_pf_no_vpid_test do not use FEP like
vmx_pf_exception_forced_emulation_test does. We do emulate the TLB
flush, but I guess the word "emulation" means slightly different things
for different targets.
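
Purely as an illustration (no strong preference), a name like
[vmx_pf_exception_and_vpid_tests] might be less ambiguous.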

>  file = vmx.flat
> -test_args = "vmx_pf_exception_forced_emulation_test"
> -qemu_params = -cpu max,+vmx
> +test_args = "vmx_pf_exception_forced_emulation_test vmx_pf_vpid_test vmx_pf_invvpid_test vmx_pf_no_vpid_test"
> +# Disable Protection Keys for the VMX #PF tests that require KVM to emulate one
> +# or more instructions per testcase, as PKU increases the number of testcases
> +# by an order of magnitude, and testing PKU for these specific tests isn't all
> +# that interesting.
> +qemu_params = -cpu max,+vmx,-pku
>  arch = x86_64
> -groups = vmx nested_exception nodefault
> -timeout = 240
> -
> -[vmx_pf_vpid_test]
> -file = vmx.flat
> -test_args = "vmx_pf_vpid_test"
> -qemu_params = -cpu max,+vmx
> -arch = x86_64
> -groups = vmx nested_exception nodefault
> -timeout = 240
> -
> -[vmx_pf_invvpid_test]
> -file = vmx.flat
> -test_args = "vmx_pf_invvpid_test"
> -qemu_params = -cpu max,+vmx
> -arch = x86_64
> -groups = vmx nested_exception nodefault
> -timeout = 240
> -
> -[vmx_pf_no_vpid_test]
> -file = vmx.flat
> -test_args = "vmx_pf_no_vpid_test"
> -qemu_params = -cpu max,+vmx
> -arch = x86_64
> -groups = vmx nested_exception nodefault
> -timeout = 240
> +groups = vmx nested_exception
>  
>  [vmx_pf_exception_test_reduced_maxphyaddr]
>  file = vmx.flat
> 
