linux-kernel - Re: [PATCH] KVM: SVM: Fix redundant updates of LBR MSR intercepts

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ei6cdmnvhzyavfobamjkcq2ghdrxcv7ruxhcbzzycqlvaty7zr@5cjkfczxiqom>
Date: Fri, 14 Nov 2025 16:52:10 +0000
From: Yosry Ahmed <yosry.ahmed@...ux.dev>
To: Sean Christopherson <seanjc@...gle.com>, 
	Paolo Bonzini <pbonzini@...hat.com>
Cc: kvm@...r.kernel.org, linux-kernel@...r.kernel.org, 
	stable@...r.kernel.org
Subject: Re: [PATCH] KVM: SVM: Fix redundant updates of LBR MSR intercepts

On Fri, Nov 14, 2025 at 08:34:54AM -0800, Sean Christopherson wrote:
> On Wed, Nov 12, 2025, Yosry Ahmed wrote:
> > svm_update_lbrv() always updates LBR MSRs intercepts, even when they are
> > already set correctly. This results in force_msr_bitmap_recalc always
> > being set to true on every nested transition,
> 
> Nit, it's only on VMRUN, not on every transition (i.e. not on nested #VMEXIT).

How so? svm_update_lbrv() will also be called in nested_svm_vmexit(),
and it will eventually lead to force_msr_bitmap_recalc being set to
true.

I guess what you meant is the "undoing the Hyper-V optimization" part.
That is indeed only affected by the svm_update_lbrv() call in the nested
VMRUN path.

So we do set force_msr_bitmap_recalc on nested #VMEXIT, but it doesn't
really matter because we set it again on nested VMRUN before
nested_svm_merge_msrpm() is called.

> 
> > essentially undoing the hyperv optimization in nested_svm_merge_msrpm().
> 
> When something fixes a KVM test failures (selftests or KUT), please call that
> out in the changelog.  That way when other people encounter the failure, they'll
> get search hits and won't have to waste their time bisecting and/or debugging.
> 
> If you hadn't mentioned off-list that this was detected by hyperv_svm_test, I
> wouldn't have had the first clue as to why that test started failing.  Even with
> the hint, it still took me a few minutes to connect the dots.

Noted, makes sense. I thought the fix and the original patch are in such
quick succession that hopefully no one will run into it.

> 
> In general, be more explicit/detailed, e.g. "undoing the hyperv optimization" is
> unnecessarily vague, as the reader has to go look at the code to understand what
> you're talking about.  My philosophy with changelogs is that they are write-once,
> read-many, and so if you can save any time/effort for readers, it's almost always
> worth the extra time/effort on the "write" side.
> 
> And a nit: my strong preference is to lead with what is being changed, and then
> dive into the details of why, what's breaking, etc.  This is one of the few
> divergences from the tip-tree preferences.  From  Documentation/process/maintainer-kvm-x86.rst:
> 
>   Stating what a patch does before diving into details is preferred by KVM x86
>   for several reasons.  First and foremost, what code is actually being changed
>   is arguably the most important information, and so that info should be easy to
>   find. Changelogs that bury the "what's actually changing" in a one-liner after
>   3+ paragraphs of background make it very hard to find that information.

Noted.

> 
> E.g.
> 
> --
> Don't update the LBR MSR intercept bitmaps if they're already up-to-date,
> as unconditionally updating the intercepts forces KVM to recalculate the
> MSR bitmaps for vmcb02 on every nested VMRUN.  Functionally, the redundant
> updates are benign, but forcing an update neuters the Hyper-V optimization
> that allows KVM to skip refreshing the vmcb12 MSR bitmap if L1 marked the
> "nested enlightenments" as being clean, i.e. if L1 told KVM that no
> changes were made to the MSR bitmap since the last VMRUN.
> 
> Clobbering the Hyper-V optimization manifests as a failure in the
> hyperv_svm_test KVM selftest, which intentionally changes the MSR bitmap
> "without telling KVM about it" to verify that KVM honors the clean hint.
> 
>   ==== Test Assertion Failure ====
>   x86/hyperv_svm_test.c:120: vmcb->control.exit_code == 0x081
>   pid=193558 tid=193558 errno=4 - Interrupted system call
>      1	0x0000000000411361: assert_on_unhandled_exception at processor.c:659
>      2	0x0000000000406186: _vcpu_run at kvm_util.c:1699
>      3	 (inlined by) vcpu_run at kvm_util.c:1710
>      4	0x0000000000401f2a: main at hyperv_svm_test.c:175
>      5	0x000000000041d0d3: __libc_start_call_main at libc-start.o:?
>      6	0x000000000041f27c: __libc_start_main_impl at ??:?
>      7	0x00000000004021a0: _start at ??:?
>   vmcb->control.exit_code == SVM_EXIT_VMMCALL
> 
> Avoid using ....
> --  

Thanks!

> 
> > Fix it by keeping track of whether LBR MSRs are intercepted or not and
> > only doing the update if needed, similar to x2avic_msrs_intercepted.
> > 
> > Avoid using svm_test_msr_bitmap_*() to check the status of the
> > intercepts, as an arbitrary MSR will need to be chosen as a
> > representative of all LBR MSRs, and this could theoretically break if
> > some of the MSRs intercepts are handled differently from the rest.
> 
> For posterity, Yosry originally proposed (off-list) fixing this by having
> svm_set_intercept_for_msr() check for redundant updates, but I voted against
> that because updating MSR interception _should_ be rare (full CPUID updates and
> explicit MSR filter updates), and I don't want to risk hiding a bug/flaw elsewhere.
> I.e. if something is triggering frequent/unexpected MSR bitmap changes, I want
> that to be surfaced, not squashed/handled by the low level helpers.

Hmm, that was on-list though :P

https://lore.kernel.org/kvm/aRO5ItX_--ZDfnfM@google.com/

> 
>  
> > Also, using svm_test_msr_bitmap_*() makes backports difficult as it was
> > only recently introduced with no direct alternatives in older kernels.
> > 
> > Fixes: fbe5e5f030c2 ("KVM: nSVM: Always recalculate LBR MSR intercepts in svm_update_lbrv()")
> > Cc: stable@...r.kernel.org
> > Signed-off-by: Yosry Ahmed <yosry.ahmed@...ux.dev>
> 
> With an updated changelog,
> 
> Reviewed-by: Sean Christopherson <seanjc@...gle.com>
> Tested-by: Sean Christopherson <seanjc@...gle.com>

Thanks!

Paolo, do you prefer a updated patch with the updated changelog, or
fixing it up when you apply it?