Message-ID: <aJppAp5tK7kPv8uj@google.com>
Date: Mon, 11 Aug 2025 15:04:50 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Yury Norov <yury.norov@...il.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>, Thomas Gleixner <tglx@...utronix.de>, 
	Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>, 
	Dave Hansen <dave.hansen@...ux.intel.com>, "H. Peter Anvin" <hpa@...or.com>, x86@...nel.org, 
	kvm@...r.kernel.org, linux-kernel@...r.kernel.org, 
	Zheyun Shen <szy0127@...u.edu.cn>
Subject: Re: [PATCH 2/2] KVM: SVM: drop useless cpumask_test_cpu() in pre_sev_run()

On Mon, Aug 11, 2025, Yury Norov wrote:
> On Mon, Aug 11, 2025 at 01:45:46PM -0700, Sean Christopherson wrote:
> > On Mon, Aug 11, 2025, Yury Norov wrote:
> > > Testing whether a CPU's bit in the cpumask is clear just before
> > > setting that exact same bit is useless because the end result is
> > > always the same: the bit is set.
> > 
> > No, it is not useless.  Blindly writing to the variable will unnecessarily bounce
> > the cacheline, and this is a hot path.
> 
> How hot is that path?

Very, it gets hit on every VM-Exit => VM-Entry.  For context, putting a single
printk anywhere in KVM's exit=>entry path can completely prevent forward progress
in the guest (for some workloads/patterns).

> How bad is the cache contention?

I would expect it to be "fatally" bad for some workloads and setups.  Not literally
fatal, but bad enough that it would require an urgent fix.

> Is there any evidence that the conditional cpumask_set_cpu() is worth the effort?

I don't have evidence for this specific code flow, but there is plenty of evidence
that shows that generating atomic accesses, especially across sockets, can have a
significant negative impact on performance.

I didn't ask for performance numbers for optimizing setting the mask because (a)
I know the VM-Entry path can be extremely hot, (b) I know that dueling atomics
can be hugely problematic, and (c) I don't see the separate test + set logic as
being at all notable in terms of effort.

> The original patch doesn't discuss that at all, and without any comment the
> code looks just buggy.

FWIW, there was discussion in a previous version of the series, but no hard
numbers on the perf impact.

https://lore.kernel.org/all/Z75se_OZQvaeQE-4@google.com

> 
> > > While there, switch CPU setter to a non-atomic version. Atomicity is
> > > useless here 
> > 
> > No, atomicity isn't useless here either.  Dropping atomicity could result in
> > CPU's bit being lost.  I.e. the atomic accesses aren't for the benefit of
> > smp_call_function_many_cond(), the writes are atomic so that multiple vCPUs can
> > concurrently update the mask without needing additional protection.
> 
> OK, I see. Something heavy hit my head before I decided to drop
> atomicity there.
> 
> > > because sev_writeback_caches() ends up with a plain
> > > for_each_cpu() loop in smp_call_function_many_cond(), which is not
> > > atomic by nature.
> > 
> > That's fine.  As noted in sev_writeback_caches(), if vCPU could be running, then
> > the caller is responsible for ensuring that all vCPUs flush caches before the
> > memory being reclaimed is fully freed.  Those guarantees are provided by KVM's
> > MMU.
> > 
> > sev_writeback_caches() => smp_call_function_many_cond() could hit false positives,
> > i.e. trigger WBINVD on CPUs that couldn't possibly have accessed the memory being
> > reclaimed, but such false positives are functionally benign, and are "intended"
> > in the sense that we chose to prioritize simplicity over precision.
> 
> So, I don't object to dropping the patch, but it would be really nice to
> have this
>                         if (!cpumask_test_cpu())
>                                 cpumask_set_cpu()
> 
> pattern explained, and even better supported with performance numbers.

I can definitely add a comment, and I might try to gather numbers out of curiosity,
but as above, I just don't see this as something that needs to be investigated with
any urgency.
