Message-ID: <aJpgZeC8SEHfQ0EY@yury>
Date: Mon, 11 Aug 2025 17:28:05 -0400
From: Yury Norov <yury.norov@...il.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
	Dave Hansen <dave.hansen@...ux.intel.com>,
	"H. Peter Anvin" <hpa@...or.com>, x86@...nel.org,
	kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
	Zheyun Shen <szy0127@...u.edu.cn>
Subject: Re: [PATCH 2/2] KVM: SVM: drop useless cpumask_test_cpu() in
 pre_sev_run()

On Mon, Aug 11, 2025 at 01:45:46PM -0700, Sean Christopherson wrote:
> On Mon, Aug 11, 2025, Yury Norov wrote:
> > Testing cpumask for a CPU to be cleared just before setting the exact
> > same CPU is useless because the end result is always the same: CPU is
> > set.
> 
> No, it is not useless.  Blindly writing to the variable will unnecessarily bounce
> the cacheline, and this is a hot path.

How hot is that path? How bad is the cache contention? Is there any evidence
that the conditional cpumask_set_cpu() is worth the effort? The original patch
doesn't discuss any of that, and without a comment the code just looks
buggy.
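
For the record, here is what I understand the comparison to be. This is
just a sketch with placeholder names ('cpu', 'have_run_cpus'), not the
actual pre_sev_run() code:

#include <linux/cpumask.h>

/* Illustration only: placeholder names, not the real KVM fields. */
static void mark_cpu_unconditional(unsigned int cpu, struct cpumask *have_run_cpus)
{
	/*
	 * Blind write: every call performs an atomic RMW on the shared
	 * mask, dirtying its cacheline even when the bit is already set.
	 */
	cpumask_set_cpu(cpu, have_run_cpus);
}

static void mark_cpu_conditional(unsigned int cpu, struct cpumask *have_run_cpus)
{
	/*
	 * Test first: once the bit is set, later calls only read a shared
	 * cacheline and skip the write, so repeated VMRUNs on the same CPU
	 * don't bounce the line between CPUs.
	 */
	if (!cpumask_test_cpu(cpu, have_run_cpus))
		cpumask_set_cpu(cpu, have_run_cpus);
}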

> > While there, switch CPU setter to a non-atomic version. Atomicity is
> > useless here 
> 
> No, atomicity isn't useless here either.  Dropping atomicity could result in
> CPU's bit being lost.  I.e. the atomic accesses aren't for the benefit of
> smp_call_function_many_cond(), the writes are atomic so that multiple vCPUs can
> concurrently update the mask without needing additional protection.

OK, I see. Something heavy hit my head before I decided to drop
atomicity there.
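
To spell out what I was missing (again just a sketch with placeholder
names, not the actual code):

#include <linux/cpumask.h>

/*
 * Illustration only: two vCPUs on different CPUs may call these
 * concurrently against the same shared mask.
 */
static void record_cpu_atomic(unsigned int cpu, struct cpumask *mask)
{
	/* set_bit() under the hood: concurrent callers can't lose updates. */
	cpumask_set_cpu(cpu, mask);
}

static void record_cpu_nonatomic(unsigned int cpu, struct cpumask *mask)
{
	/*
	 * __set_bit() under the hood: a plain read-modify-write of the
	 * word, so two vCPUs racing on different bits of the same word
	 * could silently drop one of the updates.  Only safe when callers
	 * are otherwise serialized.
	 */
	__cpumask_set_cpu(cpu, mask);
}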

> > because sev_writeback_caches() ends up with a plain
> > for_each_cpu() loop in smp_call_function_many_cond(), which is not
> > atomic by nature.
> 
> That's fine.  As noted in sev_writeback_caches(), if vCPU could be running, then
> the caller is responsible for ensuring that all vCPUs flush caches before the
> memory being reclaimed is fully freed.  Those guarantees are provided by KVM's
> MMU.
> 
> sev_writeback_caches() => smp_call_function_many_cond() could hit false positives,
> i.e. trigger WBINVD on CPUs that couldn't possibly have accessed the memory being
> reclaimed, but such false positives are functionally benign, and are "intended"
> in the sense that we chose to prioritize simplicity over precision.

So, I don't object to dropping the patch, but it would be really nice
to have this
                        if (!cpumask_test_cpu())
                                cpumask_set_cpu()

pattern explained and, even better, supported with performance numbers.
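
Even something like this at the call site would already help (a sketch
only; 'cpu' and 'mask' stand in for the real arguments):

	/*
	 * Don't write unconditionally: this is a hot path and the mask is
	 * shared by all vCPUs, so an unconditional atomic RMW would bounce
	 * the cacheline on every VMRUN.  Testing first keeps the common
	 * case (bit already set) read-only.
	 */
	if (!cpumask_test_cpu(cpu, mask))
		cpumask_set_cpu(cpu, mask);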


Thanks,
Yury
