Message-ID: <20251204014026.v5huyriswsqu3jat@desk>
Date: Wed, 3 Dec 2025 17:40:26 -0800
From: Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>
To: david laight <david.laight@...box.com>
Cc: Dave Hansen <dave.hansen@...el.com>,
	Nikolay Borisov <nik.borisov@...e.com>, x86@...nel.org,
	David Kaplan <david.kaplan@....com>,
	"H. Peter Anvin" <hpa@...or.com>,
	Josh Poimboeuf <jpoimboe@...nel.org>,
	Sean Christopherson <seanjc@...gle.com>,
	Paolo Bonzini <pbonzini@...hat.com>, Borislav Petkov <bp@...en8.de>,
	Dave Hansen <dave.hansen@...ux.intel.com>,
	linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
	Asit Mallick <asit.k.mallick@...el.com>,
	Tao Zhang <tao1.zhang@...el.com>,
	Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH v4 04/11] x86/bhi: Make clear_bhb_loop() effective on
 newer CPUs

On Tue, Nov 25, 2025 at 11:34:07AM +0000, david laight wrote:
> On Mon, 24 Nov 2025 11:31:26 -0800
> Pawan Gupta <pawan.kumar.gupta@...ux.intel.com> wrote:
> 
> > On Sat, Nov 22, 2025 at 11:05:58AM +0000, david laight wrote:
> ...
> > > For subtle reasons one of the mitigations that slows kernel entry caused
> > > a doubling of the execution time of a largely single-threaded task that
> > > spends almost all its time in userspace!
> > > (I thought I'd disabled it at compile time - but the config option
> > > changed underneath me...)  
> > 
> > That is surprising. If it's okay, could you please share more details about
> > this application? Or any other way I can reproduce this?
> 
> The 'trigger' program is a multi-threaded program that wakes up every 10ms
> to process RTP and TDM audio data.
> So we have a low RT priority process with one thread per cpu.
> Since they are RT, they usually get scheduled on the same cpu as last time.
> I think this simple program will have the desired effect:
> A main process that does:
> 	syscall(SYS_clock_gettime, CLOCK_MONOTONIC, &start_time);
> 	start_time += 1sec;
> 	for (n = 1; n < num_cpu; n++)
> 		pthread_create(thread_code, start_time);
> 	thread_code(start_time);
> with:
> thread_code(ts)
> {
> 	for (;;) {
> 		ts += 10ms;
> 		syscall(SYS_clock_nanosleep, CLOCK_MONOTONIC, TIMER_ABSTIME, &ts, NULL);
> 		do_work();
> 	}
> }
> 
> So all the threads wake up at exactly the same time every 10ms.
> (You need to use syscall(), don't look at what glibc does.)
> 
> On my system the program wasn't doing anything, so do_work() was empty.
> What matters is whether all the threads end up running at the same time.
> I managed that using pthread_cond_broadcast(), but the clock code above
> ought to be worse (and I've since changed the daemon to work that way
> to avoid all these issues with pthread_cond_broadcast() being sequential
> and threads not running because the target cpu is running an ISR or
> just looping in the kernel).
> 
> The process that gets 'hit' is anything cpu bound.
> Even a shell loop (e.g. while :; do :; done), but with a counter, will do.
> 
> Without the 'trigger' program, it will (mostly) sit on one cpu and the
> clock frequency of that cpu will increase to (say) 3GHz while the others
> all run at 800MHz.
> But the 'trigger' program runs threads on all the cpus at the same time.
> So the 'hit' program is pre-empted and is later rescheduled on a
> different cpu - running at 800MHz.
> The cpu speed increases, but 10ms later it gets bounced again.

Sorry I haven't tried creating this test yet.
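
For reference, below is roughly what I plan to try, pieced together from
your pseudocode above (the 10ms period, one thread per online CPU and the
empty do_work() are just my reading of it, and I've left out the RT
priority for now):

/* build: gcc -O2 -pthread trigger.c -o trigger */
#define _GNU_SOURCE
#include <pthread.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

#define PERIOD_NS	(10 * 1000 * 1000)	/* 10ms */

static void do_work(void)
{
	/* intentionally empty, as in your test */
}

/* advance an absolute timespec by ns nanoseconds */
static void ts_add_ns(struct timespec *ts, long ns)
{
	ts->tv_nsec += ns;
	while (ts->tv_nsec >= 1000000000L) {
		ts->tv_nsec -= 1000000000L;
		ts->tv_sec++;
	}
}

static void *thread_code(void *arg)
{
	struct timespec ts = *(struct timespec *)arg;

	for (;;) {
		ts_add_ns(&ts, PERIOD_NS);
		/* raw syscall so glibc can't substitute anything */
		syscall(SYS_clock_nanosleep, CLOCK_MONOTONIC, TIMER_ABSTIME,
			&ts, NULL);
		do_work();
	}
	return NULL;
}

int main(void)
{
	long n, num_cpu = sysconf(_SC_NPROCESSORS_ONLN);
	struct timespec start;
	pthread_t tid;

	syscall(SYS_clock_gettime, CLOCK_MONOTONIC, &start);
	start.tv_sec += 1;	/* give all the threads time to start */

	for (n = 1; n < num_cpu; n++)
		pthread_create(&tid, NULL, thread_code, &start);
	thread_code(&start);
	return 0;
}

All threads then block in clock_nanosleep() on the same absolute expiry,
so they should wake simultaneously every 10ms.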
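
For the CPU-bound 'hit' side I was thinking of a trivial counter loop
that reports loops per second, so the slowdown shows up directly in the
numbers (the once-a-second reporting is my addition; as you say, any busy
loop with a counter should do):

/* build: gcc -O2 hit.c -o hit */
#include <stdio.h>
#include <time.h>

int main(void)
{
	for (;;) {
		time_t start = time(NULL);
		unsigned long count = 0;

		/* spin for roughly one wall-clock second */
		while (time(NULL) == start)
			count++;
		printf("%lu loops/sec\n", count);
	}
}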

> The real issue is that the cpu speed is associated with the cpu, not
> the process running on it.

So if the 'hit' program gets scheduled on a CPU that is already running at
3GHz, then we don't expect a dramatic performance drop? Setting scaling_governor
to "performance" would be an interesting test.
