linux-kernel - Re: [kvm-unit-tests patch 1/5] x86/pmu: Add helper to detect Intel overcount issues

Message-ID: <e635c41e-55be-408d-ab43-7875021a9ecc@intel.com>
Date: Tue, 15 Jul 2025 21:27:40 +0800
From: Xiaoyao Li <xiaoyao.li@...el.com>
To: Dapeng Mi <dapeng1.mi@...ux.intel.com>,
 Sean Christopherson <seanjc@...gle.com>, Paolo Bonzini <pbonzini@...hat.com>
Cc: kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
 Jim Mattson <jmattson@...gle.com>, Mingwei Zhang <mizhang@...gle.com>,
 Zide Chen <zide.chen@...el.com>, Das Sandipan <Sandipan.Das@....com>,
 Shukla Manali <Manali.Shukla@....com>, Yi Lai <yi1.lai@...el.com>,
 Dapeng Mi <dapeng1.mi@...el.com>, dongsheng <dongsheng.x.zhang@...el.com>
Subject: Re: [kvm-unit-tests patch 1/5] x86/pmu: Add helper to detect Intel
 overcount issues

On 7/13/2025 1:49 AM, Dapeng Mi wrote:
> From: dongsheng <dongsheng.x.zhang@...el.com>
> 
> For Intel Atom CPUs, the PMU events "Instruction Retired" or
> "Branch Instruction Retired" may be overcounted for certain
> instructions, such as FAR CALL/JMP, RETF, IRET, VMENTRY/VMEXIT/VMPTRLD
> and complex SGX/SMX/CSTATE instructions/flows.
> 
> The detailed information can be found in the errata (section SRF7):
> https://edc.intel.com/content/www/us/en/design/products-and-solutions/processors-and-chipsets/sierra-forest/xeon-6700-series-processor-with-e-cores-specification-update/errata-details/
> 
> For Atom platforms up to and including Sierra Forest, both events
> ("Instruction Retired" and "Branch Instruction Retired") are
> overcounted on these instructions, but on Clearwater Forest only the
> "Instruction Retired" event is overcounted.
> 
> So add a helper detect_inst_overcount_flags() to detect whether the
> platform has the overcount issue. Later patches relax the precise
> count checks using the overcount flags returned by this helper.
> 
> Signed-off-by: dongsheng <dongsheng.x.zhang@...el.com>
> [Rewrite comments and commit message - Dapeng]
> Signed-off-by: Dapeng Mi <dapeng1.mi@...ux.intel.com>
> Tested-by: Yi Lai <yi1.lai@...el.com>
> ---
>   lib/x86/processor.h | 17 ++++++++++++++++
>   x86/pmu.c           | 47 +++++++++++++++++++++++++++++++++++++++++++++
>   2 files changed, 64 insertions(+)
> 
> diff --git a/lib/x86/processor.h b/lib/x86/processor.h
> index 62f3d578..3f475c21 100644
> --- a/lib/x86/processor.h
> +++ b/lib/x86/processor.h
> @@ -1188,4 +1188,21 @@ static inline bool is_lam_u57_enabled(void)
>   	return !!(read_cr3() & X86_CR3_LAM_U57);
>   }
>   
> +static inline u32 x86_family(u32 eax)
> +{
> +	u32 x86;
> +
> +	x86 = (eax >> 8) & 0xf;
> +
> +	if (x86 == 0xf)
> +		x86 += (eax >> 20) & 0xff;
> +
> +	return x86;
> +}
> +
> +static inline u32 x86_model(u32 eax)
> +{
> +	return ((eax >> 12) & 0xf0) | ((eax >> 4) & 0x0f);
> +}

This appears to be copied from the KVM selftests implementation.

I need to point out that it is not correct (I recently fixed a similar
issue in QEMU).

We cannot fold in the Extended Model ID unconditionally. Intel uses the
Extended Model ID when the (base) Family is 0x6 or 0xF, while AMD uses
it only when the (base) Family is 0xF.

You can refer to the kernel's x86_model() in arch/x86/lib/cpu.c. It
simplifies the condition to "family >= 0x6", which seems to assume that
Intel has no processors with a family ID from 0x7 to 0xE and AMD has
none with a family ID from 0x6 to 0xE.
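
For illustration, a minimal sketch of the conditional variant described above, modeled on the kernel's "family >= 0x6" simplification rather than on any posted revision of this patch. The function names mirror the helpers in the quoted diff, uint32_t stands in for the tree's u32 so the snippet compiles on its own, and the signature values in the note below are made-up examples.

```c
#include <stdint.h>

/*
 * Display family: base family from bits 11:8, with the extended family
 * (bits 27:20) added only when the base family is 0xF.
 */
static inline uint32_t x86_family(uint32_t eax)
{
	uint32_t family = (eax >> 8) & 0xf;

	if (family == 0xf)
		family += (eax >> 20) & 0xff;

	return family;
}

/*
 * Display model: base model from bits 7:4. The extended model
 * (bits 19:16) is folded in only when the family is >= 0x6, which is
 * the simplified condition (an assumption that covers Intel family
 * 0x6/0xF and AMD family 0xF, given no parts exist in the gaps).
 */
static inline uint32_t x86_model(uint32_t eax)
{
	uint32_t model = (eax >> 4) & 0xf;

	if (x86_family(eax) >= 0x6)
		model |= ((eax >> 16) & 0xf) << 4;

	return model;
}
```

With this sketch, a signature such as 0x000A06F0 (base family 0x6, extended model 0xA, base model 0xF) yields model 0xAF, while a family-5 signature with nonzero extended-model bits, e.g. 0x000A0540, keeps only the base model 0x4. The unconditional version in the quoted diff would report 0xA4 for the latter.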