linux-kernel - Re: [PATCH 1/2] Enumerate AVX512 FP16 CPUID feature flag


Message-ID: <20201208092828.GA27920@zn.tnic>
Date:   Tue, 8 Dec 2020 10:28:28 +0100
From:   Borislav Petkov <bp@...en8.de>
To:     Kyung Min Park <kyung.min.park@...el.com>
Cc:     x86@...nel.org, linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
        tglx@...utronix.de, mingo@...hat.com, hpa@...or.com,
        pbonzini@...hat.com, sean.j.christopherson@...el.com,
        jmattson@...gle.com, joro@...tes.org, vkuznets@...hat.com,
        wanpengli@...cent.com, cathy.zhang@...el.com
Subject: Re: [PATCH 1/2] Enumerate AVX512 FP16 CPUID feature flag

On Mon, Dec 07, 2020 at 07:34:40PM -0800, Kyung Min Park wrote:
> Enumerate the AVX512 half-precision floating point (FP16) CPUID feature
> flag. Compared with FP32, FP16 halves the number of bits required for
> storage, reducing the exponent from 8 bits to 5 and the mantissa from
> 23 bits to 10. FP16 also lets developers train and run inference on
> deep learning models faster when the full precision or range of FP32
> is not needed.
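
For reference, the bit layout described above (1 sign bit, 5 exponent bits, 10 mantissa bits) can be illustrated with a small user-space decoder. This is only a sketch, assuming the IEEE 754 binary16 encoding with an exponent bias of 15; the function names are made up for illustration and nothing here is part of the patch.

```c
#include <math.h>    /* only for the NAN and INFINITY macros */
#include <stdint.h>

/* Multiply v by 2^e using exact power-of-two steps (avoids libm). */
static float scale_pow2(float v, int e)
{
	while (e > 0) { v *= 2.0f; e--; }
	while (e < 0) { v *= 0.5f; e++; }
	return v;
}

/*
 * Decode a binary16 bit pattern into a float.
 * Layout: [15] sign, [14:10] exponent (bias 15), [9:0] mantissa.
 */
static float fp16_bits_to_float(uint16_t h)
{
	int sign = (h >> 15) & 0x1;
	int exp  = (h >> 10) & 0x1f;
	int man  = h & 0x3ff;
	float val;

	if (exp == 0)            /* zero or subnormal: no implicit leading 1 */
		val = scale_pow2((float)man, -24);
	else if (exp == 0x1f)    /* all-ones exponent: infinity or NaN */
		val = man ? NAN : INFINITY;
	else                     /* normal: restore the implicit leading 1 */
		val = scale_pow2((float)(man | 0x400), exp - 25);

	return sign ? -val : val;
}
```

With only 5 exponent bits the largest finite value is 65504 (bit pattern 0x7bff), which is the "magnitude" trade-off the commit message refers to.
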
> 
> A processor supports AVX512 FP16 if CPUID.(EAX=7,ECX=0):EDX[bit 23]
> is set. AVX512 FP16 requires that the AVX512BW feature be implemented,
> since the instructions for manipulating 32-bit masks are associated
> with AVX512BW.
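
The CPUID check described above can be tried from user space. The following is a minimal sketch, assuming GCC or Clang and their `<cpuid.h>` helper `__get_cpuid_count()`; the macro and function names are hypothetical. It reads only the raw CPUID bit and does not check whether the OS has enabled AVX-512 state, which real code would also need to verify.

```c
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* CPUID.(EAX=7,ECX=0):EDX bit 23, as stated in the commit message. */
#define CPUID_7_0_EDX_AVX512_FP16 (1u << 23)

/* Returns true if the CPU reports the AVX512 FP16 bit in CPUID leaf 7. */
static bool cpu_reports_avx512_fp16(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	/* __get_cpuid_count() returns 0 when leaf 7 is not supported. */
	if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
		return false;

	return (edx & CPUID_7_0_EDX_AVX512_FP16) != 0;
#else
	/* Not an x86 build: the feature cannot be present. */
	return false;
#endif
}
```
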
> 
> The only in-kernel usage of this is KVM passthrough. The CPU feature
> flag is shown as "avx512_fp16" in /proc/cpuinfo.
> 
> Signed-off-by: Kyung Min Park <kyung.min.park@...el.com>
> Acked-by: Dave Hansen <dave.hansen@...el.com>
> Reviewed-by: Tony Luck <tony.luck@...el.com>
> ---
>  arch/x86/include/asm/cpufeatures.h | 1 +
>  arch/x86/kernel/cpu/cpuid-deps.c   | 1 +
>  2 files changed, 2 insertions(+)
> 
> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> index b6b9b3407c22..bec37ec7101e 100644
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -375,6 +375,7 @@
>  #define X86_FEATURE_TSXLDTRK		(18*32+16) /* TSX Suspend Load Address Tracking */
>  #define X86_FEATURE_PCONFIG		(18*32+18) /* Intel PCONFIG */
>  #define X86_FEATURE_ARCH_LBR		(18*32+19) /* Intel ARCH LBR */
> +#define X86_FEATURE_AVX512_FP16		(18*32+23) /* AVX512 FP16 */
>  #define X86_FEATURE_SPEC_CTRL		(18*32+26) /* "" Speculation Control (IBRS + IBPB) */
>  #define X86_FEATURE_INTEL_STIBP		(18*32+27) /* "" Single Thread Indirect Branch Predictors */
>  #define X86_FEATURE_FLUSH_L1D		(18*32+28) /* Flush L1D cache */
> diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
> index d502241995a3..42af31b64c2c 100644
> --- a/arch/x86/kernel/cpu/cpuid-deps.c
> +++ b/arch/x86/kernel/cpu/cpuid-deps.c
> @@ -69,6 +69,7 @@ static const struct cpuid_dep cpuid_deps[] = {
>  	{ X86_FEATURE_CQM_MBM_TOTAL,		X86_FEATURE_CQM_LLC   },
>  	{ X86_FEATURE_CQM_MBM_LOCAL,		X86_FEATURE_CQM_LLC   },
>  	{ X86_FEATURE_AVX512_BF16,		X86_FEATURE_AVX512VL  },
> +	{ X86_FEATURE_AVX512_FP16,		X86_FEATURE_AVX512BW  },
>  	{ X86_FEATURE_ENQCMD,			X86_FEATURE_XSAVES    },
>  	{ X86_FEATURE_PER_THREAD_MBA,		X86_FEATURE_MBA       },
>  	{}
> -- 
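
To make the two hunks above easier to read: a feature constant such as (18*32+23) encodes a capability word index and a bit within that word, and each cpuid_deps[] entry pairs a feature with the feature it depends on. The sketch below illustrates both ideas in stand-alone C. All `DEMO_*` and `demo_*` names are hypothetical, and the recursive clearing is a simplification of what the kernel does, not a copy of it.

```c
#include <stddef.h>

/* A feature number is word * 32 + bit, as in cpufeatures.h. */
#define FEATURE(word, bit)  ((word) * 32 + (bit))
#define FEATURE_WORD(f)     ((f) / 32)
#define FEATURE_BIT(f)      ((f) % 32)

#define DEMO_NWORDS 20

enum {
	DEMO_AVX512BW    = FEATURE(9, 30),   /* same encoding as X86_FEATURE_AVX512BW */
	DEMO_AVX512_FP16 = FEATURE(18, 23),  /* value added by this patch */
};

/* Same shape as the table in the patch: feature, then what it depends on. */
struct demo_dep {
	unsigned int feature;
	unsigned int depends;
};

static const struct demo_dep demo_deps[] = {
	{ DEMO_AVX512_FP16, DEMO_AVX512BW },
};

static void demo_set_feature(unsigned int caps[DEMO_NWORDS], unsigned int f)
{
	caps[FEATURE_WORD(f)] |= 1u << FEATURE_BIT(f);
}

static int demo_has_feature(const unsigned int caps[DEMO_NWORDS], unsigned int f)
{
	return (caps[FEATURE_WORD(f)] >> FEATURE_BIT(f)) & 1u;
}

/* Clear a feature and, recursively, every feature that depends on it. */
static void demo_clear_feature(unsigned int caps[DEMO_NWORDS], unsigned int f)
{
	size_t i;

	caps[FEATURE_WORD(f)] &= ~(1u << FEATURE_BIT(f));
	for (i = 0; i < sizeof(demo_deps) / sizeof(demo_deps[0]); i++)
		if (demo_deps[i].depends == f)
			demo_clear_feature(caps, demo_deps[i].feature);
}
```

In this model, clearing DEMO_AVX512BW also clears DEMO_AVX512_FP16, which is the effect the new cpuid_deps[] entry is meant to have when AVX512BW is disabled.
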

Acked-by: Borislav Petkov <bp@...e.de>

Paolo, you can pick those up if you prefer.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette