Message-ID: <20260118003120.GF74518@quark>
Date: Sat, 17 Jan 2026 16:31:20 -0800
From: Eric Biggers <ebiggers@...nel.org>
To: AlanSong-oc <AlanSong-oc@...oxin.com>
Cc: herbert@...dor.apana.org.au, davem@...emloft.net, Jason@...c4.com,
ardb@...nel.org, linux-crypto@...r.kernel.org,
linux-kernel@...r.kernel.org, x86@...nel.org, CobeChen@...oxin.com,
TonyWWang-oc@...oxin.com, YunShen@...oxin.com,
GeorgeXue@...oxin.com, LeoLiu-oc@...oxin.com, HansHu@...oxin.com
Subject: Re: [PATCH v3 2/3] lib/crypto: x86/sha1: PHE Extensions optimized
SHA1 transform function
On Fri, Jan 16, 2026 at 03:15:12PM +0800, AlanSong-oc wrote:
> Zhaoxin CPUs have implemented the SHA(Secure Hash Algorithm) as its CPU
> instructions by PHE(Padlock Hash Engine) Extensions, including XSHA1,
> XSHA256, XSHA384 and XSHA512 instructions.
>
> With the help of implementation of SHA in hardware instead of software,
> can develop applications with higher performance, more security and more
> flexibility.
>
> This patch includes the XSHA1 instruction optimized implementation of
> SHA-1 transform function.
>
> Signed-off-by: AlanSong-oc <AlanSong-oc@...oxin.com>
Please include the information I've asked for (benchmark results, test
results, and link to the specification) directly in the commit message.
> +#if IS_ENABLED(CONFIG_CPU_SUP_ZHAOXIN)
> +#define PHE_ALIGNMENT 16
> +static void sha1_blocks_phe(struct sha1_block_state *state,
> + const u8 *data, size_t nblocks)
The IS_ENABLED(CONFIG_CPU_SUP_ZHAOXIN) should go in the CPU feature
check, so that the code will be parsed regardless of the setting. That
reduces the chance that future changes will cause compilation errors.
> + /*
> + * XSHA1 requires %edi to point to a 32-byte, 16-byte-aligned
> + * buffer on Zhaoxin processors.
> + */
This seems implausible. In 64-bit mode a pointer can't fit in %edi. I
thought you mentioned that this instruction is 64-bit compatible? You
may have meant %rdi.
Interestingly, the spec you provided specifically says the registers
operated on are %eax, %ecx, %esi, and %edi.
So assuming the code works, perhaps both the spec and your code comment
are incorrect?
These errors don't really inspire confidence in this instruction.
> + memcpy(dst, state, SHA1_DIGEST_SIZE);
> + asm volatile(".byte 0xf3,0x0f,0xa6,0xc8"
> + : "+S"(data), "+D"(dst)
> + : "a"((long)-1), "c"(nblocks));
> + memcpy(state, dst, SHA1_DIGEST_SIZE);
Is the reason for using '.byte' that the GNU and clang assemblers don't
implement the mnemonic for this Zhaoxin-specific instruction? The spec
implies that the intended mnemonic is "rep sha1".
If that's correct, could you add a comment like /* rep sha1 */ so that
it's clear what the intended instruction is?
Also, the spec describes all four registers as both input and output
registers. Yet your inline asm marks %rax and %rcx as inputs only.
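To illustrate the constraint issue, here is a minimal sketch that uses "rep movsb" as a stand-in, since XSHA1 only executes on Zhaoxin hardware. Like the instruction as the spec describes it, "rep movsb" modifies its pointer and count registers, so each one is declared as a read-write ("+") operand. The function name copy_bytes() is made up for this example, and the memcpy() fallback is there only so the sketch builds on non-x86 targets.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the XSHA1 sequence. "rep movsb" advances
 * %rsi and %rdi and counts %rcx down to zero, so all three registers
 * are outputs as well as inputs and must use "+" constraints. With a
 * plain input constraint the compiler would be free to assume the
 * register still holds its original value afterwards.
 */
static void copy_bytes(void *dst, const void *src, size_t len)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
	__asm__ __volatile__("rep movsb"
			     : "+D"(dst), "+S"(src), "+c"(len)
			     : /* no input-only operands */
			     : "memory"); /* the asm reads and writes memory */
#else
	memcpy(dst, src, len); /* portable fallback, not part of the point */
#endif
}
```

If the spec's description is accurate, the equivalent change in the patch would presumably be to pass %rax and %rcx through local variables bound with "+a" and "+c" instead of "a" and "c".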
> @@ -59,6 +79,11 @@ static void sha1_mod_init_arch(void)
> {
> if (boot_cpu_has(X86_FEATURE_SHA_NI)) {
> static_call_update(sha1_blocks_x86, sha1_blocks_ni);
> +#if IS_ENABLED(CONFIG_CPU_SUP_ZHAOXIN)
> + } else if (boot_cpu_has(X86_FEATURE_PHE_EN)) {
> + if (boot_cpu_data.x86 >= 0x07)
> + static_call_update(sha1_blocks_x86, sha1_blocks_phe);
> +#endif
I think it should be:
    } else if (IS_ENABLED(CONFIG_CPU_SUP_ZHAOXIN) &&
               boot_cpu_has(X86_FEATURE_PHE_EN) &&
               boot_cpu_data.x86 >= 0x07) {
            static_call_update(sha1_blocks_x86, sha1_blocks_phe);
... so (a) the code will be parsed even when !CONFIG_CPU_SUP_ZHAOXIN,
and (b) the other optimized functions won't be unnecessarily disabled
when boot_cpu_has(X86_FEATURE_PHE_EN) && boot_cpu_data.x86 < 0x07.
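Point (b) can be seen in a small userspace sketch. Everything in it is a hypothetical stand-in: struct cpu, the enum values, and DEMO_ZHAOXIN_ENABLED (playing the role of IS_ENABLED(CONFIG_CPU_SUP_ZHAOXIN)) are invented for illustration, and the AVX2 branch is only an assumed example of a later branch in the chain.

```c
#include <stdbool.h>

/* Hypothetical compile-time switch standing in for IS_ENABLED(...). */
#define DEMO_ZHAOXIN_ENABLED 1

enum impl { IMPL_GENERIC, IMPL_AVX2, IMPL_PHE, IMPL_NI };

/* Hypothetical summary of the CPU features the chain looks at. */
struct cpu {
	bool sha_ni;
	bool phe_en;
	bool avx2;
	int family;
};

/* Nested form: once the PHE branch is entered, later branches are skipped. */
static enum impl pick_nested(const struct cpu *c)
{
	enum impl impl = IMPL_GENERIC;

	if (c->sha_ni) {
		impl = IMPL_NI;
	} else if (c->phe_en) {
		if (c->family >= 0x07)
			impl = IMPL_PHE;
	} else if (c->avx2) {
		impl = IMPL_AVX2;
	}
	return impl;
}

/* Flat form: a PHE CPU with family < 7 still reaches the later branches. */
static enum impl pick_flat(const struct cpu *c)
{
	enum impl impl = IMPL_GENERIC;

	if (c->sha_ni) {
		impl = IMPL_NI;
	} else if (DEMO_ZHAOXIN_ENABLED && c->phe_en && c->family >= 0x07) {
		impl = IMPL_PHE;
	} else if (c->avx2) {
		impl = IMPL_AVX2;
	}
	return impl;
}
```

For a CPU with phe_en set, avx2 set, and family 6, pick_nested() returns IMPL_GENERIC while pick_flat() returns IMPL_AVX2. Because DEMO_ZHAOXIN_ENABLED is an ordinary constant expression instead of an #if, the PHE branch is always compiled and type-checked, and the compiler can drop it as dead code when the constant is 0.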
As before, all these comments apply to the SHA-256 patch too.
- Eric