Message-ID: <20260116071513.12134-1-AlanSong-oc@zhaoxin.com>
Date: Fri, 16 Jan 2026 15:15:10 +0800
From: AlanSong-oc <AlanSong-oc@...oxin.com>
To: <herbert@...dor.apana.org.au>, <davem@...emloft.net>,
	<ebiggers@...nel.org>, <Jason@...c4.com>, <ardb@...nel.org>,
	<linux-crypto@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
	<x86@...nel.org>
CC: <CobeChen@...oxin.com>, <TonyWWang-oc@...oxin.com>, <YunShen@...oxin.com>,
	<GeorgeXue@...oxin.com>, <LeoLiu-oc@...oxin.com>, <HansHu@...oxin.com>,
	AlanSong-oc <AlanSong-oc@...oxin.com>
Subject: [PATCH v3 0/3] lib/crypto: x86/sha: Add PHE Extensions support

This series adds PHE Extensions optimized SHA1 and SHA256 transform
functions for Zhaoxin processors to lib/crypto, and disables the
padlock-sha driver on Zhaoxin platforms due to self-test failures.

The table below shows benchmark results before and after this patch
series, measured with CRYPTO_LIB_BENCHMARK on a Zhaoxin KX-7000
platform. The speedup for each length is given in parentheses.

+---------+-------------------------+--------------------------+
|         |         SHA1            |          SHA256          |
+---------+--------+----------------+--------+-----------------+
|   Len   | Before |      After     | Before |      After      |
+---------+--------+----------------+--------+-----------------+
|      1* |    3** |    8 (2.67x)   |    2   |    7 (3.50x)    |
|     16  |   52   |  125 (2.40x)   |   35   |  119 (3.40x)    |
|     64  |  114   |  318 (2.79x)   |   74   |  280 (3.78x)    |
|    127  |  154   |  440 (2.86x)   |   99   |  387 (3.91x)    |
|    128  |  160   |  492 (3.08x)   |  103   |  427 (4.15x)    |
|    200  |  189   |  605 (3.20x)   |  123   |  537 (4.37x)    |
|    256  |  199   |  676 (3.40x)   |  128   |  582 (4.55x)    |
|    511  |  223   |  794 (3.56x)   |  144   |  679 (4.72x)    |
|    512  |  225   |  833 (3.70x)   |  146   |  714 (4.89x)    |
|   1024  |  243   |  941 (3.87x)   |  157   |  796 (5.07x)    |
|   3173  |  259   | 1044 (4.03x)   |  167   |  883 (5.28x)    |
|   4096  |  257   | 1044 (4.06x)   |  166   |  876 (5.28x)    |
|  16384  |  261   | 1073 (4.11x)   |  169   |  899 (5.32x)    |
+---------+--------+----------------+--------+-----------------+
*: Length in bytes of each data block processed by one complete SHA
   sequence.
**: Throughput in MB/s, as reported by benchmark_hash.
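
For readers who want to recheck the ratios in parentheses, the arithmetic
is just after/before rounded to two decimals. The following is a minimal
userspace sketch, not part of the series; the helper names speedup_x100()
and print_speedup() are made up for illustration. Because the table lists
throughputs as whole numbers, a recomputed ratio can differ from the
printed one in the last digit.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Speedup of `after` over `before`, expressed in hundredths
 * (267 means 2.67x) and rounded to nearest. Both arguments are
 * throughputs in the same unit. Hypothetical helper, not kernel code.
 */
unsigned int speedup_x100(unsigned int before, unsigned int after)
{
	assert(before != 0);	/* a zero baseline has no defined speedup */
	return (after * 100u + before / 2u) / before;
}

/* Print one row in the "N (R.RRx)" style used by the table. */
void print_speedup(unsigned int len, unsigned int before, unsigned int after)
{
	unsigned int s = speedup_x100(before, after);

	printf("len=%u: %u -> %u (%u.%02ux)\n",
	       len, before, after, s / 100, s % 100);
}
```

Calling print_speedup(16, 52, 125) reproduces the SHA1 entry for len=16
from the before/after columns.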

With this patch series applied, the KUnit test suites for SHA1 and
SHA256 pass on Zhaoxin platforms. The detailed test logs follow:

[    5.993700]     # Subtest: sha1
[    5.996813]     # module: sha1_kunit
[    5.996814]     1..11
[    6.003399]     ok 1 test_hash_test_vectors
[    6.012489]     ok 2 test_hash_all_lens_up_to_4096
[    6.028511]     ok 3 test_hash_incremental_updates
[    6.035766]     ok 4 test_hash_buffer_overruns
[    6.043445]     ok 5 test_hash_overlaps
[    6.050315]     ok 6 test_hash_alignment_consistency
[    6.054994]     ok 7 test_hash_ctx_zeroization
[    6.127778]     ok 8 test_hash_interrupt_context_1
[    6.774847]     ok 9 test_hash_interrupt_context_2
[    6.810745]     ok 10 test_hmac
[    6.835169]     # benchmark_hash: len=1: 8 MB/s
[    6.847167]     # benchmark_hash: len=16: 125 MB/s
[    6.862114]     # benchmark_hash: len=64: 318 MB/s
[    6.878173]     # benchmark_hash: len=127: 440 MB/s
[    6.893081]     # benchmark_hash: len=128: 492 MB/s
[    6.907976]     # benchmark_hash: len=200: 605 MB/s
[    6.922658]     # benchmark_hash: len=256: 676 MB/s
[    6.937558]     # benchmark_hash: len=511: 794 MB/s
[    6.951994]     # benchmark_hash: len=512: 833 MB/s
[    6.966262]     # benchmark_hash: len=1024: 941 MB/s
[    6.980295]     # benchmark_hash: len=3173: 1044 MB/s
[    6.994494]     # benchmark_hash: len=4096: 1044 MB/s
[    7.008728]     # benchmark_hash: len=16384: 1073 MB/s
[    7.014515]     ok 11 benchmark_hash
[    7.019628] # sha1: pass:11 fail:0 skip:0 total:11
[    7.023170] # Totals: pass:11 fail:0 skip:0 total:11
[    7.027916] ok 5 sha1

[    7.767257]     # Subtest: sha256
[    7.770542]     # module: sha256_kunit
[    7.770544]     1..15
[    7.777383]     ok 1 test_hash_test_vectors
[    7.788563]     ok 2 test_hash_all_lens_up_to_4096
[    7.806090]     ok 3 test_hash_incremental_updates
[    7.813553]     ok 4 test_hash_buffer_overruns
[    7.822384]     ok 5 test_hash_overlaps
[    7.829388]     ok 6 test_hash_alignment_consistency
[    7.833843]     ok 7 test_hash_ctx_zeroization
[    7.915191]     ok 8 test_hash_interrupt_context_1
[    8.362312]     ok 9 test_hash_interrupt_context_2
[    8.401607]     ok 10 test_hmac
[    8.415458]     ok 11 test_sha256_finup_2x
[    8.419397]     ok 12 test_sha256_finup_2x_defaultctx
[    8.424107]     ok 13 test_sha256_finup_2x_hugelen
[    8.451289]     # benchmark_hash: len=1: 7 MB/s
[    8.465372]     # benchmark_hash: len=16: 119 MB/s
[    8.481760]     # benchmark_hash: len=64: 280 MB/s
[    8.499344]     # benchmark_hash: len=127: 387 MB/s
[    8.515800]     # benchmark_hash: len=128: 427 MB/s
[    8.531970]     # benchmark_hash: len=200: 537 MB/s
[    8.548241]     # benchmark_hash: len=256: 582 MB/s
[    8.564838]     # benchmark_hash: len=511: 679 MB/s
[    8.580872]     # benchmark_hash: len=512: 714 MB/s
[    8.596858]     # benchmark_hash: len=1024: 796 MB/s
[    8.612567]     # benchmark_hash: len=3173: 883 MB/s
[    8.628546]     # benchmark_hash: len=4096: 876 MB/s
[    8.644482]     # benchmark_hash: len=16384: 899 MB/s
[    8.649773]     ok 14 benchmark_hash
[    8.655505]     ok 15 benchmark_sha256_finup_2x # SKIP not relevant
[    8.659065] # sha256: pass:14 fail:0 skip:1 total:15
[    8.665276] # Totals: pass:14 fail:0 skip:1 total:15
[    8.670195] ok 7 sha256

Changes in v3:
- Implement PHE Extensions optimized SHA1 and SHA256 transform functions
  using inline assembly instead of separate assembly files
- Eliminate unnecessary casts
- Add CONFIG_CPU_SUP_ZHAOXIN check to compile out the code when disabled
- Use 'boot_cpu_data.x86' to identify the CPU family instead of
  'cpu_data(0).x86'
- Only check X86_FEATURE_PHE_EN for CPU support, consistent with other
  CPU feature checks
- Disable the padlock-sha driver on Zhaoxin processors with CPU family
  0x07 and newer
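
The gating described in the v3 items above can be modelled in userspace
roughly as follows. This is only a sketch under stated assumptions, not
the code from the patches: struct cpu_model, enum cpu_vendor, the
CPU_SUP_ZHAOXIN macro and both function names are hypothetical stand-ins
for boot_cpu_data, CONFIG_CPU_SUP_ZHAOXIN and X86_FEATURE_PHE_EN. Whether
the real lib/crypto code applies further conditions (for example on the
CPU family) is not stated in this cover letter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few CPU properties the changelog names. */
enum cpu_vendor { VENDOR_OTHER, VENDOR_CENTAUR, VENDOR_ZHAOXIN };

struct cpu_model {
	enum cpu_vendor vendor;
	unsigned int family;	/* models boot_cpu_data.x86 */
	bool phe_en;		/* models X86_FEATURE_PHE_EN */
};

/* Models CONFIG_CPU_SUP_ZHAOXIN; set to 0 to "compile out" the PHE path. */
#ifndef CPU_SUP_ZHAOXIN
#define CPU_SUP_ZHAOXIN 1
#endif

/* padlock-sha is skipped on Zhaoxin CPUs with family 0x07 and newer. */
bool padlock_sha_should_skip(const struct cpu_model *c)
{
	assert(c != NULL);
	return c->vendor == VENDOR_ZHAOXIN && c->family >= 0x07;
}

/*
 * PHE path in this model: unavailable when Zhaoxin support is configured
 * out, otherwise gated on the single PHE_EN feature bit.
 */
bool phe_sha_usable(const struct cpu_model *c)
{
	assert(c != NULL);
	if (!CPU_SUP_ZHAOXIN)
		return false;
	return c->phe_en;
}
```

In this model a Zhaoxin family 0x07 CPU with PHE_EN set skips padlock-sha
and takes the PHE path, while CPUs from other vendors are left alone by
the padlock-sha check.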

Changes in v2:
- Add Zhaoxin support to lib/crypto instead of extending the existing
  padlock-sha driver

AlanSong-oc (3):
  crypto: padlock-sha - Disable for Zhaoxin processor
  lib/crypto: x86/sha1: PHE Extensions optimized SHA1 transform function
  lib/crypto: x86/sha256: PHE Extensions optimized SHA256 transform
    function

 drivers/crypto/padlock-sha.c |  7 +++++++
 lib/crypto/x86/sha1.h        | 25 +++++++++++++++++++++++++
 lib/crypto/x86/sha256.h      | 25 +++++++++++++++++++++++++
 3 files changed, 57 insertions(+)

-- 
2.34.1

