lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <46fbbc81-d8c5-fd97-fd4f-71dd96c2c522@linux.alibaba.com>
Date:   Wed, 30 Jun 2021 20:37:22 +0800
From:   Tianjia Zhang <tianjia.zhang@...ux.alibaba.com>
To:     Herbert Xu <herbert@...dor.apana.org.au>,
        "David S. Miller" <davem@...emloft.net>,
        Eric Biggers <ebiggers@...gle.com>,
        Eric Biggers <ebiggers@...nel.org>,
        Gilad Ben-Yossef <gilad@...yossef.com>,
        Ard Biesheuvel <ardb@...nel.org>,
        "Markku-Juhani O . Saarinen" <mjos@....fi>,
        Jussi Kivilinna <jussi.kivilinna@....fi>,
        Catalin Marinas <catalin.marinas@....com>,
        Will Deacon <will@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        "H. Peter Anvin" <hpa@...or.com>, x86@...nel.org,
        linux-crypto@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
        linux-kernel@...r.kernel.org,
        Jia Zhang <zhang.jia@...ux.alibaba.com>,
        "YiLin . Li" <YiLin.Li@...ux.alibaba.com>
Subject: Re: [PATCH v2 0/4] Introduce x86 assembler accelerated implementation
 for SM4 algorithm

Hi,

Any comment?

Cheers,
Tianjia

On 6/24/21 4:08 PM, Tianjia Zhang wrote:
> This patchset extracts the public SM4 algorithm as a separate library,
> At the same time, the acceleration implementation of SM4 in arm64 was
> adjusted to adapt to this SM4 library. Then introduces an accelerated
> implementation of the instruction set on x86.
> 
> This optimization supports the four modes of SM4, ECB, CBC, CFB, and
> CTR. Since CBC and CFB do not support multiple block parallel
> encryption, the optimization effect is not obvious. And all selftests
> have passed already.
> 
> The main algorithm implementation comes from SM4 AES-NI work by
> libgcrypt and Markku-Juhani O. Saarinen at:
> https://github.com/mjosaarinen/sm4ni
> 
> Benchmark on Intel Xeon Cascadelake, the data comes from the mode 218
> and mode 518 of tcrypt. The abscissas are blocks of different lengths.
> The data is tabulated and the unit is Mb/s:
> 
> sm4-generic   |    16      64     128     256    1024    1420    4096
>        ECB enc | 40.99   46.50   48.05   48.41   49.20   49.25   49.28
>        ECB dec | 41.07   46.99   48.15   48.67   49.20   49.25   49.29
>        CBC enc | 37.71   45.28   46.77   47.60   48.32   48.37   48.40
>        CBC dec | 36.48   44.82   46.43   47.45   48.23   48.30   48.36
>        CFB enc | 37.94   44.84   46.12   46.94   47.57   47.46   47.68
>        CFB dec | 37.50   42.84   43.74   44.37   44.85   44.80   44.96
>        CTR enc | 39.20   45.63   46.75   47.49   48.09   47.85   48.08
>        CTR dec | 39.64   45.70   46.72   47.47   47.98   47.88   48.06
> sm4-aesni-avx
>        ECB enc | 33.75  134.47  221.64  243.43  264.05  251.58  258.13
>        ECB dec | 34.02  134.92  223.11  245.14  264.12  251.04  258.33
>        CBC enc | 38.85   46.18   47.67   48.34   49.00   48.96   49.14
>        CBC dec | 33.54  131.29  223.88  245.27  265.50  252.41  263.78
>        CFB enc | 38.70   46.10   47.58   48.29   49.01   48.94   49.19
>        CFB dec | 32.79  128.40  223.23  244.87  265.77  253.31  262.79
>        CTR enc | 32.58  122.23  220.29  241.16  259.57  248.32  256.69
>        CTR dec | 32.81  122.47  218.99  241.54  258.42  248.58  256.61
> 
> ---
> v2 changes:
>    * SM4 library functions use "sm4_" prefix instead of "crypto_" prefix
>    * sm4-aesni-avx supports accelerated implementation of four specific modes
>    * tcrypt benchmark supports sm4-aesni-avx
>    * fixes of other reviews
> 
> Tianjia Zhang (4):
>    crypto: sm4 - create SM4 library based on sm4 generic code
>    crypto: arm64/sm4-ce - Make dependent on sm4 library instead of
>      sm4-generic
>    crypto: x86/sm4 - add AES-NI/AVX/x86_64 implementation
>    crypto: tcrypt - add the asynchronous speed test for SM4
> 
>   arch/arm64/crypto/Kconfig              |   2 +-
>   arch/arm64/crypto/sm4-ce-glue.c        |  20 +-
>   arch/x86/crypto/Makefile               |   3 +
>   arch/x86/crypto/sm4-aesni-avx-asm_64.S | 684 +++++++++++++++++++++++++
>   arch/x86/crypto/sm4_aesni_avx_glue.c   | 537 +++++++++++++++++++
>   crypto/Kconfig                         |  22 +
>   crypto/sm4_generic.c                   | 180 +------
>   crypto/tcrypt.c                        |  26 +-
>   include/crypto/sm4.h                   |  29 +-
>   lib/crypto/Kconfig                     |   3 +
>   lib/crypto/Makefile                    |   3 +
>   lib/crypto/sm4.c                       | 184 +++++++
>   12 files changed, 1515 insertions(+), 178 deletions(-)
>   create mode 100644 arch/x86/crypto/sm4-aesni-avx-asm_64.S
>   create mode 100644 arch/x86/crypto/sm4_aesni_avx_glue.c
>   create mode 100644 lib/crypto/sm4.c
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ