Message-ID: <CAMj1kXEDJ=c-OkTDOu=5o+ic8LXpWA6R2zMBsngFSpiyGB--Ww@mail.gmail.com>
Date: Tue, 14 Dec 2021 16:59:00 +0100
From: Ard Biesheuvel <ardb@...nel.org>
To: Xiaokang Qian <Xiaokang.Qian@....com>
Cc: Will Deacon <will@...nel.org>, Eric Biggers <ebiggers@...nel.org>,
Herbert Xu <herbert@...dor.apana.org.au>,
"David S. Miller" <davem@...emloft.net>,
Catalin Marinas <Catalin.Marinas@....com>, nd <nd@....com>,
Linux Crypto Mailing List <linux-crypto@...r.kernel.org>,
Linux ARM <linux-arm-kernel@...ts.infradead.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] crypto: arm64/gcm-ce - unroll factors to 4-way interleave
of aes and ghash
On Tue, 14 Dec 2021 at 02:40, Xiaokang Qian <Xiaokang.Qian@....com> wrote:
>
> Hi Will,
> I will post the updated version 2 of this patch today or tomorrow.
> Sorry for the delay.
>
Great, but please make sure you run the extended self-tests
(CONFIG_CRYPTO_MANAGER_EXTRA_TESTS=y).
I applied this version of the patch to measure the performance delta
between the old and the new code on ThunderX2 (TX2), but it failed the
self-test:
[ 0.592203] alg: aead: gcm-aes-ce decryption unexpectedly succeeded on test vector "random: alen=91 plen=5326 authsize=16 klen=32 novrfy=1"; expected_error=-EBADMSG, cfg="random: inplace use_finup src_divs=[100.0%@...79] key_offset=43"
It's non-deterministic, though, so it may take a few attempts to reproduce it.
As for the performance delta, your code is 18% slower on TX2 for 1420
byte packets using AES-256 (and 9% slower on AES-192). In your
results, AES-256 does not outperform the old code as much as it does
with smaller key sizes either.
Is this something that can be solved? If not, the numbers are not as
appealing, to be honest, given the substantial performance regressions
on the other micro-architecture.
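For reference, the slowdown figures can be approximately reproduced from
the 1420-byte rows of the tcrypt output at the end of this mail. This is
only a minimal sketch, assuming the delta is taken relative to the old
code's operations per second; a different baseline or rounding gives
slightly different percentages. The helper name slowdown_pct is
illustrative and not part of tcrypt.

```python
# Operations per second for the 1420-byte tests (tests 5, 13 and 21),
# copied from the OLD CODE / NEW CODE tcrypt output in this mail.
OLD_OPS = {128: 490188, 192: 443363, 256: 401451}
NEW_OPS = {128: 462129, 192: 405749, 256: 331558}


def slowdown_pct(old_ops, new_ops):
    """Throughput drop of the new code, as a percentage of the old code.

    Assumption: the old code is the baseline. A positive result means
    the new code is slower; a negative result means it is faster.
    """
    return (old_ops - new_ops) / old_ops * 100.0


for key_bits in sorted(OLD_OPS):
    pct = slowdown_pct(OLD_OPS[key_bits], NEW_OPS[key_bits])
    print(f"AES-{key_bits}, 1420 byte blocks: {pct:.1f}% slower")
```

The same helper can be applied to any other pair of rows, for example the
4096 and 8192 byte tests, to see how the delta changes with block size.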
--
Ard.
Tcrypt output follows
OLD CODE
testing speed of gcm(aes) (gcm-aes-ce) encryption
test 0 (128 bit key, 16 byte blocks): 2023626 operations in 1 seconds (32378016 bytes)
test 1 (128 bit key, 64 byte blocks): 2005175 operations in 1 seconds (128331200 bytes)
test 2 (128 bit key, 256 byte blocks): 1408367 operations in 1 seconds (360541952 bytes)
test 3 (128 bit key, 512 byte blocks): 1011877 operations in 1 seconds (518081024 bytes)
test 4 (128 bit key, 1024 byte blocks): 646552 operations in 1 seconds (662069248 bytes)
test 5 (128 bit key, 1420 byte blocks): 490188 operations in 1 seconds (696066960 bytes)
test 6 (128 bit key, 4096 byte blocks): 204423 operations in 1 seconds (837316608 bytes)
test 7 (128 bit key, 8192 byte blocks): 105149 operations in 1 seconds (861380608 bytes)
test 8 (192 bit key, 16 byte blocks): 1924506 operations in 1 seconds (30792096 bytes)
test 9 (192 bit key, 64 byte blocks): 1944413 operations in 1 seconds (124442432 bytes)
test 10 (192 bit key, 256 byte blocks): 1337001 operations in 1 seconds (342272256 bytes)
test 11 (192 bit key, 512 byte blocks): 941146 operations in 1 seconds (481866752 bytes)
test 12 (192 bit key, 1024 byte blocks): 590614 operations in 1 seconds (604788736 bytes)
test 13 (192 bit key, 1420 byte blocks): 443363 operations in 1 seconds (629575460 bytes)
test 14 (192 bit key, 4096 byte blocks): 182890 operations in 1 seconds (749117440 bytes)
test 15 (192 bit key, 8192 byte blocks): 93813 operations in 1 seconds (768516096 bytes)
test 16 (256 bit key, 16 byte blocks): 1886970 operations in 1 seconds (30191520 bytes)
test 17 (256 bit key, 64 byte blocks): 1893574 operations in 1 seconds (121188736 bytes)
test 18 (256 bit key, 256 byte blocks): 1245478 operations in 1 seconds (318842368 bytes)
test 19 (256 bit key, 512 byte blocks): 865507 operations in 1 seconds (443139584 bytes)
test 20 (256 bit key, 1024 byte blocks): 537822 operations in 1 seconds (550729728 bytes)
test 21 (256 bit key, 1420 byte blocks): 401451 operations in 1 seconds (570060420 bytes)
test 22 (256 bit key, 4096 byte blocks): 164378 operations in 1 seconds (673292288 bytes)
test 23 (256 bit key, 8192 byte blocks): 84205 operations in 1 seconds (689807360 bytes)
NEW CODE
testing speed of gcm(aes) (gcm-aes-ce) encryption
test 0 (128 bit key, 16 byte blocks): 1894587 operations in 1 seconds (30313392 bytes)
test 1 (128 bit key, 64 byte blocks): 1910971 operations in 1 seconds (122302144 bytes)
test 2 (128 bit key, 256 byte blocks): 1360037 operations in 1 seconds (348169472 bytes)
test 3 (128 bit key, 512 byte blocks): 985577 operations in 1 seconds (504615424 bytes)
test 4 (128 bit key, 1024 byte blocks): 569656 operations in 1 seconds (583327744 bytes)
test 5 (128 bit key, 1420 byte blocks): 462129 operations in 1 seconds (656223180 bytes)
test 6 (128 bit key, 4096 byte blocks): 215284 operations in 1 seconds (881803264 bytes)
test 7 (128 bit key, 8192 byte blocks): 115459 operations in 1 seconds (945840128 bytes)
test 8 (192 bit key, 16 byte blocks): 1825915 operations in 1 seconds (29214640 bytes)
test 9 (192 bit key, 64 byte blocks): 1836850 operations in 1 seconds (117558400 bytes)
test 10 (192 bit key, 256 byte blocks): 1281626 operations in 1 seconds (328096256 bytes)
test 11 (192 bit key, 512 byte blocks): 913114 operations in 1 seconds (467514368 bytes)
test 12 (192 bit key, 1024 byte blocks): 504804 operations in 1 seconds (516919296 bytes)
test 13 (192 bit key, 1420 byte blocks): 405749 operations in 1 seconds (576163580 bytes)
test 14 (192 bit key, 4096 byte blocks): 183999 operations in 1 seconds (753659904 bytes)
test 15 (192 bit key, 8192 byte blocks): 97914 operations in 1 seconds (802111488 bytes)
test 16 (256 bit key, 16 byte blocks): 1776659 operations in 1 seconds (28426544 bytes)
test 17 (256 bit key, 64 byte blocks): 1781110 operations in 1 seconds (113991040 bytes)
test 18 (256 bit key, 256 byte blocks): 1206511 operations in 1 seconds (308866816 bytes)
test 19 (256 bit key, 512 byte blocks): 846284 operations in 1 seconds (433297408 bytes)
test 20 (256 bit key, 1024 byte blocks): 424405 operations in 1 seconds (434590720 bytes)
test 21 (256 bit key, 1420 byte blocks): 331558 operations in 1 seconds (470812360 bytes)
test 22 (256 bit key, 4096 byte blocks): 143821 operations in 1 seconds (589090816 bytes)
test 23 (256 bit key, 8192 byte blocks): 75641 operations in 1 seconds (619651072 bytes)