Message-ID: <20250206073948.181792-1-ebiggers@kernel.org>
Date: Wed, 5 Feb 2025 23:39:42 -0800
From: Eric Biggers <ebiggers@...nel.org>
To: linux-kernel@...r.kernel.org
Cc: linux-crypto@...r.kernel.org,
x86@...nel.org,
linux-block@...r.kernel.org,
Ard Biesheuvel <ardb@...nel.org>,
Keith Busch <kbusch@...nel.org>,
Kent Overstreet <kent.overstreet@...ux.dev>,
"Martin K . Petersen" <martin.petersen@...cle.com>
Subject: [PATCH v3 0/6] x86 CRC optimizations

This patchset applies to the crc tree and is also available at:

    git fetch https://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux.git crc-x86-v3

This series replaces the existing x86 PCLMULQDQ optimized CRC code with
new code that is shared among the different CRC variants and also adds
VPCLMULQDQ support, greatly improving performance on recent CPUs. The
last patch wires up the same optimization to crc64_be() and crc64_nvme()
(a.k.a. the old "crc64_rocksoft") which previously were unoptimized,
improving the performance of those CRC functions by as much as 100x.
crc64_be is used by bcachefs, and crc64_nvme is used by blk-integrity.
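
For readers unfamiliar with the two CRC64 variants named above, here is a minimal bit-at-a-time sketch of what they compute. It is a slow reference model for illustration only, not code from this series. The function names are hypothetical, and the polynomials and the all-ones pre/post-inversion for the NVMe variant are assumptions taken from the commonly published CRC-64/ECMA-182 and CRC-64/NVME parameters.

```python
# Bit-at-a-time reference CRC-64 routines (illustrative only; not the
# kernel implementation and far slower than the [V]PCLMULQDQ code).

MASK64 = (1 << 64) - 1

# Generator polynomials with the x^64 term omitted (assumed parameters).
CRC64_BE_POLY = 0x42F0E1EBA9EA3693    # ECMA-182, msb-first representation
CRC64_NVME_POLY = 0x9A6C9329AC4BC9B5  # NVMe, lsb-first (bit-reflected)


def crc64_msb_first(crc: int, data: bytes, poly: int = CRC64_BE_POLY) -> int:
    """Update an msb-first CRC-64, processing one bit per iteration."""
    for byte in data:
        crc ^= byte << 56  # feed the byte into the top of the register
        for _ in range(8):
            if crc & (1 << 63):
                crc = ((crc << 1) ^ poly) & MASK64
            else:
                crc = (crc << 1) & MASK64
    return crc


def crc64_lsb_first(crc: int, data: bytes, poly: int = CRC64_NVME_POLY) -> int:
    """Update an lsb-first CRC-64, processing one bit per iteration."""
    for byte in data:
        crc ^= byte  # feed the byte into the bottom of the register
        for _ in range(8):
            crc = (crc >> 1) ^ (poly if crc & 1 else 0)
    return crc


def crc64_nvme_full(data: bytes) -> int:
    """CRC-64/NVME of a whole message: all-ones initial value, inverted output."""
    return crc64_lsb_first(MASK64, data) ^ MASK64


if __name__ == "__main__":
    msg = b"123456789"
    print(hex(crc64_msb_first(0, msg)))
    print(hex(crc64_nvme_full(msg)))
```

The only structural difference between the two loops is the shift direction, which is the msb-first versus lsb-first distinction mentioned in the changelog.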

Changed in v3:
  - It's back to just the x86 patches now, since I've applied the CRC64
    library rework patches.
  - Added review and ack tags.
  - Made more improvements to crc-pclmul-template.S and gen-crc-consts.py,
    such as improving the comments that explain some of the steps,
    tweaking the exact choice of constants in certain cases where more
    than one is equivalent, sharing a bit more of the source code between
    lsb and msb-first CRCs, and eliminating an unnecessary instruction.
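
As background on the constants mentioned in the last item: carryless-multiplication CRC code generally relies on "folding", where data sitting n bits ahead is multiplied by x^n mod G and XORed into later data without changing the final remainder. The sketch below shows that identity in plain Python. It is not taken from gen-crc-consts.py; the helper names are hypothetical, it uses natural (msb-first) bit order only, and the fold distance of 512 bits is an arbitrary choice for the example.

```python
# Sketch of the folding identity behind carryless-multiplication CRCs.
# Polynomials over GF(2) are stored as Python ints: bit i is the
# coefficient of x^i. All names here are illustrative.

def clmul(a: int, b: int) -> int:
    """Carryless (GF(2) polynomial) multiplication of a and b."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def polymod(a: int, g: int) -> int:
    """Remainder of polynomial a divided by polynomial g over GF(2)."""
    glen = g.bit_length()
    while a.bit_length() >= glen:
        a ^= g << (a.bit_length() - glen)
    return a


def x_pow_mod(n: int, g: int) -> int:
    """Compute x^n mod g, the kind of constant a fold step multiplies by."""
    return polymod(1 << n, g)


def fold(hi: int, lo: int, n: int, g: int) -> int:
    """Fold 'hi' (located n bits above 'lo') into 'lo'.

    The result is congruent to hi * x^n + lo modulo g, but is much
    shorter than the unfolded value.
    """
    return clmul(hi, x_pow_mod(n, g)) ^ lo


if __name__ == "__main__":
    # Full generator polynomial, including the x^64 term (assumed ECMA-182).
    G = (1 << 64) | 0x42F0E1EBA9EA3693
    hi = 0x0123456789ABCDEFFEDCBA9876543210
    lo = 0x0F1E2D3C4B5A69788796A5B4C3D2E1F0
    n = 512
    direct = polymod((hi << n) ^ lo, G)
    folded = polymod(fold(hi, lo, n, G), G)
    print(direct == folded)
```

Because any value congruent to x^n modulo G works as the multiplier, more than one constant can be equivalent for a given fold step.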

Changed in v2:
  - Rebased onto upstream
  - Added CRC64 library rework patches
  - Capitalized YMM and ZMM
  - Moved gen-crc-consts.py from scripts/crc/ to just scripts/
  - Renamed crc-pclmul-template-glue.h to just crc-pclmul-template.h
  - The asm functions that use longer vectors no longer tail-call the ones
    that use shorter vectors in order to handle short lengths. Each
    function now handles all lengths >= 16 bytes directly.
  - Made various other improvements to crc-pclmul-template.S and
    gen-crc-consts.py
  - It's 2025 now; updated the copyright statements
  - Improved commit messages
  - Added ack tags

Eric Biggers (6):
  x86: move ZMM exclusion list into CPU feature flag
  scripts/gen-crc-consts: add gen-crc-consts.py
  x86/crc: add "template" for [V]PCLMULQDQ based CRC functions
  x86/crc32: implement crc32_le using new template
  x86/crc-t10dif: implement crc_t10dif using new template
  x86/crc64: implement crc64_be and crc64_nvme using new template

 MAINTAINERS                         |   1 +
 arch/x86/Kconfig                    |   3 +-
 arch/x86/crypto/aesni-intel_glue.c  |  22 +-
 arch/x86/include/asm/cpufeatures.h  |   1 +
 arch/x86/kernel/cpu/intel.c         |  22 ++
 arch/x86/lib/Makefile               |   5 +-
 arch/x86/lib/crc-pclmul-consts.h    | 195 ++++++++++
 arch/x86/lib/crc-pclmul-template.S  | 584 ++++++++++++++++++++++++++++
 arch/x86/lib/crc-pclmul-template.h  |  81 ++++
 arch/x86/lib/crc-t10dif-glue.c      |  23 +-
 arch/x86/lib/crc16-msb-pclmul.S     |   6 +
 arch/x86/lib/crc32-glue.c           |  37 +-
 arch/x86/lib/crc32-pclmul.S         | 219 +----------
 arch/x86/lib/crc64-glue.c           |  50 +++
 arch/x86/lib/crc64-pclmul.S         |   7 +
 arch/x86/lib/crct10dif-pcl-asm_64.S | 332 ----------------
 scripts/gen-crc-consts.py           | 239 ++++++++++++
 17 files changed, 1214 insertions(+), 613 deletions(-)
 create mode 100644 arch/x86/lib/crc-pclmul-consts.h
 create mode 100644 arch/x86/lib/crc-pclmul-template.S
 create mode 100644 arch/x86/lib/crc-pclmul-template.h
 create mode 100644 arch/x86/lib/crc16-msb-pclmul.S
 create mode 100644 arch/x86/lib/crc64-glue.c
 create mode 100644 arch/x86/lib/crc64-pclmul.S
 delete mode 100644 arch/x86/lib/crct10dif-pcl-asm_64.S
 create mode 100755 scripts/gen-crc-consts.py

base-commit: 5b793bbee96c666ca14db8409509abd73a3e0130
--
2.48.1