Message-Id: <20221219220223.3982176-1-elliott@hpe.com>
Date:   Mon, 19 Dec 2022 16:02:10 -0600
From:   Robert Elliott <elliott@....com>
To:     herbert@...dor.apana.org.au, davem@...emloft.net, Jason@...c4.com,
        ardb@...nel.org, ap420073@...il.com, David.Laight@...LAB.COM,
        ebiggers@...nel.org, tim.c.chen@...ux.intel.com, peter@...jl.ca,
        tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
        dave.hansen@...ux.intel.com
Cc:     linux-crypto@...r.kernel.org, x86@...nel.org,
        linux-kernel@...r.kernel.org, Robert Elliott <elliott@....com>
Subject: [PATCH 00/13] crypto: x86 - yield FPU context during long loops

This is an offshoot of the previous patch series at:
  https://lore.kernel.org/linux-crypto/20221219202910.3063036-1-elliott@hpe.com

Add a kernel_fpu_yield() function for x86 crypto drivers to call
periodically during long loops.
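
As a rough illustration of the pattern, here is a userspace sketch in C.
The body of kernel_fpu_yield() is not shown in this cover letter, so the
sketch assumes it leaves the FPU section, lets the scheduler run, and
re-enters the section only when a reschedule is pending. Every mock_*
name, CHUNK_BYTES, and the toy byte-sum routine are hypothetical stand-ins
for illustration, not code from the series.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins for kernel state and primitives. */
static int fpu_sections;      /* times an FPU section was entered */
static bool fpu_active;       /* true while "FPU context" is held */
static bool resched_pending;  /* stand-in for a pending reschedule */

static void mock_fpu_begin(void) { fpu_active = true; fpu_sections++; }
static void mock_fpu_end(void)   { fpu_active = false; }
static bool mock_need_resched(void) { return resched_pending; }
static void mock_cond_resched(void) { resched_pending = false; }

/*
 * Assumed shape of a yield helper: do nothing unless a reschedule is
 * pending; otherwise drop the FPU context, let other work run, and
 * take the context back.
 */
static void mock_fpu_yield(void)
{
	if (mock_need_resched()) {
		mock_fpu_end();
		mock_cond_resched();
		mock_fpu_begin();
	}
}

#define CHUNK_BYTES 4096u	/* assumed chunk size, for illustration only */

/* Toy replacement for a SIMD routine: sums bytes into an accumulator. */
static uint32_t sum_bytes(uint32_t acc, const uint8_t *p, size_t n)
{
	for (size_t i = 0; i < n; i++)
		acc += p[i];
	return acc;
}

/* Process a long buffer in chunks, offering to yield between chunks. */
static uint32_t checksum_with_yield(const uint8_t *data, size_t len)
{
	uint32_t acc = 0;

	mock_fpu_begin();
	while (len) {
		size_t n = len < CHUNK_BYTES ? len : CHUNK_BYTES;

		acc = sum_bytes(acc, data, n);
		data += n;
		len -= n;

		if (len)	/* more data left: give the scheduler a chance */
			mock_fpu_yield();
	}
	mock_fpu_end();

	return acc;
}
```

The point of the sketch is that the yield check sits between chunks rather
than around the whole buffer, so the cost when no reschedule is pending is
a single flag test per chunk.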

Test results
============
I created 28 tcrypt modules so modprobe can run concurrent tests,
added 1 MiB functional and speed tests to tcrypt, and ran three processes
spawning 28 subprocesses (one per physical CPU core) each looping forever
through all the tcrypt test modes. This keeps the system quite busy,
generating RCU stalls and soft lockups during both generic and x86
crypto function processing.

In conjunction with these patch series:
* [PATCH 0/8] crypto: kernel-doc for assembly language
  https://lore.kernel.org/linux-crypto/20221219185555.433233-1-elliott@hpe.com
* [PATCH 0/3] crypto/rcu: suppress unnecessary CPU stall warnings
  https://lore.kernel.org/linux-crypto/20221219202910.3063036-1-elliott@hpe.com
* [PATCH 0/3] crypto: yield at end of operations
  https://lore.kernel.org/linux-crypto/20221219203733.3063192-1-elliott@hpe.com

while using the default RCU values (60 s stalls, 21 s expedited stalls),
several nights of testing did not result in any RCU stall warnings or soft
lockups in any of these preemption modes:
   preempt=none
   preempt=voluntary
   preempt=full

Setting the shortest possible RCU timeouts (3 s, 20 ms) still resulted
in RCU stalls, but only about one every 2 hours, and they were not tied
to particular modules like sha512_ssse3 and sm4-generic.

systemd usually crashes and restarts when its journal becomes full from
all the tcrypt printk messages. Without the patches, that triggered more
RCU stall reports and soft lockups; with the patches, only userspace
seems perturbed.


Robert Elliott (13):
  x86:  protect simd.h header file
  x86: add yield FPU context utility function
  crypto: x86/sha - yield FPU context during long loops
  crypto: x86/crc - yield FPU context during long loops
  crypto: x86/sm3 - yield FPU context during long loops
  crypto: x86/ghash - use u8 rather than char
  crypto: x86/ghash - restructure FPU context saving
  crypto: x86/ghash - yield FPU context during long loops
  crypto: x86/poly - yield FPU context only when needed
  crypto: x86/aegis - yield FPU context during long loops
  crypto: x86/blake - yield FPU context only when needed
  crypto: x86/chacha - yield FPU context only when needed
  crypto: x86/aria - yield FPU context only when needed

 arch/x86/crypto/aegis128-aesni-glue.c      |  49 ++++++---
 arch/x86/crypto/aria_aesni_avx_glue.c      |   7 +-
 arch/x86/crypto/blake2s-glue.c             |  41 +++----
 arch/x86/crypto/chacha_glue.c              |  22 ++--
 arch/x86/crypto/crc32-pclmul_glue.c        |  49 +++++----
 arch/x86/crypto/crc32c-intel_glue.c        | 118 ++++++++++++++------
 arch/x86/crypto/crct10dif-pclmul_glue.c    |  65 ++++++++---
 arch/x86/crypto/ghash-clmulni-intel_asm.S  |   6 +-
 arch/x86/crypto/ghash-clmulni-intel_glue.c |  37 +++++--
 arch/x86/crypto/nhpoly1305-avx2-glue.c     |  22 ++--
 arch/x86/crypto/nhpoly1305-sse2-glue.c     |  22 ++--
 arch/x86/crypto/poly1305_glue.c            |  47 ++++----
 arch/x86/crypto/polyval-clmulni_glue.c     |  46 +++++---
 arch/x86/crypto/sha1_avx2_x86_64_asm.S     |   6 +-
 arch/x86/crypto/sha1_ni_asm.S              |   8 +-
 arch/x86/crypto/sha1_ssse3_glue.c          | 120 +++++++++++++++++----
 arch/x86/crypto/sha256_ni_asm.S            |   8 +-
 arch/x86/crypto/sha256_ssse3_glue.c        | 115 ++++++++++++++++----
 arch/x86/crypto/sha512_ssse3_glue.c        |  89 ++++++++++++---
 arch/x86/crypto/sm3_avx_glue.c             |  34 +++++-
 arch/x86/include/asm/simd.h                |  23 ++++
 include/crypto/internal/blake2s.h          |   8 +-
 lib/crypto/blake2s-generic.c               |  12 +--
 23 files changed, 687 insertions(+), 267 deletions(-)

-- 
2.38.1
