lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Message-Id: <20231227-optimize_checksum-v14-1-ddfd48016566@rivosinc.com> Date: Wed, 27 Dec 2023 09:38:00 -0800 From: Charlie Jenkins <charlie@...osinc.com> To: Charlie Jenkins <charlie@...osinc.com>, Palmer Dabbelt <palmer@...belt.com>, Conor Dooley <conor@...nel.org>, Samuel Holland <samuel.holland@...ive.com>, David Laight <David.Laight@...lab.com>, Xiao Wang <xiao.w.wang@...el.com>, Evan Green <evan@...osinc.com>, Guo Ren <guoren@...nel.org>, linux-riscv@...ts.infradead.org, linux-kernel@...r.kernel.org, linux-arch@...r.kernel.org Cc: Paul Walmsley <paul.walmsley@...ive.com>, Albert Ou <aou@...s.berkeley.edu>, Arnd Bergmann <arnd@...db.de>, David Laight <david.laight@...lab.com> Subject: [PATCH v14 1/5] asm-generic: Improve csum_fold This csum_fold implementation introduced into arch/arc by Vineet Gupta is better than the default implementation on at least arc, x86, and riscv. Using GCC trunk and compiling non-inlined version, this implementation has 41.6667%, 25% fewer instructions on riscv64, x86-64 respectively with -O3 optimization. Most implmentations override this default in asm, but this should be more performant than all of those other implementations except for arm which has barrel shifting and sparc32 which has a carry flag. Signed-off-by: Charlie Jenkins <charlie@...osinc.com> Reviewed-by: David Laight <david.laight@...lab.com> --- include/asm-generic/checksum.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/include/asm-generic/checksum.h b/include/asm-generic/checksum.h index 43e18db89c14..ad928cce268b 100644 --- a/include/asm-generic/checksum.h +++ b/include/asm-generic/checksum.h @@ -2,6 +2,8 @@ #ifndef __ASM_GENERIC_CHECKSUM_H #define __ASM_GENERIC_CHECKSUM_H +#include <linux/bitops.h> + /* * computes the checksum of a memory block at buff, length len, * and adds in "sum" (32-bit) @@ -31,9 +33,7 @@ extern __sum16 ip_fast_csum(const void *iph, unsigned int ihl); static inline __sum16 csum_fold(__wsum csum) { u32 sum = (__force u32)csum; - sum = (sum & 0xffff) + (sum >> 16); - sum = (sum & 0xffff) + (sum >> 16); - return (__force __sum16)~sum; + return (__force __sum16)((~sum - ror32(sum, 16)) >> 16); } #endif -- 2.43.0
Powered by blists - more mailing lists