lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sun, 13 Feb 2022 02:39:06 +0000
From:   David Laight <David.Laight@...LAB.COM>
To:     'Christophe Leroy' <christophe.leroy@...roup.eu>,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>
CC:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linuxppc-dev@...ts.ozlabs.org" <linuxppc-dev@...ts.ozlabs.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: RE: [PATCH] net: Remove branch in csum_shift()

From: Christophe Leroy
> Sent: 11 February 2022 08:48
> 
> Today's implementation of csum_shift() leads to branching based on
> parity of 'offset'
> 
> 	000002f8 <csum_block_add>:
> 	     2f8:	70 a5 00 01 	andi.   r5,r5,1
> 	     2fc:	41 a2 00 08 	beq     304 <csum_block_add+0xc>
> 	     300:	54 84 c0 3e 	rotlwi  r4,r4,24
> 	     304:	7c 63 20 14 	addc    r3,r3,r4
> 	     308:	7c 63 01 94 	addze   r3,r3
> 	     30c:	4e 80 00 20 	blr
> 
> Use first bit of 'offset' directly as input of the rotation instead of
> branching.
> 
> 	000002f8 <csum_block_add>:
> 	     2f8:	54 a5 1f 38 	rlwinm  r5,r5,3,28,28
> 	     2fc:	20 a5 00 20 	subfic  r5,r5,32
> 	     300:	5c 84 28 3e 	rotlw   r4,r4,r5
> 	     304:	7c 63 20 14 	addc    r3,r3,r4
> 	     308:	7c 63 01 94 	addze   r3,r3
> 	     30c:	4e 80 00 20 	blr
> 
> And change to left shift instead of right shift to skip one more
> instruction. This has no impact on the final sum.
> 
> 	000002f8 <csum_block_add>:
> 	     2f8:	54 a5 1f 38 	rlwinm  r5,r5,3,28,28
> 	     2fc:	5c 84 28 3e 	rotlw   r4,r4,r5
> 	     300:	7c 63 20 14 	addc    r3,r3,r4
> 	     304:	7c 63 01 94 	addze   r3,r3
> 	     308:	4e 80 00 20 	blr

That is ppc64.
What happens on x86-64?

Trying to do the same in the x86 ipcsum code tended to make the code worse.
(Although that test is for an odd length fragment and can just be removed.)

	David

> 
> Signed-off-by: Christophe Leroy <christophe.leroy@...roup.eu>
> ---
>  include/net/checksum.h | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/include/net/checksum.h b/include/net/checksum.h
> index 5218041e5c8f..9badcd5532ef 100644
> --- a/include/net/checksum.h
> +++ b/include/net/checksum.h
> @@ -83,9 +83,7 @@ static inline __sum16 csum16_sub(__sum16 csum, __be16 addend)
>  static inline __wsum csum_shift(__wsum sum, int offset)
>  {
>  	/* rotate sum to align it with a 16b boundary */
> -	if (offset & 1)
> -		return (__force __wsum)ror32((__force u32)sum, 8);
> -	return sum;
> +	return (__force __wsum)rol32((__force u32)sum, (offset & 1) << 3);
>  }
> 
>  static inline __wsum
> --
> 2.34.1

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ