linux-kernel - RE: [PATCH 2/2] powerpc/32: Implement csum

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <a87eb9e5bb6d483f8352ccb4b7374286@AcuMS.aculab.com>
Date:   Sun, 13 Feb 2022 03:01:24 +0000
From:   David Laight <David.Laight@...LAB.COM>
To:     'Christophe Leroy' <christophe.leroy@...roup.eu>,
        Benjamin Herrenschmidt <benh@...nel.crashing.org>,
        Paul Mackerras <paulus@...ba.org>,
        "Michael Ellerman" <mpe@...erman.id.au>,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>
CC:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linuxppc-dev@...ts.ozlabs.org" <linuxppc-dev@...ts.ozlabs.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: RE: [PATCH 2/2] powerpc/32: Implement csum_sub

From: Christophe Leroy
> Sent: 11 February 2022 10:25
> 
> When building kernel with CONFIG_CC_OPTIMISE_FOR_SIZE, several
> copies of csum_sub() are generated, with the following code:
> 
> 	00000170 <csum_sub>:
> 	     170:	7c 84 20 f8 	not     r4,r4
> 	     174:	7c 63 20 14 	addc    r3,r3,r4
> 	     178:	7c 63 01 94 	addze   r3,r3
> 	     17c:	4e 80 00 20 	blr
> 
> Let's define a PPC32 version with subc/addme, and for it's inlining.
> 
> It will return 0 instead of 0xffffffff when subtracting 0x80000000 to itself,
> this is not an issue as 0 and ~0 are equivalent, refer to RFC 1624.

They are not always equivalent.
In particular in the UDP checksum field one of them is (0?) 'checksum not calculated'.

I think all the Linux functions have to return a non-zero value (for non-zero input).

If the csum is going to be converted to 16 bit, inverted, and put into a packet
the code usually has to have a check that changes 0 to 0xffff.
However if the csum functions guarantee never to return zero they can feed
an extra 1 into the first csum_partial() then just invert and add 1 at the end.
Because (~csum_partion(buffer, 1) + 1) is the same as ~csum_partial(buffer, 0)
except when the buffer's csum is 0xffffffff.

I did do some experiments and the 64bit value can be reduced directly to
16bits using '% 0xffff'.
This is different because it returns 0 not 0xffff.
However gcc 'randomly' picks between the fast 'multiply by reciprocal'
and slow divide instruction paths.
The former is (probably) faster than reducing using shifts and adc.
The latter definitely slower.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)