Message-Id: <40zxfc4bqmz9s2t@ozlabs.org>
Date: Tue, 5 Jun 2018 00:10:32 +1000 (AEST)
From: Michael Ellerman <patch-notifications@...erman.id.au>
To: Christophe Leroy <christophe.leroy@....fr>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Paul Mackerras <paulus@...ba.org>,
Scott Wood <oss@...error.net>
Cc: linuxppc-dev@...ts.ozlabs.org, linux-kernel@...r.kernel.org
Subject: Re: powerpc/64: optimises from64to32()
On Tue, 2018-04-10 at 06:34:35 UTC, Christophe Leroy wrote:
> The current implementation of from64to32() compiles to inefficient code:
>
> 0000000000000270 <.from64to32>:
> 270: 38 00 ff ff li r0,-1
> 274: 78 69 00 22 rldicl r9,r3,32,32
> 278: 78 00 00 20 clrldi r0,r0,32
> 27c: 7c 60 00 38 and r0,r3,r0
> 280: 7c 09 02 14 add r0,r9,r0
> 284: 78 09 00 22 rldicl r9,r0,32,32
> 288: 7c 00 4a 14 add r0,r0,r9
> 28c: 78 03 00 20 clrldi r3,r0,32
> 290: 4e 80 00 20 blr
>
> This patch modifies from64to32() to operate in the same
> spirit as csum_fold().
>
> It swaps the two 32-bit halves of the sum, then adds the swapped value
> to the unswapped sum. Any carry out of the lower-half addition
> propagates into the upper half, so the upper half of the result holds
> the correctly folded 32-bit sum.
>
> The resulting code is:
>
> 0000000000000260 <.from64to32>:
> 260: 78 60 00 02 rotldi r0,r3,32
> 264: 7c 60 1a 14 add r3,r0,r3
> 268: 78 63 00 22 rldicl r3,r3,32,32
> 26c: 4e 80 00 20 blr
>
> Signed-off-by: Christophe Leroy <christophe.leroy@....fr>
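For readers following along, the folding described in the quoted text can be sketched in portable C roughly as below. This is only an illustration under assumptions, not the applied patch: the names fold64_rotate(), fold64_twostep() and check_folds_agree() are hypothetical, and plain uint32_t/uint64_t stand in for whatever types the kernel code uses. The first function mirrors the rotate-and-add sequence of the new disassembly, the second mirrors the older mask-and-add sequence, and the helper asserts that both give the same result for a given input.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical sketch of the rotate-and-add fold described above.
 * Swap the two 32-bit halves, add the swapped value to the original,
 * and keep the upper half, which has absorbed any carry from the
 * lower-half addition.
 */
static inline uint32_t fold64_rotate(uint64_t x)
{
	uint64_t swapped = (x << 32) | (x >> 32);	/* cf. rotldi r0,r3,32 */

	return (uint32_t)((x + swapped) >> 32);		/* cf. add, then rldicl ...,32,32 */
}

/*
 * Hypothetical sketch of a two-step fold in the style of the older
 * disassembly: add the halves, then add back the carry.
 */
static inline uint32_t fold64_twostep(uint64_t x)
{
	x = (x & 0xffffffffULL) + (x >> 32);	/* hi + lo, may need 33 bits */
	x = (x & 0xffffffffULL) + (x >> 32);	/* fold the carry back in */

	return (uint32_t)x;
}

/* Illustrative helper: both variants should agree on any input. */
static inline void check_folds_agree(uint64_t x)
{
	assert(fold64_rotate(x) == fold64_twostep(x));
}
```

Calling check_folds_agree() on an input whose halves overflow when added, such as 0xffffffff00000001, exercises the carry path in both variants.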
Applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/55a0edf083022e402042255a0afb03
cheers