Message-ID: <c8122d29be854e4b821c1e242ab8bcc1@AcuMS.aculab.com>
Date: Thu, 23 Jul 2020 14:30:10 +0000
From: David Laight <David.Laight@...LAB.COM>
To: 'Al Viro' <viro@...iv.linux.org.uk>
CC: 'Linus Torvalds' <torvalds@...ux-foundation.org>,
"'linux-kernel@...r.kernel.org'" <linux-kernel@...r.kernel.org>,
"'linux-arch@...r.kernel.org'" <linux-arch@...r.kernel.org>
Subject: RE: [PATCH 04/18] csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum
> I had to replace the ror32() with __builtin_bswap32().
> The kernel objects do contain the 'ror' instruction - even though I
> didn't find the asm for it.
Looking at some instruction timings, ror32() and bswap32()
seem to need one of the same execution ports.
However, on Intel CPUs bswap64() takes 2 clocks, whereas the ror64()
instructions take only 1.
AMD CPUs are more symmetric and run all variants in 1 clock.
So ror32() is probably preferable, in case anyone copies the
code into somewhere that uses a 64-bit checksum value.
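
For anyone wondering why the two are interchangeable at all, here is a
minimal userspace sketch. It assumes the rotate in question is by 8 bits
(the usual odd-alignment fixup) and that the 32-bit sum is later folded to
16 bits; neither is spelled out above. The names ror32_sketch(),
fold16_sketch() and same_after_fold() are made up for illustration and are
not the kernel's helpers. __builtin_bswap32() is the GCC/Clang builtin.

```c
#include <stdint.h>

/* Rotate a 32-bit word right by n bits; a stand-in for the kernel's ror32(). */
static inline uint32_t ror32_sketch(uint32_t x, unsigned int n)
{
    return (x >> (n & 31)) | (x << ((-n) & 31));
}

/* Fold a 32-bit ones-complement sum to 16 bits (no final inversion). */
static inline uint16_t fold16_sketch(uint32_t sum)
{
    sum = (sum & 0xffff) + (sum >> 16);   /* add high half into low half */
    sum = (sum & 0xffff) + (sum >> 16);   /* absorb the possible carry */
    return (uint16_t)sum;
}

/*
 * Assumption: rotate count is 8. Both a rotate by 8 and a full byte swap
 * move every byte from an even lane to an odd lane (or vice versa), so
 * the folded 16-bit results agree.
 */
static inline int same_after_fold(uint32_t sum)
{
    return fold16_sketch(ror32_sketch(sum, 8)) ==
           fold16_sketch(__builtin_bswap32(sum));
}
```

The intermediate 32-bit values differ, only the folded results match, so
this says nothing about which instruction is cheaper; that is what the
timings above are about.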
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)