Message-ID: <e8b84bcaee634b53bee797aa041824a4@AcuMS.aculab.com>
Date: Fri, 6 Mar 2020 17:12:05 +0000
From: David Laight <David.Laight@...LAB.COM>
To: 'Eric Dumazet' <eric.dumazet@...il.com>,
'Willem de Bruijn' <willemdebruijn.kernel@...il.com>,
Yadu Kishore <kyk.segfault@...il.com>
CC: David Miller <davem@...emloft.net>,
Network Development <netdev@...r.kernel.org>
Subject: RE: [PATCH v2] net: Make skb_segment not to compute checksum if
network controller supports checksumming
From: Eric Dumazet
> Sent: 05 March 2020 17:20
>
> On 3/5/20 9:00 AM, David Laight wrote:
> > From: Willem de Bruijn
> >> Sent: 05 March 2020 16:07
> > ..
> >> It seems do_csum is called because csum_partial_copy executes the
> >> two operations independently:
> >>
> >> __wsum
> >> csum_partial_copy(const void *src, void *dst, int len, __wsum sum)
> >> {
> >> 	memcpy(dst, src, len);
> >> 	return csum_partial(dst, len, sum);
> >> }
> >
> > And do_csum() is superbly horrid.
> > Not the least because it is 32bit on 64bit systems.
>
> There are many versions, which one is discussed here ?
>
> At least the current one seems to be 64bit optimized.
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5777eaed566a1d63e344d3dd8f2b5e33be20643e

I was looking at the generic one in $(SRC)/lib/checksum.c.

FWIW I suspect the fastest code on pre-Sandy Bridge 64-bit Intel CPUs
(where adc is 2 clocks) is to do a normal 'add', shift each carry
into a 64-bit register and do a software 'popcnt' every 512 bytes
(64 8-byte words, so one carry bit per word fills the register).
That may run at 8 bytes/clock + the popcnt.
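
A rough, untuned C sketch of that idea, to make the bookkeeping concrete.
Every name here (csum_sketch, soft_popcount64, add_carries, fold64,
csum_ref16) is made up for illustration and none of it is kernel code.
It assumes a plain compare is an acceptable stand-in for reading the
carry flag, returns the folded but uncomplemented 16-bit sum in native
byte order, and makes no claim about the clock counts above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Software population count, for CPUs without a popcnt instruction. */
static unsigned int soft_popcount64(uint64_t x)
{
	x = x - ((x >> 1) & 0x5555555555555555ULL);
	x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
	x = (x + (x >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
	return (unsigned int)((x * 0x0101010101010101ULL) >> 56);
}

/* Add a small carry count into the sum, with end-around carry. */
static uint64_t add_carries(uint64_t sum, uint64_t n)
{
	sum += n;
	return sum + (sum < n);
}

/* Fold a 64-bit one's-complement sum down to 16 bits. */
static uint16_t fold64(uint64_t sum)
{
	sum = (sum & 0xffffffffULL) + (sum >> 32);
	sum = (sum & 0xffffffffULL) + (sum >> 32);
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Hypothetical sketch: plain adds, carries parked in a shift register. */
static uint16_t csum_sketch(const unsigned char *buf, size_t len)
{
	uint64_t sum = 0, carries = 0, w;
	unsigned int pending = 0;

	assert(buf != NULL || len == 0);

	for (; len >= 8; buf += 8, len -= 8) {
		memcpy(&w, buf, 8);		/* unaligned-safe load */
		sum += w;			/* normal add, no adc */
		carries = (carries << 1) | (uint64_t)(sum < w);
		if (++pending == 64) {		/* 64 words == 512 bytes */
			sum = add_carries(sum, soft_popcount64(carries));
			carries = 0;
			pending = 0;
		}
	}
	if (len) {				/* zero-padded tail, 1..7 bytes */
		w = 0;
		memcpy(&w, buf, len);
		sum += w;
		carries = (carries << 1) | (uint64_t)(sum < w);
	}
	return fold64(add_carries(sum, soft_popcount64(carries)));
}

/* Straightforward 16-bit-at-a-time reference, for cross-checking only. */
static uint16_t csum_ref16(const unsigned char *buf, size_t len)
{
	uint64_t sum = 0;
	uint16_t h;

	for (; len >= 2; buf += 2, len -= 2) {
		memcpy(&h, buf, 2);
		sum += h;
	}
	if (len) {
		h = 0;
		memcpy(&h, buf, 1);
		sum += h;
	}
	return fold64(sum);
}
```

Deferring the carries is safe because 2^64 is congruent to 1 modulo
2^64 - 1, so adding the carry count later gives the same one's-complement
sum as adding each carry immediately. csum_ref16() is only there so the
two results can be compared on the same buffer.
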
David