[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1457502647.4067.38.camel@perches.com>
Date: Tue, 08 Mar 2016 21:50:47 -0800
From: Joe Perches <joe@...ches.com>
To: Alexander Duyck <alexander.duyck@...il.com>
Cc: Alexander Duyck <aduyck@...antis.com>,
Netdev <netdev@...r.kernel.org>,
David Miller <davem@...emloft.net>
Subject: Re: [net-next PATCH] csum: Update csum_block_add to use rotate
instead of byteswap
On Tue, 2016-03-08 at 21:23 -0800, Alexander Duyck wrote:
> On Tue, Mar 8, 2016 at 3:25 PM, Joe Perches <joe@...ches.com> wrote:
> > On Tue, 2016-03-08 at 14:42 -0800, Alexander Duyck wrote:
> > > The code for csum_block_add was doing a funky byteswap to swap the even and
> > > odd bytes of the checksum if the offset was odd. Instead of doing this we
> > > can save ourselves some trouble and just shift by 8 as this should have the
> > > same effect in terms of the final checksum value and only requires one
> > > instruction.
> > 3 instructions?
> I was talking about just the one ror vs mov, shl, shr, and ,and, add.
>
> I assume when you say 3 you are including the test and either some
> form of conditional move or jump?
Yeah, instruction count also depends on architecture (arm/x86/ppc...)
> > > diff --git a/include/net/checksum.h b/include/net/checksum.h
[]
> > > @@ -88,8 +88,10 @@ static inline __wsum
> > > csum_block_add(__wsum csum, __wsum csum2, int offset)
> > > {
> > > u32 sum = (__force u32)csum2;
> > > - if (offset&1)
> > > - sum = ((sum&0xFF00FF)<<8)+((sum>>8)&0xFF00FF);
> > > +
> > > + if (offset & 1)
> > > + sum = (sum << 24) + (sum >> 8);
> > Maybe use ror32(sum, 8);
> I was actually thinking I could use something like this. I didn't
> realize it was even available.
Now you know: bitops.h
> > or maybe something like:
> >
> > {
> > u32 sum;
> >
> > /* rotated csum2 of odd offset will be the right checksum */
> > if (offset & 1)
> > sum = ror32((__force u32)csum2, 8);
> > else
> > sum = (__force u32)csum2;
> >
> Any specific reason for breaking it up like this? It seems like it
> was easier to just have sum be assigned first and then rotating it if
> needed. What is gained by splitting the assignment up over two
> different calls?
It's only for reader clarity where a comment could be useful.
The compiler output shouldn't change.
Powered by blists - more mailing lists