Message-ID: <20231027231036.GM800259@ZenIV>
Date: Sat, 28 Oct 2023 00:10:36 +0100
From: Al Viro <viro@...iv.linux.org.uk>
To: Charlie Jenkins <charlie@...osinc.com>
Cc: Palmer Dabbelt <palmer@...belt.com>,
Conor Dooley <conor@...nel.org>,
Samuel Holland <samuel.holland@...ive.com>,
David Laight <David.Laight@...lab.com>,
Xiao Wang <xiao.w.wang@...el.com>,
Evan Green <evan@...osinc.com>,
linux-riscv@...ts.infradead.org, linux-kernel@...r.kernel.org,
linux-arch@...r.kernel.org,
Paul Walmsley <paul.walmsley@...ive.com>,
Albert Ou <aou@...s.berkeley.edu>,
Arnd Bergmann <arnd@...db.de>
Subject: Re: [PATCH v8 1/5] asm-generic: Improve csum_fold
On Fri, Oct 27, 2023 at 03:43:51PM -0700, Charlie Jenkins wrote:
> /*
> * computes the checksum of a memory block at buff, length len,
> * and adds in "sum" (32-bit)
> @@ -31,9 +33,7 @@ extern __sum16 ip_fast_csum(const void *iph, unsigned int ihl);
> static inline __sum16 csum_fold(__wsum csum)
> {
> u32 sum = (__force u32)csum;
> - sum = (sum & 0xffff) + (sum >> 16);
> - sum = (sum & 0xffff) + (sum >> 16);
> - return (__force __sum16)~sum;
> + return (__force __sum16)((~sum - ror32(sum, 16)) >> 16);
> }
Will (~(sum + ror32(sum, 16))) >> 16 produce worse code than that?
Because at least with recent gcc this will generate the exact thing
you get from arm inline asm...
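
For reference, a minimal userspace sketch of why the two forms agree: in two's complement, ~a - b == ~(a + b), so the subtraction in the patch and the addition suggested above yield the same 32-bit value before the shift. The names fold_old, fold_sub, fold_add and check_sample are hypothetical, plain uint32_t/uint16_t stand in for __wsum/__sum16, and ror32 is reimplemented locally rather than taken from the kernel's bitops header.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for the kernel's ror32(): rotate a 32-bit word right. */
static inline uint32_t ror32(uint32_t word, unsigned int shift)
{
	return (word >> (shift & 31)) | (word << ((-shift) & 31));
}

/* The code removed by the patch: two folding steps, then complement. */
static inline uint16_t fold_old(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/* The form added by the patch. */
static inline uint16_t fold_sub(uint32_t sum)
{
	return (uint16_t)((~sum - ror32(sum, 16)) >> 16);
}

/* The form suggested in the reply. */
static inline uint16_t fold_add(uint32_t sum)
{
	return (uint16_t)(~(sum + ror32(sum, 16)) >> 16);
}

/* Assert that all three variants agree for one input. */
static inline void check_sample(uint32_t sum)
{
	assert(fold_old(sum) == fold_sub(sum));
	assert(fold_sub(sum) == fold_add(sum));
}
```

Calling check_sample() on values such as 0, 0x12345678, 0xffff0001 and 0xffffffff exercises both the no-carry and end-around-carry cases. This only checks that the results match; whether a given compiler emits the same instructions for both forms still has to be checked on the generated assembly.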