Message-ID: <52AB1F7C.1050606@redhat.com>
Date: Fri, 13 Dec 2013 15:53:48 +0100
From: Francesco Fusco <ffusco@...hat.com>
To: David Laight <David.Laight@...LAB.COM>
CC: Jesse Gross <jesse@...ira.com>, netdev <netdev@...r.kernel.org>,
dev@...nvswitch.org, Daniel Borkmann <dborkman@...hat.com>,
Thomas Graf <tgraf@...hat.com>
Subject: Re: [PATCH net-next v2 2/2] net: ovs: use CRC32 accelerated flow
hash if available
On 12/13/2013 11:01 AM, David Laight wrote:
> My thoughts exactly.
> Given this is a hash it could crc alternate words into separate
> accumulators and then combine the values at the end.
> That way you are still doing sequential accesses to the data.
> (The crc instruction might be better than an xor for the combine.)
> If the cpu has 3 execution units that can do crc, use them all.
>
> It might be that the hash function is now an insignificant cost.
> Looking at how much hashing the data twice (discarding the first
> result - assign to global volatile data) slows things down can
> help determine this.
On i7 CPUs the crc32 instruction (in both its 32-bit and 64-bit operand
forms) has a throughput of one instruction per cycle and a latency of 3
cycles [1]. This means that 1) with this code, where each crc32 depends
on the result of the previous one, we pay 3 clocks per crc32
instruction, and 2) we could keep three independent CRCs in flight at
the same time, each processing 1/3 of the data. This could in theory
provide 3x the performance.
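To make the idea concrete, here is a rough, untested-on-hardware sketch
of a three-accumulator hash. All names (crc32c_word, hash_serial,
hash_3way) are made up for illustration and are not from the patch.
crc32c_word() is a portable software stand-in for one crc32 instruction
on a 32-bit word; on x86 with SSE4.2 it would be a single instruction.
The three partial values are folded together with further crc steps
rather than a plain xor, as David suggested. Note that the result is a
hash only: it is not the CRC of the whole key and does not match the
serial result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Software stand-in for one crc32 step on a 32-bit word (CRC32C,
 * reflected polynomial 0x82F63B78, no init/final inversion). */
static uint32_t crc32c_word(uint32_t crc, uint32_t data)
{
	crc ^= data;
	for (int i = 0; i < 32; i++)
		crc = (crc >> 1) ^ (0x82F63B78u & -(crc & 1u));
	return crc;
}

/* Baseline: one dependency chain, so every step waits for the
 * previous result (latency bound). */
static uint32_t hash_serial(const uint32_t *key, size_t n_words,
			    uint32_t seed)
{
	uint32_t crc = seed;

	assert(key != NULL || n_words == 0);
	for (size_t i = 0; i < n_words; i++)
		crc = crc32c_word(crc, key[i]);
	return crc;
}

/* Three independent chains, each over one third of the key. The three
 * steps in the loop body do not depend on each other, so a CPU with a
 * pipelined crc32 unit can overlap them. */
static uint32_t hash_3way(const uint32_t *key, size_t n_words,
			  uint32_t seed)
{
	size_t third = n_words / 3;
	uint32_t c0 = seed, c1 = seed, c2 = seed, h;
	size_t i;

	assert(key != NULL || n_words == 0);
	for (i = 0; i < third; i++) {
		c0 = crc32c_word(c0, key[i]);
		c1 = crc32c_word(c1, key[third + i]);
		c2 = crc32c_word(c2, key[2 * third + i]);
	}
	/* Leftover words (n_words % 3) go into the last accumulator. */
	for (i = 3 * third; i < n_words; i++)
		c2 = crc32c_word(c2, key[i]);

	/* Fold the partial values in sequence with crc steps. */
	h = crc32c_word(seed, c0);
	h = crc32c_word(h, c1);
	return crc32c_word(h, c2);
}
```

For a key of 13 words, hash_3way() runs three chains of 4 steps each,
one leftover step, and three combine steps, versus one chain of 13
steps in hash_serial().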
For short keys (~100 bytes or less) there is a chance that the 3x
theoretical speedup will be eaten up by the additional code required
to compute boundaries, xor the results, etc. But as I already mentioned,
this is something to try.
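A back-of-the-envelope model of the short-key case, using only the
3-cycle latency from [1]. The function name est_cycles and the
overhead parameter are hypothetical; the overhead value has to be
measured, I am not claiming any particular number for it. The model
assumes the chains overlap perfectly and ignores everything else
(loads, branches, pipeline fill).

```c
#include <assert.h>
#include <stddef.h>

/* Estimated cycles to hash n_words crc32 steps split over `lanes`
 * independent dependency chains (lanes <= 3 on the CPUs discussed).
 * Each chain pays the full 3-cycle latency per step; `overhead` is a
 * caller-supplied guess for boundary computation and combining. */
static unsigned long est_cycles(size_t n_words, unsigned int lanes,
				unsigned long overhead)
{
	const unsigned int latency = 3;
	size_t per_lane;

	assert(lanes > 0);
	per_lane = (n_words + lanes - 1) / lanes;	/* round up */
	return per_lane * latency + overhead;
}
```

A 100-byte key hashed 8 bytes at a time is 13 steps: est_cycles(13, 1, 0)
gives 39 cycles for the serial chain, and est_cycles(13, 3, 0) gives 15
for three chains. So at most about 24 cycles are on the table, and any
extra setup/combine cost comes straight out of that.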
[1]
http://www.intel.com/content/dam/www/public/us/en/documents/white-papers/fast-crc-computation-paper.pdf