Message-ID: <CAOLP8p67SEW0S2UuNDqEPSfyjGpDBmMWd7_3k6eGQMf9NkN_fA@mail.gmail.com>
Date: Thu, 26 Mar 2015 12:58:09 -0700
From: Bill Cox <waywardgeek@...il.com>
To: "discussions@...sword-hashing.net" <discussions@...sword-hashing.net>
Subject: Dumb hashing idea
In the last thread, where I created dummy "worker" threads, I was very
surprised to see that with very many workers I could achieve about half
the total memory bandwidth that I get with TwoCats, which is optimized for
block-based hashing speed and suffers very little from cache-miss
penalties.
In short, with 32 threads, I can run a brain-dead simple hashing algorithm
with completely random access to 4 GiB of memory in about 0.7 seconds on my
desktop, doing 8 GiB of total transfer, for 11.4 GiB/s of memory bandwidth,
which is pretty good. This compares to TwoCats on 4 threads in 0.35
seconds on the desktop and about 0.7 seconds on my laptop.
I don't know how these CPUs are able to get around the cache-miss
penalties simply by having many outstanding threads to run... maybe
there's a bug in my code? Maybe the pthreads are being sorted by CAS
address? That's some crazy memory controller if so! I've attached the
code.
Bill
View attachment "randhash.c" of type "text/x-csrc" (2112 bytes)