Message-ID: <20110809112114.3943.qmail@science.horizon.com>
Date: 9 Aug 2011 07:21:14 -0400
From: "George Spelvin" <linux@...izon.com>
To: akpm@...ux-foundation.org, fzago@...temfabricworks.com,
joakim.tjernlund@...nsmode.se, linux-kernel@...r.kernel.org,
linux@...izon.com, rpearson@...temfabricworks.com
Subject: Re: [patch v3 6/7] crc32: add-slicing-by-8.diff

While writing up some documentation of this algorithm, I came up with
a potential speedup; or, at least, I realized why slicing by more than
4 is so much faster than slicing by 4 or less.
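
(For comparison, a slicing-by-4 inner loop is roughly the sketch below;
the table and variable names are only illustrative, not the kernel's
exact code.  Every table lookup there depends on the crc computed from
the previous word, so the lookups for one word cannot begin until the
previous word has been completely folded in.)

	while (num_words--) {
		u32 q = crc ^ *p++;	/* needs the previous crc */
		crc = tab[3][q & 255] ^
		      tab[2][(q >> 8) & 255] ^
		      tab[1][(q >> 16) & 255] ^
		      tab[0][(q >> 24) & 255];
	}
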
Note that the inner loop of the algorithm is as follows:

+# define DO_CRC8a (tab[7][(q) & 255] ^ \
+                   tab[6][(q >> 8) & 255] ^ \
+                   tab[5][(q >> 16) & 255] ^ \
+                   tab[4][(q >> 24) & 255])
+# define DO_CRC8b (tab[3][(q) & 255] ^ \
+                   tab[2][(q >> 8) & 255] ^ \
+                   tab[1][(q >> 16) & 255] ^ \
+                   tab[0][(q >> 24) & 255])
+	for (--b; middle_len; --middle_len) {
+		u32 q;
+		q = crc ^ *++b;
+		crc = DO_CRC8a;
+		q = *++b;
+		crc ^= DO_CRC8b;
 	}

Note the data dependencies: DO_CRC8a depends on the
previous crc, which depends on the previous DO_CRC8b.
But DO_CRC8b does not depend on anything except the
input data at *++b.
It would increase parallelism to schedule DO_CRC8b before DO_CRC8a,
to start those loads before the previous crc value is available.
Maybe the compiler and/or processor can find this parallelism already,
but if not, it might be useful to try reordering it:

# define DO_CRC8a(x) (tab[7][(x) & 255] ^ \
                      tab[6][((x) >> 8) & 255] ^ \
                      tab[5][((x) >> 16) & 255] ^ \
                      tab[4][((x) >> 24) & 255])
# define DO_CRC8b(x) (tab[3][(x) & 255] ^ \
                      tab[2][((x) >> 8) & 255] ^ \
                      tab[1][((x) >> 16) & 255] ^ \
                      tab[0][((x) >> 24) & 255])
	for ( ; middle_len; --middle_len, b += 2) {
		u32 q = DO_CRC8b(b[1]);
		crc ^= b[0];
		crc = q ^ DO_CRC8a(crc);
	}
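
As a sanity check on the reordering, here is a minimal standalone
sketch (not the kernel code: it assumes a little-endian host, the
usual reflected polynomial 0xEDB88320, an aligned buffer whose length
is a multiple of 8 bytes, and it builds its own tables; the function
names below are made up).  It verifies that both loop orderings
produce the same result as a plain byte-at-a-time CRC:

#include <stdint.h>
#include <stdio.h>

#define CRCPOLY_LE 0xEDB88320u	/* reflected CRC-32 polynomial */

static uint32_t tab[8][256];

static void build_tables(void)
{
	int i, j, k;

	/* tab[0] is the ordinary byte-at-a-time table */
	for (i = 0; i < 256; i++) {
		uint32_t c = (uint32_t)i;
		for (j = 0; j < 8; j++)
			c = (c >> 1) ^ ((c & 1) ? CRCPOLY_LE : 0);
		tab[0][i] = c;
	}
	/* tab[k][i] = CRC of byte i followed by k zero bytes */
	for (k = 1; k < 8; k++)
		for (i = 0; i < 256; i++)
			tab[k][i] = (tab[k - 1][i] >> 8) ^
				    tab[0][tab[k - 1][i] & 255];
}

/* byte-at-a-time reference */
static uint32_t crc32_byte(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len--)
		crc = (crc >> 8) ^ tab[0][(crc ^ *p++) & 255];
	return crc;
}

#define DO_CRC8a(x) (tab[7][(x) & 255] ^ \
		     tab[6][((x) >> 8) & 255] ^ \
		     tab[5][((x) >> 16) & 255] ^ \
		     tab[4][((x) >> 24) & 255])
#define DO_CRC8b(x) (tab[3][(x) & 255] ^ \
		     tab[2][((x) >> 8) & 255] ^ \
		     tab[1][((x) >> 16) & 255] ^ \
		     tab[0][((x) >> 24) & 255])

/* Original ordering: DO_CRC8a, which needs crc, is evaluated first.
 * (Written with b++ instead of the patch's *++b idiom, but the
 * evaluation order is the same.) */
static uint32_t crc32_slice8(uint32_t crc, const uint32_t *b, size_t n8)
{
	for ( ; n8; --n8) {
		uint32_t q = crc ^ *b++;
		crc = DO_CRC8a(q);
		q = *b++;
		crc ^= DO_CRC8b(q);
	}
	return crc;
}

/* Reordered: DO_CRC8b, which is independent of crc, is started first. */
static uint32_t crc32_slice8_reordered(uint32_t crc, const uint32_t *b,
				       size_t n8)
{
	for ( ; n8; --n8, b += 2) {
		uint32_t q = DO_CRC8b(b[1]);
		crc ^= b[0];
		crc = q ^ DO_CRC8a(crc);
	}
	return crc;
}

int main(void)
{
	/* the union keeps the buffer 32-bit aligned */
	union { uint32_t w[16]; unsigned char c[64]; } buf;
	uint32_t r, s, t;
	size_t i;

	build_tables();
	for (i = 0; i < sizeof(buf.c); i++)
		buf.c[i] = (unsigned char)(i * 37 + 11);

	r = ~crc32_byte(~0u, buf.c, sizeof(buf.c));
	s = ~crc32_slice8(~0u, buf.w, sizeof(buf.c) / 8);
	t = ~crc32_slice8_reordered(~0u, buf.w, sizeof(buf.c) / 8);

	printf("byte=%08x slice8=%08x reordered=%08x\n",
	       (unsigned)r, (unsigned)s, (unsigned)t);
	return (r == s && s == t) ? 0 : 1;
}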