lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Message-ID: <CAGqmi75gqrX5y2R2Lsv2ifczERGEjwhuSPF800NjaqXz+MDH9A@mail.gmail.com>
Date:   Thu, 4 Jan 2018 04:25:19 +0300
From:   Timofey Titovets <nefelim4ag@...il.com>
To:     Linux Kernel <linux-kernel@...r.kernel.org>
Subject: [RFC] Replace in memory hash: jhash by xxhash in kernel

Hi,
2 month ago.
I start topic about replace jhash with xxhash.

That a another topic, about replace replace in memory hashing with xxhash.
Or at least make some light on that.

I use simple printk() in jhash/jhash2, to get correct input sizes,
so, at least on x86_64 systems, most of inputs are:
16 byte (80%)
36 byte long (15%)
12 byte (5%)

Now, i do some more careful testing of imported kernel code for xxhash
and jhash2.

TL;DR version:
1. Links to more analysis from SMHasher, at end of email [1][2][3]
    Both: xxh32/lookup3 (jhash)
    have some collision problems, but in different tests.
    xxh32 show less collisions.
2. xxh32 slightly faster then jhash, on input sizes >= 16
    I propose that for 32 bit targets.
3. xxh64 at least, on x86_64 hardware much faster,
    on any input size.
    So that can make a sense to just cut xxh64 output to 32 bit by some way.
    I propose that for 64 bit targets.
4. I just try show some light on that, and put some attention on
    that jhash сan retire.

---
Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz
gcc (GCC) 7.2.1 20171224

Perf results (I use that [4]):
- - -
input size: 3, loop count: 268435456
jhash:    0xf037dcad            time:   2224 ms, th: 361.95 MiB/s
xxhash32: 0x304ce180            time:   3232 ms, th: 249.10 MiB/s
xxhash64: 0xa089a4837a8826cc    time:   1572 ms, th: 512.03 MiB/s
- - -
input size: 4, loop count: 268435456
jhash2:   0x7ad8a872            time:   1963 ms, th: 546.83 MiB/s
xxhash32: 0x9d9077b3            time:   2348 ms, th: 457.13 MiB/s
xxhash64: 0xed798523670b4a54    time:   1318 ms, th: 814.60 MiB/s
- - -
input size: 8, loop count: 268435456
jhash2:   0x12dc3271            time:   2145 ms, th: 1000.87 MiB/s
xxhash32: 0x6676ea3f            time:   2788 ms, th: 770.01 MiB/s
xxhash64: 0x3a773d548cc10292    time:   1570 ms, th: 1367.19 MiB/s
- - -
input size: 11, loop count: 268435456
jhash:    0x4dea7819            time:   2660 ms, th: 1109.68 MiB/s
xxhash32: 0x91cfb73f            time:   4090 ms, th: 721.87 MiB/s
xxhash64: 0x7590a5e838491ace    time:   2209 ms, th: 1336.47 MiB/s
- - -
input size: 12, loop count: 268435456
jhash2:   0x9d2da46d            time:   2114 ms, th: 1523.62 MiB/s
xxhash32: 0x26ab91fa            time:   3221 ms, th: 999.90 MiB/s
xxhash64: 0xeb86800a5d9b1af7    time:   1746 ms, th: 1844.45 MiB/s
- - -
input size: 16, loop count: 268435456
jhash2:   0xf00c9739            time:   3416 ms, th: 1256.99 MiB/s
xxhash32: 0xf96996d1            time:   2999 ms, th: 1432.03 MiB/s
xxhash64: 0x80f029d7b31f5dd9    time:   1878 ms, th: 2286.08 MiB/s
- - -
input size: 17, loop count: 268435456
jhash:    0x65324057            time:   3501 ms, th: 1303.36 MiB/s
xxhash32: 0xe5c37673            time:   3423 ms, th: 1332.97 MiB/s
xxhash64: 0x1fa8c1eca000e164    time:   2127 ms, th: 2144.53 MiB/s
- - -
input size: 33, loop count: 268435456
jhash:    0xf3e1d1ca            time:   5134 ms, th: 1725.31 MiB/s
xxhash32: 0x1189eda6            time:   4096 ms, th: 2162.65 MiB/s
xxhash64: 0x2bb41aaafe8912e3    time:   3042 ms, th: 2911.09 MiB/s
- - -
input size: 36, loop count: 268435456
jhash2:   0x7a15e2d1            time:   4959 ms, th: 1948.40 MiB/s
xxhash32: 0x37b3865c            time:   4095 ms, th: 2359.54 MiB/s
xxhash64: 0x3e1f77ca4a6bda0f    time:   3006 ms, th: 3213.88 MiB/s
- - -
input size: 64, loop count: 268435456
jhash2:   0x491975d1            time:   9142 ms, th: 1879.13 MiB/s
xxhash32: 0xcba05387            time:   5050 ms, th: 3401.66 MiB/s
xxhash64: 0x11c3da4a085273e3    time:   3475 ms, th: 4942.48 MiB/s
- - -
input size: 67, loop count: 268435456
jhash:    0x6e254489            time:   9301 ms, th: 1933.56 MiB/s
xxhash32: 0x6d4a9fb4            time:   6373 ms, th: 2821.81 MiB/s
xxhash64: 0x283bbf10f5a093c2    time:   4245 ms, th: 4235.91 MiB/s

xxhash64 are absolutely winner.
jhash are faster then xxh32 on small inputs < 16 byte.
But on most usage xxh32 must show +14..21%.

Replacing jhash2 with xxh64 must show +40..110% on 64bit platforms.

Based on analytics from SMHasher [1][2],
lookup3 have more problems then xxhash32,

Like bad distribution, collisions, problems with altering input
on several bit changes.

xxhash32,  have only one problem, small more collisions on
short keys 128-256 bits with diff up to 4-3 bit.

I did a simple testing, if all will works, at least,
if jhash will be replaced by xxhash in kernel,
Nothing bad happens.

I add xxhash header to jhash.h, and add return xxh32(),
on top of jhash.
Kernel boot and works, hooray.

I did not try call folks to "use sed and let's get jhash away from kernel!",
jhash are good enough, and it's works for years.

I just try show some light on that, and put some attention on
that jhash сan retire.

So if someone interested, you can just try xxhash on your subsystems.

(I'm prefer to create a helper for choice xxhash32/64,
depends on target, and then use that, in things like rhashtable.)

Thanks!

P.S.
Yes, I'm understood what that can be useful if i'm can provide
some perf numbers of kernel, but i'm just don't see a way how ot properly
do that, because jhash are integrated in many different places.

i.e. some patches and perf numbers for replacing jhash with xxhash in ksm,
already provided, but above mail talks about different case.

---
[1] - https://github.com/rurban/smhasher/blob/master/doc/lookup3
[2] - https://github.com/rurban/smhasher/blob/master/doc/xxHash32
[3] - https://github.com/rurban/smhasher/blob/master/doc/xxHash64
[4] - https://github.com/Nefelim4ag/jhash_vs_xxhash

-- 
Have a nice day,
Timofey.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ