lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Sun, 11 Feb 2018 16:51:45 +0200
From:   Amir Goldstein <amir73il@...il.com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Al Viro <viro@...iv.linux.org.uk>,
        Miklos Szeredi <miklos@...redi.hu>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [RFC][PATCH] <linux/stringhash.h>: fix end_name_hash() for 64bit long

On Fri, Feb 9, 2018 at 8:33 PM, Linus Torvalds
<torvalds@...ux-foundation.org> wrote:

>
> I *would* love to get some kind of estimate on how this changes the
> hash distribution. Do you have anything like that? The way this
> function is designed to be used, the upper bits should be used for the
> hash bucket information, so doing some bucket size distribution on
> (say) the upper ~12 bits or something with some real string input
> would be really good.
>

I've added bucket counters and tracing number of entries / number of non
empty buckets and largest bucket to dcache (patch attached).
Increased d_hash_shift to take only 12 bits for hash table offset.
The results vary a bit per run, but below are some outputs from sample runs
of find / of 55K entries right after boot.

The bottom line is that, at least w.r.t this simplistic bucket analysis and this
specific workload, the patch does not seem to move the needle. The bucket
distribution of before and after patch are similar and also similar to the
word-at-a-time results.
Just for comparison, I added a run with noop end_name_hash() which shows
a worse bucket distribution.

Perhaps it would take a different workload, or different analysis to demonstrate
the alleged deficiencies of the current implementation? Not sure which evidence
will be required to justify the change, even if, the patch is correct in theory?
I wonder if we should just let this old code be... maybe add something to the
comment or remove the part about "try to avoid losing bits..."?

Thanks,
Amir.


----------------------------------------------------

x86_64, word-at-a-time, 12 bits:

[    0.008583] dcache: d_hash_shift = 20, nentries = 1, nbuckets = 1,
maxcount = 1
[    0.783683] dcache: d_hash_shift = 20, nentries = 130, nbuckets =
129, maxcount = 2
[    0.785336] dcache: d_hash_shift = 20, nentries = 314, nbuckets =
303, maxcount = 3
[    0.789104] dcache: d_hash_shift = 20, nentries = 818, nbuckets =
736, maxcount = 4
[    0.795435] dcache: d_hash_shift = 20, nentries = 1738, nbuckets =
1406, maxcount = 5
[    0.804935] dcache: d_hash_shift = 20, nentries = 3056, nbuckets =
2134, maxcount = 6
[    0.814014] dcache: d_hash_shift = 20, nentries = 4158, nbuckets =
2595, maxcount = 7
[    0.824848] dcache: d_hash_shift = 20, nentries = 5635, nbuckets =
3051, maxcount = 8
[    0.844848] dcache: d_hash_shift = 20, nentries = 8523, nbuckets =
3584, maxcount = 9
[    0.852072] dcache: d_hash_shift = 20, nentries = 9571, nbuckets =
3708, maxcount = 10
[    0.855598] dcache: d_hash_shift = 20, nentries = 10098, nbuckets =
3745, maxcount = 11
[    2.793453] dcache: d_hash_shift = 20, nentries = 13547, nbuckets =
3937, maxcount = 12
[    3.097325] dcache: d_hash_shift = 20, nentries = 14394, nbuckets =
3965, maxcount = 13
[    3.559967] dcache: d_hash_shift = 20, nentries = 16783, nbuckets =
4017, maxcount = 14
[   10.259225] dcache: d_hash_shift = 20, nentries = 22166, nbuckets =
4070, maxcount = 15
[   10.326126] dcache: d_hash_shift = 20, nentries = 24960, nbuckets =
4083, maxcount = 16
[   10.338824] dcache: d_hash_shift = 20, nentries = 25348, nbuckets =
4084, maxcount = 17
[   10.423085] dcache: d_hash_shift = 20, nentries = 29715, nbuckets =
4095, maxcount = 18
[   10.746374] dcache: d_hash_shift = 20, nentries = 30771, nbuckets =
4095, maxcount = 19

x86_64, byte at a time, before patch:

[    0.008624] dcache: d_hash_shift = 20, nentries = 1, nbuckets = 1,
maxcount = 1
[    0.797948] dcache: d_hash_shift = 20, nentries = 168, nbuckets =
167, maxcount = 2
[    0.803184] dcache: d_hash_shift = 20, nentries = 953, nbuckets =
858, maxcount = 3
[    0.809085] dcache: d_hash_shift = 20, nentries = 1690, nbuckets =
1390, maxcount = 4
[    0.818364] dcache: d_hash_shift = 20, nentries = 2868, nbuckets =
2082, maxcount = 5
[    0.826448] dcache: d_hash_shift = 20, nentries = 4016, nbuckets =
2609, maxcount = 6
[    0.844307] dcache: d_hash_shift = 20, nentries = 6525, nbuckets =
3357, maxcount = 7
[    0.848492] dcache: d_hash_shift = 20, nentries = 7121, nbuckets =
3473, maxcount = 8
[    0.850470] dcache: d_hash_shift = 20, nentries = 7419, nbuckets =
3514, maxcount = 9
[    2.781536] dcache: d_hash_shift = 20, nentries = 13484, nbuckets =
3984, maxcount = 10
[    3.082709] dcache: d_hash_shift = 20, nentries = 14329, nbuckets =
4012, maxcount = 11
[    3.116017] dcache: d_hash_shift = 20, nentries = 14411, nbuckets =
4011, maxcount = 12
[    3.736935] dcache: d_hash_shift = 20, nentries = 17509, nbuckets =
4054, maxcount = 13
[   17.557370] dcache: d_hash_shift = 20, nentries = 19928, nbuckets =
4073, maxcount = 14
[   17.587867] dcache: d_hash_shift = 20, nentries = 21488, nbuckets =
4083, maxcount = 15
[   17.622295] dcache: d_hash_shift = 20, nentries = 22798, nbuckets =
4085, maxcount = 16
[   17.642227] dcache: d_hash_shift = 20, nentries = 23505, nbuckets =
4086, maxcount = 17
[   17.722103] dcache: d_hash_shift = 20, nentries = 27107, nbuckets =
4092, maxcount = 18
[   17.749791] dcache: d_hash_shift = 20, nentries = 28353, nbuckets =
4093, maxcount = 19
[   17.958165] dcache: d_hash_shift = 20, nentries = 30687, nbuckets =
4095, maxcount = 20

x86_64, byte at a time, after patch:

[    0.008648] dcache: d_hash_shift = 20, nentries = 1, nbuckets = 1,
maxcount = 1
[    0.795676] dcache: d_hash_shift = 20, nentries = 157, nbuckets =
156, maxcount = 2
[    0.799493] dcache: d_hash_shift = 20, nentries = 619, nbuckets =
563, maxcount = 3
[    0.803741] dcache: d_hash_shift = 20, nentries = 1115, nbuckets =
928, maxcount = 4
[    0.814418] dcache: d_hash_shift = 20, nentries = 2619, nbuckets =
1894, maxcount = 5
[    0.817449] dcache: d_hash_shift = 20, nentries = 3053, nbuckets =
2095, maxcount = 6
[    0.838418] dcache: d_hash_shift = 20, nentries = 5981, nbuckets =
3159, maxcount = 7
[    0.843916] dcache: d_hash_shift = 20, nentries = 6790, nbuckets =
3340, maxcount = 8
[    0.854101] dcache: d_hash_shift = 20, nentries = 8263, nbuckets =
3565, maxcount = 9
[    0.864520] dcache: d_hash_shift = 20, nentries = 9745, nbuckets =
3760, maxcount = 10
[    2.799091] dcache: d_hash_shift = 20, nentries = 13273, nbuckets =
3954, maxcount = 11
[    3.201936] dcache: d_hash_shift = 20, nentries = 14576, nbuckets =
3988, maxcount = 12
[    3.996005] dcache: d_hash_shift = 20, nentries = 17902, nbuckets =
4051, maxcount = 13
[    9.847851] dcache: d_hash_shift = 20, nentries = 19322, nbuckets =
4068, maxcount = 14
[    9.858100] dcache: d_hash_shift = 20, nentries = 19891, nbuckets =
4074, maxcount = 15
[    9.889944] dcache: d_hash_shift = 20, nentries = 21424, nbuckets =
4082, maxcount = 16
[    9.912863] dcache: d_hash_shift = 20, nentries = 22145, nbuckets =
4087, maxcount = 17
[    9.940020] dcache: d_hash_shift = 20, nentries = 23126, nbuckets =
4089, maxcount = 18
[   10.016343] dcache: d_hash_shift = 20, nentries = 26093, nbuckets =
4094, maxcount = 19
[   10.072039] dcache: d_hash_shift = 20, nentries = 28712, nbuckets =
4094, maxcount = 20

x86_64, byte at a time, with NOOP end_name_hash() (i.e. take 32 LSB):

[    0.008609] dcache: d_hash_shift = 20, nentries = 1, nbuckets = 1,
maxcount = 1
[    0.081990] dcache: d_hash_shift = 20, nentries = 5, nbuckets = 4,
maxcount = 2
[    0.802145] dcache: d_hash_shift = 20, nentries = 31, nbuckets =
28, maxcount = 3
[    0.813503] dcache: d_hash_shift = 20, nentries = 554, nbuckets =
482, maxcount = 4
[    0.817755] dcache: d_hash_shift = 20, nentries = 1151, nbuckets =
932, maxcount = 5
[    0.817910] dcache: d_hash_shift = 20, nentries = 1175, nbuckets =
947, maxcount = 6
[    0.817987] dcache: d_hash_shift = 20, nentries = 1187, nbuckets =
952, maxcount = 7
[    0.838691] dcache: d_hash_shift = 20, nentries = 4152, nbuckets =
2502, maxcount = 8
[    0.848732] dcache: d_hash_shift = 20, nentries = 5564, nbuckets =
2959, maxcount = 9
[    0.848774] dcache: d_hash_shift = 20, nentries = 5570, nbuckets =
2960, maxcount = 10
[    0.872152] dcache: d_hash_shift = 20, nentries = 8949, nbuckets =
3548, maxcount = 11
[    0.879572] dcache: d_hash_shift = 20, nentries = 10020, nbuckets =
3647, maxcount = 12
[    0.893884] dcache: d_hash_shift = 20, nentries = 12009, nbuckets =
3803, maxcount = 13
[    0.897821] dcache: d_hash_shift = 20, nentries = 12570, nbuckets =
3827, maxcount = 14
[    0.898154] dcache: d_hash_shift = 20, nentries = 12615, nbuckets =
3827, maxcount = 15
[    0.898432] dcache: d_hash_shift = 20, nentries = 12654, nbuckets =
3829, maxcount = 16
[    0.898877] dcache: d_hash_shift = 20, nentries = 12717, nbuckets =
3830, maxcount = 17
[    0.906711] dcache: d_hash_shift = 20, nentries = 12821, nbuckets =
3838, maxcount = 18
[    0.906822] dcache: d_hash_shift = 20, nentries = 12822, nbuckets =
3838, maxcount = 19
[    0.906920] dcache: d_hash_shift = 20, nentries = 12823, nbuckets =
3838, maxcount = 20
[...]
[    3.743889] dcache: d_hash_shift = 20, nentries = 17112, nbuckets =
3992, maxcount = 91
[    4.113855] dcache: d_hash_shift = 20, nentries = 17927, nbuckets =
4000, maxcount = 92
[   11.915833] dcache: d_hash_shift = 20, nentries = 27988, nbuckets =
4084, maxcount = 93

View attachment "dcache_buckets.patch" of type "text/x-patch" (4381 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ