lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 02 May 2016 11:39:28 +0200
From:	Torvald Riegel <triegel@...hat.com>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Thomas Gleixner <tglx@...utronix.de>,
	Rik van Riel <riel@...hat.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...nel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
	Darren Hart <darren@...art.com>,
	Michael Kerrisk <mtk.manpages@...glemail.com>,
	Davidlohr Bueso <dave@...olabs.net>, Chris Mason <clm@...com>,
	"Carlos O'Donell" <carlos@...hat.com>,
	Eric Dumazet <edumazet@...gle.com>
Subject: Re: [patch 2/7] lib/hashmod: Add modulo based hash mechanism

On Fri, 2016-04-29 at 16:51 -0700, Linus Torvalds wrote:
> On Fri, Apr 29, 2016 at 2:10 PM, Linus Torvalds
> <torvalds@...ux-foundation.org> wrote:
> > On Thu, Apr 28, 2016 at 11:32 AM, Linus Torvalds
> > <torvalds@...ux-foundation.org> wrote:
> >>
> >> hash_long/ptr really shouldn't care about the low bits being zero at
> >> all, because it should mix in all the bits (using a prime multiplier
> >> and taking the high bits).
> >
> > Looking at this assertion, it doesn't actually pan out very well at all.
> >
> > The 32-bit hash looks to be ok. Certainly not perfect, but not horrible.
> >
> > The 64-bit hash seems to be quite horribly bad with lots of values.
> 
> Ok, I have tried to research this a bit more. There's a lot of
> confusion here that causes the fact that hash_64() sucks donkey balls.
> 
> The basic method for the hashing method by multiplication is fairly
> sane. It's well-documented, and attributed to Knuth as the comment
> above it says.

Section 2.2 of this article might also be of interest:

M. Dietzfelbinger, T. Hagerup, J. Katajainen, and M. Penttonen. A Re-
liable Randomized Algorithm for the Closest-Pair Problem. Journal of
Algorithms, 25(1):19 – 51, 1997.

I don't know much about hash functions, but my understanding of this is
that one can do good hashing of words by multiplying with a random
number and using the most-significant bits of the lower-significant word
of the result.  Different random numbers may work better than others for
the input data (and some might be really awful), but the paper seems to
claim that one *can* find a good random number for all input data.  In
practice, this might mean that re-hashing with a different random number
after seeing too many conflicts in a hash table should eventually lead
to a good hashing (ie, because the *class* of such hash functions is
universal).

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ