[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CACXcFmkvKTzxEzD911ZZq_D_=gv8pcMhnCbqqxn0MrreTaQq+A@mail.gmail.com>
Date: Sun, 12 Jun 2016 08:18:23 -0400
From: Sandy Harris <sandyinchina@...il.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Thomas Gleixner <tglx@...utronix.de>,
LKML <linux-kernel@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Darren Hart <darren@...art.com>,
Michael Kerrisk <mtk.manpages@...glemail.com>,
Davidlohr Bueso <dave@...olabs.net>, Chris Mason <clm@...com>,
"Carlos O'Donell" <carlos@...hat.com>,
Torvald Riegel <triegel@...hat.com>,
Eric Dumazet <edumazet@...gle.com>
Subject: Re: [patch 2/7] lib/hashmod: Add modulo based hash mechanism
On Thu, Apr 28, 2016 at 10:25 PM, Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
> It's the hashes that _look_ like they might be good hashes, but
> there's not a lot of analysis behind it, that I would worry about. The
> simple prime modulus _should_ be fine, but at the same time I kind of
> suspect we can do better. Especially since it has two multiplications.
>
> Looking around, there's
>
> http://burtleburtle.net/bob/hash/integer.html
>
> and that 32-bit "full avalanche" hash in six shifts looks like it
> could be better. You wouldn't want to inline it, but the point of a
> full avalanche bit mixing _should_ be that you could avoid the whole
> "upper bits" part, and it should work independently of the target set
> size.
>
> So if that hash works better, it would be a pretty good replacement
> option for hash_int().
>
> There is also
>
> https://gist.github.com/badboy/6267743
>
> that has a 64 bit to 32 bit hash function that might be useful for
> "hash_long()".
>
> Most of the people who worry about hashes tend to have strings to
> hash, not just a single word like a pointer, but there's clearly
> people around who have tried to search for good hashes that really
> spread out the bits.
>
> Linus
Here's another possibility, from my GPL code at:
https://github.com/sandy-harris/maxwell
Not very efficient -- two each of 32-bit multiply & modulo
in most cases -- but provably good mixing.
/*
Quasi-Hadamard transform
My own invention
Goal is to mix a 32-bit object
so that each output bit depends
on every input bit
Underlying primitive is IDEA
multiplication which mixes
a pair of 16-bit objects
This is analogous to the
pseudo-Hadamard transform
(PHT) originally from the
SAFER cipher, later in
Twofish and others
Conceptually, a two-way PHT
on a,b is:
x = a + b
y = a + 2b
a = x
b = y
This is reversible; it loses
no information. Each output
word depends on both inputs.
A PHT can be implemented as
a += b
b += a
which is faster and avoids
using intermediate variables
QHT is the same thing using
IDEA multiplication instead
of addition, calculating a*b
and a*b^2 instead of a+b and
a+2b
IDEA multiplication operates
on 16-bit chunks and makes
every output bit depend on
all input bits. Therefore
QHT is close to an ideal
mixer for 32-bit words.
*/
u32 qht(u32 x)
{
u32 a, b ;
a = x >> 16 ; // high 16 bits
b = x & 0xffff ; // low 16
a = idea(a,b) ; // a *= b
b = idea(a,b) ; // b *= a
return( (a<<16) | b) ;
}
/*
IDEA multiplication
borrowed from the IDEA cipher
*/
#define MAX (1<<16)
#define MOD (MAX+1)
u32 idea(u32 a, u32 b)
{
u32 x ;
// make sure they are in range
a %= MOD ;
b %= MOD ;
// special cases
if( (a == 0) && (b == 0))
return(1) ;
else if( a == 0)
return(MOD - b) ;
else if( b == 0)
return(MOD - a) ;
// non-special
else {
x = (a*b) % MOD ;
if(x == MAX)
return(0) ;
else return(x) ;
}
}
Powered by blists - more mailing lists