Message-ID: <20070221095755.GB308@2ka.mipt.ru>
Date: Wed, 21 Feb 2007 12:57:55 +0300
From: Evgeniy Polyakov <johnpol@....mipt.ru>
To: Eric Dumazet <dada1@...mosbay.com>
Cc: "Michael K. Edwards" <medwards.linux@...il.com>,
David Miller <davem@...emloft.net>, akepner@....com,
linux@...izon.com, netdev@...r.kernel.org, bcrl@...ck.org
Subject: Re: Extensible hashing and RCU
On Wed, Feb 21, 2007 at 10:38:22AM +0100, Eric Dumazet (dada1@...mosbay.com) wrote:
> On Wednesday 21 February 2007 10:27, Evgeniy Polyakov wrote:
> > On Wed, Feb 21, 2007 at 10:15:11AM +0100, Eric Dumazet (dada1@...mosbay.com)
> wrote:
> > > On Wednesday 21 February 2007 09:54, Evgeniy Polyakov wrote:
> > > > I have shown those numbers 4 times already - do you read the mails
> > > > and links? Did you see the artifact Eric showed with his data?
> > >
> > > I showed all your thinking is wrong.
> >
> > You keep mixing up distribution fairness and the complexity of
> > searching for collisions. The Jenkins hash is unfair - with the Linux
> > random generator as the attacker's tool, we end up in a case where the
> > Jenkins hash has some chains 5 times longer than average.
> >
> > Short history:
> > I showed that Jenkins is unfair, and you confirmed that with your data.
> > The other question is the complexity of searching for collisions - you
> > showed that with xor it is quite easy while with Jenkins it is hard,
> > and then I showed that it is not that complex to find collisions in
> > Jenkins either.
>
> Again here is your 'test'
>
> enter_hash(u32 val)
> {
> counter[val & mask]++;
> }
>
> for (i = 0 ; i < 1000 ; i++)
> enter_hash(CONSTANT1)
> for (i = 0 ; i < 1000 ; i++)
> enter_hash(CONSTANT2)
>
> Oh my god, I have two chains with 1000 elems each.
Drop me the phone number of your drug dealer - I want that crack too :)
Code is pretty simple:
hash = jhash_3words(const, const, random_of_2^16, const_or_random_per_run);
table[hash & (hash_size - 1)]++;
> Real data are :
> 1) XOR hash, with a load factor of 0.41
>
> Distribution of sockets/chain length
> [chain length]:number of chains (cumulative % of sockets)
> [0]:752255 0%
> [1]:208850 47.455%
> [2]:54281 72.1225%
> [3]:19236 85.235%
> [4]:8199 92.6869%
> [5]:3487 96.6485%
> [6]:1489 98.6785%
> [7]:515 99.4976%
> [8]:192 99.8466%
> [9]:53 99.955%
> [10]:14 99.9868%
> [11]:3 99.9943%
> [12]:1 99.997%
> [13]:1 100%
> total : 440101 sockets, Avg lookup cost=3.07896 cache lines
>
> 2) Jenkins hash, same load factor
> [0]:688813 0%
> [1]:289874 65.8653%
> [2]:60452 93.3372%
> [3]:8493 99.1266%
> [4]:879 99.9255%
> [5]:62 99.9959%
> [6]:3 100%
> total : 440101 sockets, Avg lookup cost=2.4175 cache lines
Please read my e-mail about fairness and the different usage models I
found to be vulnerable.
Your data only shows that you feel safe with the Jenkins hash, while it
has problems too (just as xor does).
With the usage case described above, Jenkins will have entries with 50
elements in the chain - you simply have not received such packets yet,
or you use jhash_2words().
--
Evgeniy Polyakov
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html