Message-Id: <200702191614.49666.dada1@cosmosbay.com>
Date: Mon, 19 Feb 2007 16:14:49 +0100
From: Eric Dumazet <dada1@...mosbay.com>
To: Evgeniy Polyakov <johnpol@....mipt.ru>
Cc: akepner@....com, linux@...izon.com, davem@...emloft.net,
netdev@...r.kernel.org, bcrl@...ck.org
Subject: Re: Extensible hashing and RCU
On Monday 19 February 2007 15:25, Evgeniy Polyakov wrote:
> On Mon, Feb 19, 2007 at 03:14:02PM +0100, Eric Dumazet (dada1@...mosbay.com) wrote:
> > > Forget about cache misses and cache lines - we have a hash table, only
> > > part of which is used (part for time-wait sockets, part for established
> > > ones).
> >
> > No, you did not read my mail. The current ehash is not as described by you.
>
> I did.
> And I also said that my tests do not have timewait sockets at all - I
> removed sk_for_each and so on, which should effectively double the lookup
> time on a busy system with lots of sockets created/removed per
> timeframe (that is theory from my side already).
> Anyway, I ran the same test with increased table too.
>
> > > Anyway, even with 2^20 (i.e. when the whole table is only used for
> > > established sockets) search time is about 360-370 nsec on 3.7 GHz Core
> > > Duo (only one CPU is used) with 2 GB of ram.
> >
> > Your tests are user land, so unfortunatly are biased...
> >
> > (Unless you use hugetlb data ?)
>
> No I do not. But the same can be applied to trie test - it is also
> performed in userspace and thus suffers from possible swapping/cache
> flushing and so on.
>
> And I doubt moving test into kernel will suddenly end up with 10 times
> increased rates.
At least some architectures pay a high price using vmalloc() instead of
kmalloc(), and TLB misses mean something for them. Not everybody has the
latest Intel CPU. Normally, the ehash table uses huge pages.
>
> Anyway, the trie test (broken implementation) is two times slower than the
> hash table (already resized), and it does not include the locking issues of
> the hash table access and further scalability issues.
>
You are mixing apples and oranges. We already know locking has nothing to do
with hashing or trie-ing. We *can* put RCU on top of the existing ehash. We
can also add hash resizing if we really care.
> I think I need to fix my trie implementation to fully show its
> potential, but original question was why tree/trie based implementation
> is not considered at all although it promises better performance and
> scalability.
Because you are mixing performance and scalability. That's not exactly the same
thing. Sometimes, high performance means *suboptimal* scalability.
Because O(1) is different from O(log(N))?
If N = 2^20, it certainly makes a difference.
Yes, 1% of chains might have a length > 10, but 99% of the lookups still touch
fewer than 4 cache lines.
With a binary tree, log2(2^20) is 20. Or maybe not? If you tell me it's 4, I
will be very pleased.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html