Date:	Thu, 05 Nov 2009 16:07:25 +0100
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Andi Kleen <andi@...stfloor.org>
CC:	Octavian Purdila <opurdila@...acom.com>,
	Lucian Adrian Grijincu <lgrijincu@...acom.com>,
	netdev@...r.kernel.org
Subject: Re: [RFC] [PATCH] udp: optimize lookup of UDP sockets by including
 destination address in the hash key

Andi Kleen wrote:
>> I assume cache is cold or even on other cpu (worst case), dealing with
>> 100,000+ sockets or so...
> 
> Other CPU cache hit is actually typically significantly 
> faster than a DRAM access (unless you're talking about a very large NUMA 
> system and a remote CPU far away)

Even if the data is dirty in the remote CPU's cache?

I'm not speaking of shared data. (If data is shared, the workload mostly fits in caches.)

>> If workload fits in one CPU cache/registers, we dont mind taking one
>> or two cache lines per object, obviously.
> 
> It's more like part of your workload needs to fit.
> 
> For example if you use a tree and the higher levels fit into
> the cache, having a few levels in the tree is (approximately) free.
> 
> That's why I'm not always fond of large hash tables. They pretty
> much guarantee a lot of cache misses under high load, because
> they have little locality.

We already had this discussion, Andi, and you know some servers handle 1,000,000+
sockets and 100,000+ frames per second over XX.XXX different flows; a binary tree
means ~20 accesses before reaching the target. Only the first 5 or 6 levels stay
in cache. The machine is barely usable.

A hash table with 2,000,000 slots gives one or two accesses before reaching the
target, and RCU is trivial with hash tables.

Btrees are OK for generalist workloads, but RCU with them is more complex.

--
