netdev - Re: [PATCH] HTB O(1) class lookup

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <200702051921.20146.simonl@parknet.dk>
Date:	Mon, 5 Feb 2007 19:21:19 +0100
From:	Simon Lodal <simonl@...knet.dk>
To:	Andi Kleen <ak@...e.de>
Cc:	Patrick McHardy <kaber@...sh.net>, netdev@...r.kernel.org,
	lartc@...lman.ds9a.nl
Subject: Re: [PATCH] HTB O(1) class lookup

On Thursday 01 February 2007 12:30, Andi Kleen wrote:
> Simon Lodal <simonl@...knet.dk> writes:
> > Memory is generally not an issue, but CPU is, and you can not beat the
> > CPU efficiency of plain array lookup (always faster, and constant time).
>
> Actually that's not true when the array doesn't fit in cache.
>
> The cost of going out to memory over caches is so large (factor 100 and
> more) that often algorithms with smaller cache footprint can easily beat
> algorithms that execute much less instructions if it has less cache misses.
> That is because not all instructions have the same cost; anything
> in core is very fast but going out to memory is very slow.
>
> That said I think I agree with your analysis that a two level
> array is probably the right data structure for this and likely
> not less cache efficient than the hash table.

Good point.

The 2-level lookup generates 3 memory accesses (including getting at the 
htb_class struct). List traversal will generate many more memory accesses, 
unless the lists have 3 or fewer entries (currently that only holds true for 
up to 48 classes, uniformly distributed).

It is difficult to judge if the tables will be in cache or not. The tables are 
clearly extra baggage for the cachelines, compared to only having the 
htb_class structs (they are going to be fetched anyway).

On the other hand, if you have 10k classes, they are usually (real world) 
allocated for individual users, of which at most half are active at any time. 
With hashing, all 10k classes are fetched into cachelines all the time, only 
in order to traverse lists. That is >150k wasted cache (5000 x 32 bytes); 
plenty for 10k pointers in lookup tables.

Regards
Simon
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html