lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <200702051921.20146.simonl@parknet.dk>
Date:	Mon, 5 Feb 2007 19:21:19 +0100
From:	Simon Lodal <simonl@...knet.dk>
To:	Andi Kleen <ak@...e.de>
Cc:	Patrick McHardy <kaber@...sh.net>, netdev@...r.kernel.org,
	lartc@...lman.ds9a.nl
Subject: Re: [PATCH] HTB O(1) class lookup

On Thursday 01 February 2007 12:30, Andi Kleen wrote:
> Simon Lodal <simonl@...knet.dk> writes:
> > Memory is generally not an issue, but CPU is, and you can not beat the
> > CPU efficiency of plain array lookup (always faster, and constant time).
>
> Actually that's not true when the array doesn't fit in cache.
>
> The cost of going out to memory over caches is so large (factor 100 and
> more) that often algorithms with smaller cache footprint can easily beat
> algorithms that execute much less instructions if it has less cache misses.
> That is because not all instructions have the same cost; anything
> in core is very fast but going out to memory is very slow.
>
> That said I think I agree with your analysis that a two level
> array is probably the right data structure for this and likely
> not less cache efficient than the hash table.

Good point.

The 2-level lookup generates 3 memory accesses (including getting at the 
htb_class struct). List traversal will generate many more memory accesses, 
unless the lists have 3 or fewer entries (currently that only holds true for 
up to 48 classes, uniformly distributed).

It is difficult to judge if the tables will be in cache or not. The tables are 
clearly extra baggage for the cachelines, compared to only having the 
htb_class structs (they are going to be fetched anyway).

On the other hand, if you have 10k classes, they are usually (real world) 
allocated for individual users, of which at most half are active at any time. 
With hashing, all 10k classes are fetched into cachelines all the time, only 
in order to traverse lists. That is >150k wasted cache (5000 x 32 bytes); 
plenty for 10k pointers in lookup tables.


Regards
Simon
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ