lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <877gz5ze82.fsf@xmission.com>
Date:	Wed, 29 Feb 2012 10:28:13 -0800
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	"David Laight" <David.Laight@...LAB.COM>
Cc:	"Eric Dumazet" <eric.dumazet@...il.com>,
	"David Miller" <davem@...emloft.net>,
	<paul.gortmaker@...driver.com>, <tim.bird@...sony.com>,
	<kuznet@....inr.ac.ru>, <linux-kernel@...r.kernel.org>,
	<netdev@...r.kernel.org>
Subject: Re: RFC: memory leak in udp_table_init

"David Laight" <David.Laight@...LAB.COM> writes:

>> It was to match the comment we have few lines above :
>> 
>> /*
>>  * The pid hash table is scaled according to the amount of memory in
> the
>>  * machine.  From a minimum of 16 slots up to 4096 slots at one
> gigabyte or
>>  * more.
>>  */

The comment was actually correct until someone converted the code to use
alloc_large_system_hash.

> These large hash tables are, IMHO, an indication that the
> algorithm used is, perhaps, suboptimal.
>
> Not least of the problems is actually finding a suitable
> (and fast) hash function that will work with the actual
> real-life data.
>
> The pid table is a good example of something where a hash
> table is unnecessary.
> Linux should steal the code I put into NetBSD :-)

On this unrelated topic.  What algorithm did you use on NetBSD for
dealing with pids?

A hash chain length of 1 for common sizes is a pretty attractive
algorithm, as it minimizes the number of cache line misses.  Ages ago
when I modeled the linux situation that is what I was seeing.

The normal case for the linux pid hash table is that it has 4096
entries taking 16k or 32k of memory and the typical process load
has about 200 or so processes.    Making it very easy in the common
case to have a single entry hash chain.

All of the other algorithms I know have a tree structure and thus
more cache misses (as you traverse the tree) and ultimately worse
real world performance.

Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ