linux-kernel - Re: RFC: memory leak in udp_table

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <877gz5ze82.fsf@xmission.com>
Date:	Wed, 29 Feb 2012 10:28:13 -0800
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	"David Laight" <David.Laight@...LAB.COM>
Cc:	"Eric Dumazet" <eric.dumazet@...il.com>,
	"David Miller" <davem@...emloft.net>,
	<paul.gortmaker@...driver.com>, <tim.bird@...sony.com>,
	<kuznet@....inr.ac.ru>, <linux-kernel@...r.kernel.org>,
	<netdev@...r.kernel.org>
Subject: Re: RFC: memory leak in udp_table_init

"David Laight" <David.Laight@...LAB.COM> writes:

>> It was to match the comment we have few lines above :
>> 
>> /*
>>  * The pid hash table is scaled according to the amount of memory in
> the
>>  * machine.  From a minimum of 16 slots up to 4096 slots at one
> gigabyte or
>>  * more.
>>  */

The comment was actually correct until someone converted the code to use
alloc_large_system_hash.

> These large hash tables are, IMHO, an indication that the
> algorithm used is, perhaps, suboptimal.
>
> Not least of the problems is actually finding a suitable
> (and fast) hash function that will work with the actual
> real-life data.
>
> The pid table is a good example of something where a hash
> table is unnecessary.
> Linux should steal the code I put into NetBSD :-)

On this unrelated topic.  What algorithm did you use on NetBSD for
dealing with pids?

A hash chain length of 1 for common sizes is a pretty attractive
algorithm, as it minimizes the number of cache line misses.  Ages ago
when I modeled the linux situation that is what I was seeing.

The normal case for the linux pid hash table is that it has 4096
entries taking 16k or 32k of memory and the typical process load
has about 200 or so processes.    Making it very easy in the common
case to have a single entry hash chain.

All of the other algorithms I know have a tree structure and thus
more cache misses (as you traverse the tree) and ultimately worse
real world performance.

Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/