Date:	Sun, 16 Dec 2007 18:56:55 +0100
From:	Eric Dumazet <dada1@...mosbay.com>
To:	Matt Mackall <mpm@...enic.com>
CC:	Andrew Morton <akpm@...ux-foundation.org>,
	Linux kernel <linux-kernel@...r.kernel.org>,
	Adrian Bunk <bunk@...nel.org>
Subject: Re: [RANDOM] Move two variables to read_mostly section to save memory

Matt Mackall wrote:
> On Sun, Dec 16, 2007 at 12:45:01PM +0100, Eric Dumazet wrote:
>> While examining the vmlinux namelist on i686, I noticed:
>>
>> c0581300 D random_table
>> c0581480 d input_pool
>> c0581580 d random_read_wakeup_thresh
>> c0581584 d random_write_wakeup_thresh
>> c0581600 d blocking_pool
>>
>> That means that the two integers random_read_wakeup_thresh and 
>> random_write_wakeup_thresh occupy a full cache line (128 bytes).
> 
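
(In code terms, the change in $subject is roughly the following; a
sketch, and the initial values are assumptions, not copied from the
patch:

	/* These sysctl knobs are written almost never but read often,
	 * so tagging them __read_mostly moves them into the
	 * .data.read_mostly section, away from the padded pools. */
	static int random_read_wakeup_thresh __read_mostly = 64;
	static int random_write_wakeup_thresh __read_mostly = 128;
)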
> But why did that happen?
> 
> Probably because of this:
> 
>         spinlock_t lock ____cacheline_aligned_in_smp;
> 
> in struct entropy_store.
> 
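
(For reference, entropy_store looked roughly like this at the time; a
from-memory sketch, field list approximate:

	struct entropy_store {
		/* mostly-constant data: */
		struct poolinfo *poolinfo;
		__u32 *pool;
		const char *name;

		/* read-write data: */
		spinlock_t lock ____cacheline_aligned_in_smp;
		unsigned add_ptr;
		int entropy_count;
		int input_rotate;
	};

Aligning the lock member to the cache line pads everything around it
out to cache-line boundaries, which is exactly what leaves the two
thresholds alone on a 128-byte line in the namelist above.)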
> And that comes from here:
> 
> http://git.kernel.org/?p=linux/kernel/git/tglx/history.git;a=commitdiff;h=47b54fbff358a1d5ee4738cec8a53a08bead72e4;hp=ce334bb8f0f084112dcfe96214cacfa0afba7e10
> 
> So we could save more memory by just dropping that alignment.
> 
> The trick is to improve the scalability without it. Currently, for
> every 10 bytes read, we hash the whole output pool and do three
> feedback cycles, each grabbing the lock briefly and releasing it. We
> also need to grab the lock every 128 bytes to do some accounting. So
> we do 40 locks every 128 output bytes! Also, the output pool itself
> gets bounced back and forth like mad too.
> 
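
(Working that arithmetic out: at 10 bytes per extraction, 128 bytes is
ceil(128/10) = 13 extractions; 3 lock round-trips each gives 39, plus
the accounting lock once per 128 bytes, hence the ~40.)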
> I'm actually in the middle of redoing some patches that will reduce
> this to one lock per 10 bytes, or 14 locks per 128 bytes. 
> 
> But we can't do much better than that without some fairly serious
> restructuring. Like switching to SHA-512, which would take us to one
> lock for every 32 output bytes, or 5 locks per 128 bytes with accounting.
> 
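
(Same arithmetic: 13 extractions at one lock each plus one accounting
lock gives the 14; with 32 usable bytes per SHA-512 extraction it is
128/32 = 4 locks plus one for accounting, hence the 5.)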
> We could also switch to per-cpu output pools for /dev/urandom, which
> would add 128 bytes of data per CPU, but would eliminate the lock
> contention and the pool cacheline bouncing. Is it worth the added
> complexity? Probably not.
> 
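
(A sketch of what that per-CPU variant could look like; hypothetical
code, not an actual patch:

	static DEFINE_PER_CPU(struct entropy_store, urandom_pool);

	/* On the /dev/urandom read path each CPU would extract from
	 * its own pool: no shared spinlock to contend on, and the
	 * pool's cache lines stay CPU-local instead of bouncing. */
	struct entropy_store *pool = &get_cpu_var(urandom_pool);
	/* ... extract from *pool ... */
	put_cpu_var(urandom_pool);

The price is the extra per-CPU pool state noted above, plus having to
seed and reseed each pool separately.)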

Yes, this reminds me that prefetch_range(r->pool, wordmask); is wrong (it
should be prefetch_range(r->pool, wordmask*4);), so I am not sure how it
could have helped David get better performance... :(
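
That is, assuming prefetch_range() takes its length in bytes, the call
should read:

	/* r->pool is an array of __u32 and wordmask counts words, so
	 * the span to prefetch is wordmask * 4 bytes, not wordmask. */
	prefetch_range(r->pool, wordmask * sizeof(__u32));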

