Message-ID: <20071216173416.GW19691@waste.org>
Date:	Sun, 16 Dec 2007 11:34:17 -0600
From:	Matt Mackall <mpm@...enic.com>
To:	Eric Dumazet <dada1@...mosbay.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Linux kernel <linux-kernel@...r.kernel.org>,
	Adrian Bunk <bunk@...nel.org>
Subject: Re: [RANDOM] Move two variables to read_mostly section to save memory

On Sun, Dec 16, 2007 at 12:45:01PM +0100, Eric Dumazet wrote:
> While examining the vmlinux namelist on i686, I noticed:
> 
> c0581300 D random_table
> c0581480 d input_pool
> c0581580 d random_read_wakeup_thresh
> c0581584 d random_write_wakeup_thresh
> c0581600 d blocking_pool
> 
> That means that the two integers random_read_wakeup_thresh and
> random_write_wakeup_thresh occupy a full cache line (128 bytes).

But why did that happen?

Probably because of this:

        spinlock_t lock ____cacheline_aligned_in_smp;

in struct entropy_store.
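
For reference, the surrounding layout looks roughly like this (field
list abbreviated and from memory, not the literal random.c
declaration):

        struct entropy_store {
                /* mostly-read data: */
                struct poolinfo *poolinfo;
                __u32 *pool;
                const char *name;

                /* read-write data, forced onto its own cache line: */
                spinlock_t lock ____cacheline_aligned_in_smp;
                unsigned add_ptr;
                int entropy_count;
                int input_rotate;
        };

The aligned member makes the whole struct cacheline-aligned and
padded, so the statically allocated pools each start on a 128-byte
boundary and the two thresholds sitting between them inherit a full
128-byte slot of their own.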

And that comes from here:

http://git.kernel.org/?p=linux/kernel/git/tglx/history.git;a=commitdiff;h=47b54fbff358a1d5ee4738cec8a53a08bead72e4;hp=ce334bb8f0f084112dcfe96214cacfa0afba7e10

So we could save more memory by just dropping that alignment.
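
That is, simply (untested):

        -       spinlock_t lock ____cacheline_aligned_in_smp;
        +       spinlock_t lock;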

The trick is to improve the scalability without it. Currently, for
every 10 bytes read, we hash the whole output pool and do three
feedback cycles, each grabbing the lock briefly and releasing it. We
also need to grab the lock every 128 bytes to do some accounting. So
we do about 40 lock acquisitions per 128 output bytes! And the output
pool's cachelines get bounced back and forth like mad.
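
Spelling out the arithmetic:

        128 / 10 bytes per extraction    -> 13 extractions (rounded up)
        13 extractions * 3 lock grabs    =  39
        + 1 accounting lock per 128B     -> ~40 locks per 128 output bytes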

I'm actually in the middle of redoing some patches that will reduce
this to one lock per 10 bytes, or 14 locks per 128 bytes. 

But we can't do much better than that without some fairly serious
restructuring. Like switching to SHA-512, which would take us to one
lock for every 32 output bytes, or 5 locks per 128 bytes with accounting.
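
Same arithmetic for those figures, assuming SHA-512's 64-byte digest
gets folded in half to 32 bytes the way SHA-1's 20 bytes becomes 10
today:

        one lock per 10 bytes:  ceil(128/10) = 13, + 1 accounting = 14
        one lock per 32 bytes:  ceil(128/32) =  4, + 1 accounting =  5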

We could also switch to per-cpu output pools for /dev/urandom, which
would add 128 bytes of data per CPU, but would eliminate the lock
contention and the pool cacheline bouncing. Is it worth the added
complexity? Probably not.
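
A sketch of what I mean (hypothetical, names made up):

        /* 128 bytes of /dev/urandom pool state per CPU; no shared
         * lock and no cross-CPU cacheline bouncing. */
        struct urandom_pool {
                __u32 pool[32];                 /* 128 bytes */
        };
        static DEFINE_PER_CPU(struct urandom_pool, urandom_pools);

        /* extraction path: no spinlock, just pin to the local CPU */
        struct urandom_pool *p = &get_cpu_var(urandom_pools);
        /* ... hash p->pool and fold into the caller's buffer ... */
        put_cpu_var(urandom_pools);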

-- 
Mathematics is the supreme nostalgia of our time.