Message-ID: <20071216173416.GW19691@waste.org>
Date:	Sun, 16 Dec 2007 11:34:17 -0600
From:	Matt Mackall <mpm@...enic.com>
To:	Eric Dumazet <dada1@...mosbay.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Linux kernel <linux-kernel@...r.kernel.org>,
	Adrian Bunk <bunk@...nel.org>
Subject: Re: [RANDOM] Move two variables to read_mostly section to save memory

On Sun, Dec 16, 2007 at 12:45:01PM +0100, Eric Dumazet wrote:
> While examining the vmlinux namelist on i686, I noticed:
> 
> c0581300 D random_table
> c0581480 d input_pool
> c0581580 d random_read_wakeup_thresh
> c0581584 d random_write_wakeup_thresh
> c0581600 d blocking_pool
> 
> That means that the two integers random_read_wakeup_thresh and
> random_write_wakeup_thresh occupy a full cache line (128 bytes).

But why did that happen?

Probably because of this:

        spinlock_t lock ____cacheline_aligned_in_smp;

in struct entropy_store.
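
For reference, the surrounding layout looks roughly like this (field
list abbreviated and from memory, not the literal random.c
declaration):

        struct entropy_store {
                /* mostly-read data: */
                struct poolinfo *poolinfo;
                __u32 *pool;
                const char *name;

                /* read-write data, forced onto its own cache line: */
                spinlock_t lock ____cacheline_aligned_in_smp;
                unsigned add_ptr;
                int entropy_count;
                int input_rotate;
        };

The aligned member makes the whole struct cacheline-aligned and
padded, so the statically allocated pools each start on a 128-byte
boundary and the two thresholds sitting between them inherit a full
128-byte slot of their own.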

And that comes from here:

http://git.kernel.org/?p=linux/kernel/git/tglx/history.git;a=commitdiff;h=47b54fbff358a1d5ee4738cec8a53a08bead72e4;hp=ce334bb8f0f084112dcfe96214cacfa0afba7e10

So we could save more memory by just dropping that alignment.
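
That is, simply (untested):

        -       spinlock_t lock ____cacheline_aligned_in_smp;
        +       spinlock_t lock;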

The trick is to improve the scalability without it. Currently, for
every 10 bytes read, we hash the whole output pool and do three
feedback cycles, each grabbing the lock briefly and releasing it. We
also need to grab the lock every 128 bytes to do some accounting. So
we do about 40 lock acquisitions per 128 output bytes! And the output
pool's cachelines get bounced back and forth like mad.
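
Spelling out the arithmetic:

        128 / 10 bytes per extraction    -> 13 extractions (rounded up)
        13 extractions * 3 lock grabs    =  39
        + 1 accounting lock per 128B     -> ~40 locks per 128 output bytes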

I'm actually in the middle of redoing some patches that will reduce
this to one lock per 10 bytes, or 14 locks per 128 bytes. 

But we can't do much better than that without some fairly serious
restructuring. Like switching to SHA-512, which would take us to one
lock for every 32 output bytes, or 5 locks per 128 bytes with accounting.
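
Same arithmetic for those figures, assuming SHA-512's 64-byte digest
gets folded in half to 32 bytes the way SHA-1's 20 bytes becomes 10
today:

        one lock per 10 bytes:  ceil(128/10) = 13, + 1 accounting = 14
        one lock per 32 bytes:  ceil(128/32) =  4, + 1 accounting =  5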

We could also switch to per-cpu output pools for /dev/urandom, which
would add 128 bytes of data per CPU, but would eliminate the lock
contention and the pool cacheline bouncing. Is it worth the added
complexity? Probably not.
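
A sketch of what I mean (hypothetical, names made up):

        /* 128 bytes of /dev/urandom pool state per CPU; no shared
         * lock and no cross-CPU cacheline bouncing. */
        struct urandom_pool {
                __u32 pool[32];                 /* 128 bytes */
        };
        static DEFINE_PER_CPU(struct urandom_pool, urandom_pools);

        /* extraction path: no spinlock, just pin to the local CPU */
        struct urandom_pool *p = &get_cpu_var(urandom_pools);
        /* ... hash p->pool and fold into the caller's buffer ... */
        put_cpu_var(urandom_pools);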

-- 
Mathematics is the supreme nostalgia of our time.