Message-ID: <YuRZ4tC+GY+hymFd@zx2c4.com>
Date:   Sat, 30 Jul 2022 00:06:26 +0200
From:   "Jason A. Donenfeld" <Jason@...c4.com>
To:     Florian Weimer <fweimer@...hat.com>
Cc:     linux-kernel@...r.kernel.org, linux-crypto@...r.kernel.org,
        x86@...nel.org, Nadia Heninger <nadiah@...ucsd.edu>,
        Thomas Ristenpart <ristenpart@...nell.edu>,
        Theodore Ts'o <tytso@....edu>,
        Vincenzo Frascino <vincenzo.frascino@....com>,
        Adhemerval Zanella Netto <adhemerval.zanella@...aro.org>
Subject: Re: [PATCH RFC v1] random: implement getrandom() in vDSO

Hey Florian,

Thanks for the feedback.

On Fri, Jul 29, 2022 at 10:19:05PM +0200, Florian Weimer wrote:
> > +	if (getcpu(&start, NULL, NULL) == 0)
> > +		start %= NUM_BUCKETS;
> 
> getcpu is not available everywhere.  Userspace/libc should probably
> provide a CPU number hint as an additional argument during the vDSO
> call.  We can load that easily enough from rseq.  That's going to be
> faster on x86, too (the LSL instruction is quite slow).  The only
> advantage of using getcpu like this is that it's compatible with a libc
> that isn't rseq-enabled.
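
For reference, if we did go the rseq route, pulling the hint out of the
registered rseq area in the libc wrapper would look roughly like this
(just a sketch, assuming glibc >= 2.35 for the __rseq_offset/__rseq_size
exports and a compiler that provides __builtin_thread_pointer()):

#include <stdint.h>
#include <sys/rseq.h>

static inline uint32_t rseq_cpu_hint(void)
{
	if (__rseq_size) {
		const struct rseq *rs = (const struct rseq *)
			((char *)__builtin_thread_pointer() + __rseq_offset);
		int32_t cpu = (int32_t)rs->cpu_id;

		/* cpu_id stays negative until rseq is registered. */
		if (cpu >= 0)
			return (uint32_t)cpu;
	}
	return 0; /* shared fallback bucket when rseq isn't available */
}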

Actually, the only requirement is that the hint is reasonably stable and
separates threads most of the time. So a per-thread ID or even a
per-thread address would work fine too. Adhemerval suggested on IRC this
afternoon that the thread pointer register value would be usable for
this purpose. I think what I'll do for v2 is abstract this out to a
__arch_get_bucket_hint() function, or similar, which the different archs
can fill in.
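
To sketch what I mean (only the __arch_get_bucket_hint() name above is
real; the fallback body here is made up, and it assumes the compiler
provides __builtin_thread_pointer() for the arch):

#ifndef __arch_get_bucket_hint
static inline unsigned long __arch_get_bucket_hint(void)
{
	/*
	 * The thread pointer is per-thread, stable, and cheap to read.
	 * Shift and mix the low bits so neighboring TLS blocks don't
	 * all collapse into the same bucket after the modulo.
	 */
	unsigned long tp = (unsigned long)__builtin_thread_pointer();

	return (tp >> 6) ^ (tp >> 16);
}
#endif

The bucket grabber would then start its walk at
__arch_get_bucket_hint() % NUM_BUCKETS instead of calling getcpu().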

> > +	for (i = start;;) {
> > +		struct getrandom_state *state = &buckets[i];
> > +
> > +		if (cmpxchg(&state->in_use, false, true) == false)
> > +			return state;
> > +
> > +		i = i == NUM_BUCKETS - 1 ? 0 : i + 1;
> > +		if (i == start)
> > +			break;
> > +	}
> 
> Surely this scales very badly once the number of buckets is smaller than
> the system processor count?

Right, and there are a few ways that observation can go:

1) It doesn't matter, because who has > 28 threads all churning at once
   here? Is that something real?

2) The state variable is controllable by userspace, so in theory
   different ones could be passed to spread the load. I don't like this
   idea though - it's hard to manage, and userspace doesn't have enough
   information to do it well.

3) Since we know when this kind of contention is hit, it should be
   possible to just expand the map size. Seems a bit complicated.

4) Simply allocate a number of pages relative to the number of CPUs, so
   that this isn't actually a problem in practice. This seems like the
   simplest approach and is probably fine; a rough sketch of the sizing
   is below.
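
To make (4) concrete, the sizing at init time would be something along
these lines (names are made up and this deliberately ignores how the
pages actually get mapped into the vDSO data area; it only shows the
arithmetic):

#include <linux/cpumask.h>
#include <linux/errno.h>
#include <linux/gfp.h>
#include <linux/init.h>
#include <linux/mm.h>

static struct getrandom_state *buckets;
static unsigned int num_buckets;

static int __init vgetrandom_init(void)
{
	/* A couple of buckets per possible CPU, rounded up to whole pages. */
	size_t bytes = PAGE_ALIGN(2 * num_possible_cpus() * sizeof(*buckets));

	buckets = alloc_pages_exact(bytes, GFP_KERNEL | __GFP_ZERO);
	if (!buckets)
		return -ENOMEM;
	num_buckets = bytes / sizeof(*buckets);
	return 0;
}

That way the linear walk above only wraps when concurrently churning
threads outnumber buckets across the whole machine.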

Jason 
