Date:   Tue, 10 Sep 2019 19:33:04 +0200
From:   "Ahmed S. Darwish" <darwish.07@...il.com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Theodore Ts'o <tytso@....edu>,
        Andreas Dilger <adilger.kernel@...ger.ca>,
        Jan Kara <jack@...e.cz>, Ray Strode <rstrode@...hat.com>,
        William Jon McCann <mccann@....edu>,
        zhangjs <zachary@...shancloud.com>, linux-ext4@...r.kernel.org,
        lkml <linux-kernel@...r.kernel.org>
Subject: Re: Linux 5.3-rc8

On Tue, Sep 10, 2019 at 12:33:12PM +0100, Linus Torvalds wrote:
> On Tue, Sep 10, 2019 at 5:21 AM Ahmed S. Darwish <darwish.07@...il.com> wrote:
> >
> > The commit b03755ad6f33 (ext4: make __ext4_get_inode_loc plug), [1]
> > which was merged in v5.3-rc1, *always* leads to a blocked boot on my
> > system due to low entropy.
>
> Exactly what is it that blocks on entropy? Nobody should do that
> during boot, because on some systems entropy is really really low
> (think flash memory with polling IO etc).
>

Ok, I've tracked it down further. It's unfortunately GDM
intentionally blocking on a getrandom(buf, 16, 0).
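
For illustration only (this is not GDM's code), here is a minimal sketch of
the difference between that blocking call and a non-blocking probe. It
assumes Linux and Python 3.6+, where os.getrandom() wraps getrandom(2); the
helper name try_getrandom is made up for this example.

```python
import os


def try_getrandom(nbytes=16):
    """Return nbytes from getrandom(2) without blocking.

    Returns None if the kernel CRNG is not initialized yet.
    """
    try:
        # With flags=0 this call would sleep until "crng init done".
        # GRND_NONBLOCK makes it fail with EAGAIN instead, which Python
        # surfaces as BlockingIOError.
        return os.getrandom(nbytes, os.GRND_NONBLOCK)
    except BlockingIOError:
        return None


if __name__ == "__main__":
    data = try_getrandom()
    print("CRNG not ready yet" if data is None else data.hex())
```

On an already-booted system the probe should simply return 16 bytes; the
None branch only matters early in boot, before the CRNG is initialized.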

Booting the system with a straced GDM service
("ExecStart=strace -f /usr/bin/gdm") reveals:

  ...
  [  3.779375] strace[262]: [pid   323] execve("/usr/lib/gnome-session-binary",
                                                 ... /* 28 vars */) = 0
  ...
  [  4.019227] strace[262]: [pid   323] getrandom( <unfinished ...>
  [ 79.601433] kernel: random: crng init done
  [ 79.601443] kernel: random: 3 urandom warning(s) missed due to ratelimiting
  [ 79.601262] strace[262]: [pid   323] <... getrandom resumed>..., 16, 0) = 16
  [ 79.601262] strace[262]: [pid   323] getrandom(..., 16, 0) = 16
  [ 79.603041] strace[262]: [pid   323] getrandom(..., 16, 0) = 16
  [ 79.603041] strace[262]: [pid   323] getrandom(..., 16, 0) = 16
  [ 79.603041] strace[262]: [pid   323] getrandom(..., 16, 0) = 16

As can be seen in the timestamps, the GDM boot only continued
after I typed randomly on the keyboard.
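
The length of the stall can be read off the timestamps mechanically. A rough
sketch, assuming log lines in the "[ seconds] message" shape shown above; the
regular expression and the getrandom_stall helper are my own and only handle
the first unfinished/resumed pair:

```python
import re

# Three lines copied from the excerpt above.
LOG = """\
[  4.019227] strace[262]: [pid   323] getrandom( <unfinished ...>
[ 79.601433] kernel: random: crng init done
[ 79.601262] strace[262]: [pid   323] <... getrandom resumed>..., 16, 0) = 16
"""

# "[  4.019227] rest of line" -> (timestamp, message)
TIMESTAMPED = re.compile(r"^\s*\[\s*(\d+\.\d+)\]\s+(.*)$")


def getrandom_stall(log):
    """Seconds between the first unfinished getrandom() and its resume.

    Returns None if the log does not contain both events.
    """
    start = end = None
    for line in log.splitlines():
        match = TIMESTAMPED.match(line)
        if not match:
            continue
        stamp, message = float(match.group(1)), match.group(2)
        if start is None and "getrandom(" in message and "<unfinished" in message:
            start = stamp
        elif start is not None and end is None and "getrandom resumed" in message:
            end = stamp
    if start is None or end is None:
        return None
    return end - start


if __name__ == "__main__":
    print("getrandom() blocked for about %.1f s" % getrandom_stall(LOG))
```

For the excerpt above this works out to roughly 75.6 seconds spent inside a
single 16-byte getrandom() call.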

> That said, I would have expected that any PC gets plenty of entropy.
> Are you sure it's entropy that is blocking, and not perhaps some odd
> "forgot to unplug" situation?
>

Yes, any of the steps below makes the problem reliably disappear:

  - boot param "random.trust_cpu=on"
  - rngd(8) enabled at boot (entropy source: x86 RDRAND + jitter)
  - pressing random 3 or 4 keyboard keys while GDM boot is stuck
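
To check whether the first workaround applies to a given machine, something
like the sketch below could be used. The helper names are made up for this
example; it assumes the x86 /proc/cpuinfo layout with a "flags" line, and it
only looks for the literal boot parameter quoted above, so it says nothing
about a kernel that trusts the CPU by default through its build config.

```python
def has_rdrand(cpuinfo_text):
    """True if the first 'flags' line of /proc/cpuinfo lists rdrand."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            _, _, flags = line.partition(":")
            return "rdrand" in flags.split()
    return False


def trusts_cpu(cmdline_text):
    """True if random.trust_cpu=on appears on the kernel command line."""
    return "random.trust_cpu=on" in cmdline_text.split()


if __name__ == "__main__":
    with open("/proc/cpuinfo") as f:
        print("RDRAND available:", has_rdrand(f.read()))
    with open("/proc/cmdline") as f:
        print("random.trust_cpu=on:", trusts_cpu(f.read()))
```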

> > Can this even be considered a user-space breakage? I'm honestly not
> > sure. On my modern RDRAND-capable x86, just running rng-tools rngd(8)
> > early-on fixes the problem. I'm not sure about the status of older
> > CPUs though.
>
> It's definitely breakage, although rather odd. I would have expected
> us to have other sources of entropy than just the disk. Did we stop
> doing low bits of TSC from timer interrupts etc?
>

Exactly.

While gnome-session is obviously at fault here for requiring
*blocking* randomness in the boot path, it's still not requesting
much, just (5 * 16) bytes to be exact.

I guess an x86 laptop should be able to provide that, even without
RDRAND / random.trust_cpu=on (TSC jitter, etc.) ?

thanks,
--darwi

> Ted, either way - ext4 IO patterns or random number entropy - this is
> your code. Comments?
>
>                  Linus
