Date:   Thu, 16 Mar 2023 08:58:20 +0100
From:   Ard Biesheuvel <ardb@...nel.org>
To:     Andrea Righi <andrea.righi@...onical.com>
Cc:     "Jason A. Donenfeld" <Jason@...c4.com>,
        Paolo Pisati <paolo.pisati@...onical.com>,
        linux-efi@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: kernel 6.2 stuck at boot (efi_call_rts) on arm64

Hello Andrea,

On Thu, 16 Mar 2023 at 08:54, Andrea Righi <andrea.righi@...onical.com> wrote:
>
> Hello,
>
> The latest v6.2.6 kernel fails to boot on some arm64 systems: the kernel
> gets stuck and never completes the boot. On the console I see this:
>
> [   72.043484] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> [   72.049571] rcu:     22-...0: (30 GPs behind) idle=b10c/1/0x4000000000000000 softirq=164/164 fqs=6443
> [   72.058520]     (detected by 28, t=15005 jiffies, g=449, q=174 ncpus=32)
> [   72.064949] Task dump for CPU 22:
> [   72.068251] task:kworker/u64:5   state:R  running task     stack:0     pid:447   ppid:2      flags:0x0000000a
> [   72.078156] Workqueue: efi_rts_wq efi_call_rts
> [   72.082595] Call trace:
> [   72.085029]  __switch_to+0xbc/0x100
> [   72.088508]  0xffff80000fe83d4c
>
> After that, as a consequence, I start to get a lot of hung task timeout traces.
>
> I tried to bisect the problem and I found that the offending commit is
> this one:
>
>  e7b813b32a42 ("efi: random: refresh non-volatile random seed when RNG is initialized")
>
> I've reverted this commit for now and everything works just fine, but I
> was wondering if the problem could be caused by a lack of entropy on
> these arm64 boxes or something else.
>
> Any suggestion? Let me know if you want me to do any specific test.
>

Thanks for the report.

This is most likely the EFI SetVariable() call going off into the
weeds and never returning.

Is this an Ampere Altra system by any chance? Do you see it on
different types of hardware?

Could you check whether SetVariable() works on this system at all, e.g., by
updating the EFI boot timeout (sudo efibootmgr -t <n>)?
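
If efibootmgr is not installed, a rough equivalent is to write the Timeout
variable through efivarfs, which also ends up in the firmware's SetVariable().
The following is only a minimal sketch under some assumptions: efivarfs is
mounted at /sys/firmware/efi/efivars, the Timeout variable already exists,
and the script runs as root. The helper names are made up for illustration.
On an affected system the write may simply hang, which is exactly the
symptom being tested for.

```python
#!/usr/bin/env python3
"""Sketch: exercise EFI SetVariable() by rewriting Timeout via efivarfs."""
import os
import struct
import sys
from typing import Tuple

# EFI global variable GUID; efivarfs names files <Name>-<GUID>.
EFI_GLOBAL_GUID = "8be4df61-93ca-11d2-aa0d-00e098032b8c"
# Assumption: efivarfs is mounted at its usual location.
TIMEOUT_PATH = "/sys/firmware/efi/efivars/Timeout-" + EFI_GLOBAL_GUID

# NON_VOLATILE | BOOTSERVICE_ACCESS | RUNTIME_ACCESS
DEFAULT_ATTRS = 0x1 | 0x2 | 0x4


def encode_efivar(attrs: int, data: bytes) -> bytes:
    """efivarfs file layout: 32-bit little-endian attributes, then the data."""
    return struct.pack("<I", attrs) + data


def decode_efivar(raw: bytes) -> Tuple[int, bytes]:
    """Split an efivarfs file image into (attributes, data)."""
    if len(raw) < 4:
        raise ValueError("efivarfs content shorter than the attribute header")
    (attrs,) = struct.unpack_from("<I", raw)
    return attrs, raw[4:]


def timeout_payload(seconds: int, attrs: int = DEFAULT_ATTRS) -> bytes:
    """Timeout is a 16-bit little-endian number of seconds."""
    return encode_efivar(attrs, struct.pack("<H", seconds))


def rewrite_timeout(path: str = TIMEOUT_PATH, seconds: int = -1) -> bytes:
    """Write Timeout back (unchanged if seconds < 0); returns what was written."""
    with open(path, "rb") as f:
        attrs, data = decode_efivar(f.read())
    payload = encode_efivar(attrs, data) if seconds < 0 else timeout_payload(seconds, attrs)
    # efivarfs expects attributes and data in a single write() call.
    fd = os.open(path, os.O_WRONLY)
    try:
        os.write(fd, payload)
    finally:
        os.close(fd)
    return payload


if __name__ == "__main__":
    new_value = int(sys.argv[1]) if len(sys.argv) > 1 else -1
    written = rewrite_timeout(seconds=new_value)
    print("SetVariable returned; wrote", written.hex())
```

Run without arguments it writes the current value back unchanged; with a
number it sets that timeout in seconds. If the script prints its final line,
SetVariable() returned.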
