Date:   Tue, 10 Nov 2020 14:54:43 +0100
From:   Marco Elver <elver@...gle.com>
To:     Dmitry Vyukov <dvyukov@...gle.com>
Cc:     Anders Roxell <anders.roxell@...aro.org>,
        Stephen Rothwell <sfr@...b.auug.org.au>,
        Alexander Potapenko <glider@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Jann Horn <jannh@...gle.com>,
        Linux Next Mailing List <linux-next@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: linux-next: Tree for Nov 5

On Tue, 10 Nov 2020 at 10:36, Dmitry Vyukov <dvyukov@...gle.com> wrote:
[...]
> > > On Tue, Nov 10, 2020 at 8:50 AM Anders Roxell <anders.roxell@...aro.org> wrote:
[...]
> > > > When building an arm64 allmodconfig and booting up that in qemu I see
> > > >
> > > > [10011.092394][   T28] task:kworker/0:2     state:D stack:26896 pid:
> > > > 1840 ppid:     2 flags:0x00000428
> > > > [10022.368093][   T28] Workqueue: events toggle_allocation_gate
> > > > [10024.827549][   T28] Call trace:
> > > > [10027.152494][   T28]  __switch_to+0x1cc/0x1e0
> > > > [10031.378073][   T28]  __schedule+0x730/0x800
> > > > [10032.164468][   T28]  schedule+0xd8/0x160
> > > > [10033.886807][   T28]  toggle_allocation_gate+0x16c/0x220
> > > > [10038.477987][   T28]  process_one_work+0x5c0/0x980
> > > > [10039.900075][   T28]  worker_thread+0x428/0x720
> > > > [10042.782911][   T28]  kthread+0x23c/0x260
> > > > [10043.171725][   T28]  ret_from_fork+0x10/0x18
> > > > [10046.227741][   T28] INFO: lockdep is turned off.
> > > > [10047.732220][   T28] Kernel panic - not syncing: hung_task: blocked tasks
> > > > [10047.741785][   T28] CPU: 0 PID: 28 Comm: khungtaskd Tainted: G
> > > >   W         5.10.0-rc2-next-20201105-00006-g7af110e4d8ed #1
> > > > [10047.755348][   T28] Hardware name: linux,dummy-virt (DT)
> > > > [10047.763476][   T28] Call trace:
> > > > [10047.769802][   T28]  dump_backtrace+0x0/0x420
> > > > [10047.777104][   T28]  show_stack+0x38/0xa0
> > > > [10047.784177][   T28]  dump_stack+0x1d4/0x278
> > > > [10047.791362][   T28]  panic+0x304/0x5d8
> > > > [10047.798202][   T28]  check_hung_uninterruptible_tasks+0x5e4/0x640
> > > > [10047.807056][   T28]  watchdog+0x138/0x160
> > > > [10047.814140][   T28]  kthread+0x23c/0x260
> > > > [10047.821130][   T28]  ret_from_fork+0x10/0x18
> > > > [10047.829181][   T28] Kernel Offset: disabled
> > > > [10047.836274][   T28] CPU features: 0x0240002,20002004
> > > > [10047.844070][   T28] Memory Limit: none
> > > > [10047.853599][   T28] ---[ end Kernel panic - not syncing: hung_task:
> > > > blocked tasks ]---
> > > >
> > > > if I build with KFENCE=n it boots up eventually, here's my .config file [2].
> > > >
> > > > Any idea what may happen?
> > > >
> > > > it happens on next-20201109 also, but it takes longer until we get the
> > > > "Call trace:".
> > > >
> > > > Cheers,
> > > > Anders
> > > > [1] http://ix.io/2Ddv
> > > > [2] https://people.linaro.org/~anders.roxell/allmodconfig-next-20201105.config
[...]
> > oh I missed to say that this is the full boot log with the kernel
> > panic http://ix.io/2Ddv
>
> Thanks!
> The last messages before the hang are:
>
> [ 1367.791522][    T1] Running tests on all trace events:
> [ 1367.815307][    T1] Testing all events:
>
> I can imagine tracing somehow interferes with kfence.

The reason is simply that this config is very slow on qemu (enabling
lockdep helped), and the test that is running doesn't result in any
allocations for an extended time. Because of that, our wait_event()
just stalls, since no allocations are coming in. The task then sits in
uninterruptible sleep long enough for the hung task detector to fire.
My guess is that this scenario is unique to early boot, where user
space is not yet running, combined with a selftest that performs no
allocations for some time.
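
To illustrate the stall in isolation, here is a rough user-space analogue
in C. It is only a sketch under stated assumptions, not the KFENCE code and
not the contents of any patch: "struct gate", gate_init(), gate_open() and
gate_wait_timeout() are hypothetical names, and pthread condition variables
stand in for the kernel wait queue. The point is that a waiter with no
deadline blocks for as long as the other side produces no events, while a
bounded wait returns periodically even if nothing arrives.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical user-space stand-in for a gate opened by an "allocation". */
struct gate {
	pthread_mutex_t lock;
	pthread_cond_t  cond;
	bool            open;	/* set once an event has arrived */
};

static void gate_init(struct gate *g)
{
	pthread_mutex_init(&g->lock, NULL);
	pthread_cond_init(&g->cond, NULL);
	g->open = false;
}

/* Producer side: mark the gate open and wake one waiter. */
static void gate_open(struct gate *g)
{
	pthread_mutex_lock(&g->lock);
	g->open = true;
	pthread_cond_signal(&g->cond);
	pthread_mutex_unlock(&g->lock);
}

/*
 * Wait until the gate opens or timeout_ms milliseconds have passed.
 * Returns true if the gate was open on return, false on timeout.
 * Without the deadline, this loop would block for as long as no
 * event arrives, which is the stall described above.
 */
static bool gate_wait_timeout(struct gate *g, long timeout_ms)
{
	struct timespec deadline;
	bool opened;

	assert(timeout_ms >= 0);

	/* pthread_cond_timedwait() takes an absolute CLOCK_REALTIME time. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec  += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&g->lock);
	while (!g->open) {
		/* Re-check the predicate after every wakeup; stop on timeout. */
		if (pthread_cond_timedwait(&g->cond, &g->lock, &deadline) == ETIMEDOUT)
			break;
	}
	opened = g->open;
	pthread_mutex_unlock(&g->lock);
	return opened;
}
```

A caller that gets false back can simply loop and wait again, so it never
stays blocked longer than the chosen timeout in one go.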

Please give this a spin:
https://lkml.kernel.org/r/20201110135320.3309507-1-elver@google.com

Thanks,
-- Marco
