lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YraWWl+Go17uPOgR@mtj.duckdns.org>
Date:   Sat, 25 Jun 2022 14:00:10 +0900
From:   Tejun Heo <tj@...nel.org>
To:     Petr Mladek <pmladek@...e.com>
Cc:     Lai Jiangshan <jiangshanlai@...il.com>,
        Michal Hocko <mhocko@...e.com>, linux-kernel@...r.kernel.org,
        Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Oleg Nesterov <oleg@...hat.com>,
        "Eric W. Biederman" <ebiederm@...ssion.com>
Subject: re. Spurious wakeup on a newly created kthread

Hello,

cc'ing random assortment of ppl who touched kernel/kthread.c and
others who would know better.

So, Petr debugged a NULL deref in workqueue code to a spurious wakeup
on a newly created kthread. The abbreviated patch description follows.
The original message is at

  http://lkml.kernel.org/r/20220622140853.31383-1-pmladek@suse.com

On Wed, Jun 22, 2022 at 04:08:53PM +0200, Petr Mladek wrote:
> A system crashed with the following BUG() report:
> 
>   [115147.050484] BUG: kernel NULL pointer dereference, address: 0000000000000000
...
>   [115147.050524] Call Trace:
>   [115147.050533]  worker_thread+0xb4/0x3c0
>   [115147.050540]  kthread+0x152/0x170
>   [115147.050544]  ret_from_fork+0x35/0x40
> 
> Further debugging shown that the worker thread was woken
> before worker_attach_to_pool() finished in create_worker().
> 
> Any kthread is supposed to stay in TASK_UNINTERRUPTIBLE sleep
> until it is explicitly woken. But a spurious wakeup might
> break this expectation.
> 
> As a result, worker_thread() might read worker->pool before
> it was set in worker create_worker() by worker_attach_to_pool().
> Also manage_workers() might want to create yet another worker
> before worker->pool->nr_workers is updated. It is a kind off
> a chicken & egg problem.

tl;dr is that the worker creation code expects a newly created worker
kthread to sit tight until the creator finishes setting up stuff and
sends the initial wakeup. However, something, which wasn't identified
in the report (Petr, it'd be great if you can find out who did the
wakeup), wakes up the new kthread before the creation path is done
with init which causes the new kthread to try to deref a NULL pointer.

Petr fixed the problem by adding an extra handshake step so that the
new kthread explicitly waits for the creation path, which is fine, but
the picture isn't making sense to me.

* Are spurious wakeups allowed? The way that we do set_current_state()
  in every iteration in wait_event() seems to suggest that we expect
  someone to spuriously flip task state to RUNNING.

* However, if we're to expect spurious wakeups for anybody anytime,
  why does a newly created kthread bother with
  schedule_preempt_disabled() in kernel/kthread.c::kthread() at all?
  It can't guarantee anything and all it does is masking subtle bugs.

What am I missing here?

Thanks.

-- 
tejun

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ