linux-kernel - Re: re. Spurious wakeup on a newly created kthread

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YrllOoGRqp+Cdc9/@dhcp22.suse.cz>
Date:   Mon, 27 Jun 2022 10:07:22 +0200
From:   Michal Hocko <mhocko@...e.com>
To:     Tejun Heo <tj@...nel.org>
Cc:     Petr Mladek <pmladek@...e.com>,
        Lai Jiangshan <jiangshanlai@...il.com>,
        linux-kernel@...r.kernel.org,
        Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Oleg Nesterov <oleg@...hat.com>,
        "Eric W. Biederman" <ebiederm@...ssion.com>
Subject: Re: re. Spurious wakeup on a newly created kthread

On Sat 25-06-22 14:00:10, Tejun Heo wrote:
[...]
> tl;dr is that the worker creation code expects a newly created worker
> kthread to sit tight until the creator finishes setting up stuff and
> sends the initial wakeup. However, something, which wasn't identified
> in the report (Petr, it'd be great if you can find out who did the
> wakeup), wakes up the new kthread before the creation path is done
> with init which causes the new kthread to try to deref a NULL pointer.

One thing that hasn't been really mentioned explicitly but it might be
really important. 5.3.18-2-rt is the RT patchset backported to 5.3
based SLES kernel. We do not know whether the upstream RT patchset is
affected the same way and due to nature of the customer we cannot really
have this tested on that kernel.

I have tried to dig that out from the crash dump but no luck
unfortunately. I was able to find the kthread pointer on few stacks:
crash> search -t ffff9ca77e9a8000
PID: 2      TASK: ffff9ca700381d80  CPU: 0   COMMAND: "kthreadd"
ffffb0f68001fd78: ffff9ca77e9a8000
ffffb0f68001fe38: ffff9ca77e9a8000

PID: 842    TASK: ffff9ca75e6349c0  CPU: 0   COMMAND: "sysmon"
ffffb0f6808e7c50: ffff9ca77e9a8000
ffffb0f6808e7da8: ffff9ca77e9a8000

PID: 2035   TASK: ffff9ca71e252c40  CPU: 1   COMMAND: "vReqJob"
ffffb0f68300fba8: ffff9ca77e9a8000
ffffb0f68300fbf0: ffff9ca77e9a8000

PID: 16127  TASK: ffff9ca77e9abb00  CPU: 0   COMMAND: "kworker/0:0H"
ffffb0f688067dd0: ffff9ca77e9a8000

PID: 16213  TASK: ffff9ca77e9a8000  CPU: 1   COMMAND: "kworker/0:2H"
ffffb0f68822fba0: ffff9ca77e9a8000
ffffb0f68822fca0: ffff9ca77e9a8000
ffffb0f68822fd70: ffff9ca77e9a8000
ffffb0f68822fe38: ffff9ca77e9a8000
ffffb0f68822fef8: ffff9ca77e9a8000

kthreadadd is not all that interesting because that one has created the
thread. sysmon is 
crash> bt ffff9ca75e6349c0
PID: 842    TASK: ffff9ca75e6349c0  CPU: 0   COMMAND: "sysmon"
 #0 [ffffb0f6808e7be8] __schedule at ffffffffaa94bef1
 #1 [ffffb0f6808e7c78] preempt_schedule_common at ffffffffaa94c5d1
 #2 [ffffb0f6808e7c80] ___preempt_schedule at ffffffffaa201bb6
 #3 [ffffb0f6808e7cd8] rt_mutex_postunlock at ffffffffaa309f6b
 #4 [ffffb0f6808e7ce8] rt_mutex_futex_unlock at ffffffffaa94ea7c
 #5 [ffffb0f6808e7d30] rt_spin_unlock at ffffffffaa950ace
 #6 [ffffb0f6808e7d38] proc_pid_status at ffffffffaa4c9ff4
 #7 [ffffb0f6808e7dd8] proc_single_show at ffffffffaa4c3801
 #8 [ffffb0f6808e7e10] seq_read at ffffffffaa47c1b8
 #9 [ffffb0f6808e7e70] vfs_read at ffffffffaa455fd9
#10 [ffffb0f6808e7ea0] ksys_read at ffffffffaa456364

so collecting /pro/status data and releasing the sighand lock.

crash> bt ffff9ca71e252c40
PID: 2035   TASK: ffff9ca71e252c40  CPU: 1   COMMAND: "vReqJob"
 #0 [ffffb0f68300fbb0] __schedule at ffffffffaa94bef1
 #1 [ffffb0f68300fc40] schedule at ffffffffaa94c4e6
 #2 [ffffb0f68300fc50] futex_wait_queue_me at ffffffffaa33e2c0
 #3 [ffffb0f68300fc88] futex_wait at ffffffffaa33f199
 #4 [ffffb0f68300fd98] do_futex at ffffffffaa341062
 #5 [ffffb0f68300fe70] __x64_sys_futex at ffffffffaa341a7e
 #6 [ffffb0f68300fee0] do_syscall_64 at ffffffffaa2025f0

is scheduled into the kthread which is expected because the newly
created kernel thread should be running kthread() where it sleeps before
calling the thread function.

kworker/0:2H is our tkrehad worker.

crash> bt ffff9ca77e9abb00
 #5 [ffffb0f688067c90] delay_tsc at ffffffffaa9479fc
 #6 [ffffb0f688067c90] wait_for_xmitr at ffffffffaa6c1e45
 #7 [ffffb0f688067cb0] serial8250_console_putchar at ffffffffaa6c1ee6
 #8 [ffffb0f688067cc8] uart_console_write at ffffffffaa6bb095
 #9 [ffffb0f688067cf0] serial8250_console_write at ffffffffaa6c5962
#10 [ffffb0f688067d70] console_unlock at ffffffffaa30d795
#11 [ffffb0f688067db0] vprintk_emit at ffffffffaa30fc11
#12 [ffffb0f688067e00] printk at ffffffffaa3103c0
#13 [ffffb0f688067e68] __kthread_bind_mask at ffffffffaa2d74df
#14 [ffffb0f688067e90] create_worker at ffffffffaa2cfc13
#15 [ffffb0f688067ed8] worker_thread at ffffffffaa2d1cbe
#16 [ffffb0f688067f10] kthread at ffffffffaa2d6da2

is the delayed thread which is creating the worker and currently stuck
somewhere in the printk (console) code.

So if somebody has woken up our thread from inside kthread() then it
doesn't have that pointer on the stack and I couldn't it find elsewhere
either. Maybe somebody has an idea where to look at.

Btw. I still haven't caught up with the rest of the email thread and
today will be quite busy for me. Anyway, if somebody has ideas about a
further post mortem debugging then I am more than happy to help out.
-- 
Michal Hocko
SUSE Labs