linux-kernel - Re: [PATCH 14/24] workqueue: Generalize unbound CPU pods

Message-ID: <ZIJbMQOu_k07jkFf@slm.duckdns.org>
Date:   Thu, 8 Jun 2023 12:50:25 -1000
From:   Tejun Heo <tj@...nel.org>
To:     K Prateek Nayak <kprateek.nayak@....com>
Cc:     Sandeep Dhavale <dhavale@...gle.com>, jiangshanlai@...il.com,
        torvalds@...ux-foundation.org, peterz@...radead.org,
        linux-kernel@...r.kernel.org, kernel-team@...a.com,
        joshdon@...gle.com, brho@...gle.com, briannorris@...omium.org,
        nhuck@...gle.com, agk@...hat.com, snitzer@...nel.org,
        void@...ifault.com, kernel-team@...roid.com
Subject: Re: [PATCH 14/24] workqueue: Generalize unbound CPU pods

Hello,

On Thu, Jun 08, 2023 at 08:31:34AM +0530, K Prateek Nayak wrote:
...
> Thank you for sharing the debug branch. I've managed to hit one of
> the WARN_ON_ONCE() consistently but I still haven't seen a kernel panic
> yet. Sharing the traces below:

Yeah, that's good. It does a dirty fix-up. Shouldn't crash.

> o Early Boot
> 
>     [    4.182411] ------------[ cut here ]------------
>     [    4.186313] WARNING: CPU: 0 PID: 1 at kernel/workqueue.c:1130 kick_pool+0xdb/0xe0
>     [    4.186313] Modules linked in:
>     [    4.186313] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.4.0-rc1-tj-wq-valid-cpu+ #481
>     [    4.186313] Hardware name: Dell Inc. PowerEdge R6525/024PW1, BIOS 2.7.3 03/30/2022
>     [    4.186313] RIP: 0010:kick_pool+0xdb/0xe0
>     [    4.186313] Code: 6b c0 d0 01 73 24 41 89 45 64 49 8b 54 24 f8 48 89 d0 30 c0 83 e2 04 ba 00 00 00 00 48 0f 44 c2 48 83 80 c0 00 00 00 01 eb 82 <0f> 0b eb dc 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f
>     [    4.186313] RSP: 0018:ffffbc1b800e7dd8 EFLAGS: 00010046
>     [    4.186313] RAX: 0000000000000100 RBX: ffff97c73d2321c0 RCX: 0000000000000000
>     [    4.186313] RDX: 0000000000000040 RSI: 0000000000000001 RDI: ffff9788c0159728
>     [    4.186313] RBP: ffffbc1b800e7df0 R08: 0000000000000100 R09: ffff9788c01593e0
>     [    4.186313] R10: ffff9788c01593c0 R11: 0000000000000001 R12: ffffffff8c582430
>     [    4.186313] R13: ffff9788c03fcd40 R14: 0000000000000000 R15: ffff97c73d2324b0
>     [    4.186313] FS:  0000000000000000(0000) GS:ffff97c73d200000(0000) knlGS:0000000000000000
>     [    4.186313] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>     [    4.186313] CR2: ffff97cecee01000 CR3: 000000470d43a001 CR4: 0000000000770ef0
>     [    4.186313] PKRU: 55555554
>     [    4.186313] Call Trace:
>     [    4.186313]  <TASK>
>     [    4.186313]  create_worker+0x14e/0x280
>     [    4.186313]  ? wake_up_process+0x15/0x20
>     [    4.186313]  workqueue_init+0x22a/0x3d0
>     [    4.186313]  kernel_init_freeable+0x1fe/0x4f0
>     [    4.186313]  ? __pfx_kernel_init+0x10/0x10
>     [    4.186313]  kernel_init+0x1b/0x1f0
>     [    4.186313]  ? __pfx_kernel_init+0x10/0x10
>     [    4.186313]  ret_from_fork+0x2c/0x50
>     [    4.186313]  </TASK>
>     [    4.186313] ---[ end trace 0000000000000000 ]---
> 
> o I consistently see a WARN_ON_ONCE() in kick_pool() being hit when I
>   run "sudo ./stress-ng --iomix 96 --timeout 1m". I've seen a few
>   different stack traces so far. Including all below just in case:
...
> This is the same WARN_ON_ONCE() you had added in the HEAD commit:
> 
>     $ scripts/faddr2line vmlinux kick_pool+0xdb
>     kick_pool+0xdb/0xe0:
>     kick_pool at kernel/workqueue.c:1130 (discriminator 1)
> 
>     $ sed -n 1130,1132p kernel/workqueue.c
>     if (!WARN_ON_ONCE(wake_cpu >= nr_cpu_ids))
>         p->wake_cpu = wake_cpu;
>     get_work_pwq(work)->stats[PWQ_STAT_REPATRIATED]++;
> 
> Let me know if you need any more data from my test setup.
> P.S. The kernel is still up and running (~30min) despite hitting this
> WARN_ON_ONCE() in my case :)

Okay, that was me being stupid and not initializing the new fields for
per-cpu workqueues. Can you please test the following branch? It should have
both bugs fixed properly.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git affinity-scopes-v2
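
For readers following the thread, the guard quoted above (only take the wake-up CPU hint when it is below nr_cpu_ids, and warn once otherwise) can be modelled in plain userspace C. This is a rough sketch, not the kernel code: model_task, MODEL_NR_CPU_IDS, model_warn_on_once() and model_set_wake_cpu() are all hypothetical names made up for illustration, and the CPU count of 256 is an arbitrary assumption.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for nr_cpu_ids; the value is arbitrary. */
#define MODEL_NR_CPU_IDS 256u

/* Hypothetical stand-in for the task being woken. */
struct model_task {
	unsigned int wake_cpu;
};

/* Set once the first out-of-range hint has been reported. */
static bool model_warned;

/*
 * Rough model of warn-once semantics: always returns the condition,
 * but prints a message only the first time the condition is true.
 */
static bool model_warn_on_once(bool cond, const char *what)
{
	if (cond && !model_warned) {
		model_warned = true;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/*
 * Apply the wake-up CPU hint only if it names a possible CPU.
 * Returns true if the hint was applied, false if it was rejected.
 */
static bool model_set_wake_cpu(struct model_task *p, unsigned int wake_cpu)
{
	if (model_warn_on_once(wake_cpu >= MODEL_NR_CPU_IDS,
			       "wake_cpu out of range"))
		return false;

	p->wake_cpu = wake_cpu;
	return true;
}
```

In this model a field that was never initialized would typically show up as a hint that fails the range check: the task keeps its previous wake_cpu and the warning fires once, which matches the "warns but keeps running" behaviour reported above.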

If that doesn't crash, I'd love to hear how it affects the perf regressions
reported over the past few months.

Thanks.

-- 
tejun