Message-ID: <CAJhGHyCXYhfPX-_TivLP9rs8AYPnF8qBZKgp-4yZ-aJun6DHfg@mail.gmail.com>
Date: Thu, 25 Jul 2024 08:11:34 +0800
From: Lai Jiangshan <jiangshanlai@...il.com>
To: Marc Hartmayer <mhartmay@...ux.ibm.com>
Cc: linux-kernel@...r.kernel.org, Lai Jiangshan <jiangshan.ljs@...group.com>,
Valentin Schneider <vschneid@...hat.com>, Tejun Heo <tj@...nel.org>, Heiko Carstens <hca@...ux.ibm.com>,
Sven Schnelle <svens@...ux.ibm.com>, Mete Durlu <meted@...ux.ibm.com>
Subject: Re: [PATCH 1/4] workqueue: Reap workers via kthread_stop() and remove detach_completion
Hello Marc,
Thank you for the report.
On Wed, Jul 24, 2024 at 12:19 AM Marc Hartmayer <mhartmay@...ux.ibm.com> wrote:
> Hi Lai,
>
> a bisect of a regression in our CI on s390x led to this patch. The bug
> is pretty easy to reproduce (currently, I only tested it on s390x - will
> try to test it on x86 as well):
I couldn't reproduce it on x86, but I only tested for 30 minutes.
In theory it can definitely happen on x86 as well.
>
> 1. Start a Linux QEMU/KVM guest with 2 cores using this patch and enable
> `panic_on_warn=1` for the guest kernel.
> 2. Run the following command in the KVM guest:
>
> $ dd if=/dev/zero of=/dev/null & while : ; do chcpu -d 1; chcpu -e 1; done
>
> 3. Wait for the crash. e.g.:
>
> 2024/07/23 18:01:21 [M83LP63]: [ 157.267727] ------------[ cut here ]------------
> 2024/07/23 18:01:21 [M83LP63]: [ 157.267735] WARNING: CPU: 21 PID: 725 at kernel/workqueue.c:3340 worker_thread+0x54e/0x558
> @@ -3330,7 +3338,6 @@ static int worker_thread(void *__worker)
>  		ida_free(&pool->worker_ida, worker->id);
>  		worker_detach_from_pool(worker);
>  		WARN_ON_ONCE(!list_empty(&worker->entry));
> -		kfree(worker);
>  		return 0;
>  	}
After this change, the condition "!list_empty(&worker->entry)" can be
true at this point: the exiting worker may still be on the cull_list,
waiting to be reaped by reap_dying_workers().
I will remove the WARN_ON_ONCE().
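
To illustrate why the check can fire, here is a minimal userspace sketch,
not the actual kernel code. It reimplements a tiny subset of the list_head
helpers, and it assumes the dying worker's entry is moved from an idle list
onto a cull list and is only unlinked later by the reaper. The function
entry_linked_while_on_cull_list() and the local idle_list are hypothetical
names used only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* A node (or head) is "empty" when it points back at itself. */
static bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink the node and reinitialize it so list_empty() is true again. */
static void list_del_init(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	init_list_head(n);
}

/* Simplified worker: only the list linkage matters here. */
struct worker {
	struct list_head entry;
};

/*
 * Hypothetical helper: returns what "!list_empty(&worker->entry)" evaluates
 * to while the worker sits on the cull list and has not been reaped yet.
 */
static bool entry_linked_while_on_cull_list(void)
{
	struct list_head idle_list, cull_list;
	struct worker w;
	bool linked;

	init_list_head(&idle_list);
	init_list_head(&cull_list);
	init_list_head(&w.entry);

	/* Assumption: the worker starts out linked on an idle list. */
	list_add_tail(&w.entry, &idle_list);

	/* Culling moves the entry to the cull list; it stays linked. */
	list_del_init(&w.entry);
	list_add_tail(&w.entry, &cull_list);

	/* This is the condition the WARN_ON_ONCE() tests in the exit path. */
	linked = !list_empty(&w.entry);

	/* Only the reaper unlinks the entry, possibly after the check ran. */
	list_del_init(&w.entry);
	assert(list_empty(&w.entry));

	return linked;
}
```

In this sketch entry_linked_while_on_cull_list() returns true, which is
the situation in which the warning would trigger if the worker reaches its
exit path before the reaper has unlinked it.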
Thanks
Lai