[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20230109133316.4026472-1-vschneid@redhat.com>
Date: Mon, 9 Jan 2023 13:33:12 +0000
From: Valentin Schneider <vschneid@...hat.com>
To: linux-kernel@...r.kernel.org
Cc: Tejun Heo <tj@...nel.org>, Lai Jiangshan <jiangshanlai@...il.com>,
Peter Zijlstra <peterz@...radead.org>,
Frederic Weisbecker <frederic@...nel.org>,
Juri Lelli <juri.lelli@...hat.com>,
Phil Auld <pauld@...hat.com>,
Marcelo Tosatti <mtosatti@...hat.com>
Subject: [PATCH v7 0/4] workqueue: destroy_worker() vs isolated CPUs
Hi folks,
New year, new version!
No major changes here, mainly some tidying up from Tejun's comments and a bugfix
spotted by Lai.
Revisions
=========
v6 -> v7
++++++++
o Rebased onto v6.2-rc3
o Dropped work pending check in worker_enter_idle() (Tejun)
o Overall comment cleanup (Tejun)
o put_unbound_pool() locking issue (Lai)
Unfortunately the mutex cannot be acquired from within wq_manager_inactive()
as rcuwait_wait_event() sets the task state to TASK_UNINTERRUPTIBLE before
invoking it, so grabbing the mutex could clobber the task state.
I've gone with dropping the pool->lock and reacquiring the two locks in the
right order after we've become the manager, see comments.
o Applied Lai's RB to patches that just had cosmetic changes
v5 -> v6
++++++++
o Rebase onto v6.1-rc7
o Get rid of worker_pool.idle_cull_list; only do minimal amount of work in the
timer callback (Tejun)
o Dropped the too_many_workers() -> nr_workers_to_cull() change
v4 -> v5
++++++++
o Rebase onto v6.1-rc6
o Overall renaming from "reaping" to "cull"
I somehow convinced myself this was more appropriate
o Split the dwork into timer callback + work item (Tejun)
I didn't want to have redudant operations happen in the timer callback and in
the work item, so I made the timer callback detect which workers are "ripe"
enough and then toss them to a worker for removal.
This however means we release the pool->lock before getting to actually doing
anything to those idle workers, which means they can wake up in the meantime.
The new worker_pool.idle_cull_list is there for that reason.
The alternative was to have the timer callback detect if any worker was ripe
enough, kick the work item if so, and have the work item do the same thing
again, which I didn't like.
RFCv3 -> v4
+++++++++++
o Rebase onto v6.0
o Split into more patches for reviewability
o Take dying workers out of the pool->workers as suggested by Lai
RFCv2 -> RFCv3
++++++++++++++
o Rebase onto v5.19
o Add new patch (1/3) around accessing wq_unbound_cpumask
o Prevent WORKER_DIE workers for kfree()'ing themselves before the idle reaper
gets to handle them (Tejun)
Bit of an aside on that: I've been struggling to convince myself this can
happen due to spurious wakeups and would like some help here.
Idle workers are TASK_UNINTERRUPTIBLE, so they can't be woken up by
signals. That state is set *under* pool->lock, and all wakeups (before this
patch) are also done while holding pool->lock.
wake_up_worker() is done under pool->lock AND only wakes a worker on the
pool->idle_list. Thus the to-be-woken worker *cannot* have WORKER_DIE, though
it could gain it *after* being woken but *before* it runs, e.g.:
LOCK pool->lock
wake_up_worker(pool)
wake_up_process(p)
UNLOCK pool->lock
idle_reaper_fn()
LOCK pool->lock
destroy_worker(worker, list);
UNLOCK pool->lock
worker_thread()
goto woke_up;
LOCK pool->lock
READ worker->flags & WORKER_DIE
UNLOCK pool->lock
...
kfree(worker);
reap_worker(worker);
// Uh-oh
... But IMO that's not a spurious wakeup, that's a concurrency issue. I don't
see any spurious/unexpected worker wakeup happening once a worker is off the
pool->idle_list.
RFCv1 -> RFCv2
++++++++++++++
o Change the pool->timer into a delayed_work to have a sleepable context for
unbinding kworkers
Cheers,
Valentin
Lai Jiangshan (1):
workqueue: Protects wq_unbound_cpumask with wq_pool_attach_mutex
Valentin Schneider (3):
workqueue: Factorize unbind/rebind_workers() logic
workqueue: Convert the idle_timer to a timer + work_struct
workqueue: Unbind kworkers before sending them to exit()
kernel/workqueue.c | 205 ++++++++++++++++++++++++++++++++++-----------
1 file changed, 154 insertions(+), 51 deletions(-)
--
2.31.1
Powered by blists - more mailing lists