Message-ID: <xhsmhedy5ecdg.mognet@vschneid.remote.csb>
Date: Thu, 28 Jul 2022 11:54:19 +0100
From: Valentin Schneider <vschneid@...hat.com>
To: Lai Jiangshan <jiangshanlai@...il.com>
Cc: LKML <linux-kernel@...r.kernel.org>, Tejun Heo <tj@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Frederic Weisbecker <frederic@...nel.org>,
Juri Lelli <juri.lelli@...hat.com>,
Phil Auld <pauld@...hat.com>,
Marcelo Tosatti <mtosatti@...hat.com>
Subject: Re: [RFC PATCH v2 1/2] workqueue: Unbind workers before sending
them to exit()
On 28/07/22 01:13, Lai Jiangshan wrote:
> Quick review before going to sleep.
>
Thanks!
> On Wed, Jul 27, 2022 at 7:54 PM Valentin Schneider <vschneid@...hat.com> wrote:
>> @@ -1806,8 +1806,10 @@ static void worker_enter_idle(struct worker *worker)
>> /* idle_list is LIFO */
>> list_add(&worker->entry, &pool->idle_list);
>>
>> - if (too_many_workers(pool) && !timer_pending(&pool->idle_timer))
>> - mod_timer(&pool->idle_timer, jiffies + IDLE_WORKER_TIMEOUT);
>> + if (too_many_workers(pool) && !delayed_work_pending(&pool->idle_reaper_work))
>> + mod_delayed_work(system_unbound_wq,
>> + &pool->idle_reaper_work,
>> + IDLE_WORKER_TIMEOUT);
>
> system_unbound_wq doesn't have a rescuer.
>
> A new workqueue with a rescuer needs to be created and used for
> this purpose.
>
Right, I think it makes sense for those work items to be attached to a
WQ_MEM_RECLAIM workqueue. Should I add that as a workqueue-internal
thing?
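Concretely, I'm picturing something like the below -- a rough sketch only,
the "pool_reaper" name and the init placement are placeholders, not actual
code:

```c
/*
 * Sketch: a workqueue-internal wq with a rescuer, so the idle reaper
 * work items are guaranteed forward progress under memory pressure.
 * Name and init site are placeholders for illustration.
 */
static struct workqueue_struct *pool_reaper_wq;

static void __init init_pool_reaper_wq(void)
{
	pool_reaper_wq = alloc_workqueue("pool_reaper",
					 WQ_UNBOUND | WQ_MEM_RECLAIM, 0);
	BUG_ON(!pool_reaper_wq);
}
```

with worker_enter_idle() then queueing onto pool_reaper_wq instead of
system_unbound_wq.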
>>
>> /* Sanity check nr_running. */
>> WARN_ON_ONCE(pool->nr_workers == pool->nr_idle && pool->nr_running);
>> @@ -1972,9 +1974,29 @@ static struct worker *create_worker(struct worker_pool *pool)
>> return NULL;
>> }
>>
>> +static void unbind_worker(struct worker *worker)
>> +{
>> + kthread_set_per_cpu(worker->task, -1);
>> + WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, wq_unbound_cpumask) < 0);
>> +}
>> +
>> +static void rebind_worker(struct worker *worker, struct worker_pool *pool)
>> +{
>> + kthread_set_per_cpu(worker->task, pool->cpu);
>> + WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask) < 0);
>> +}
>> +
>> +static void reap_worker(struct worker *worker)
>> +{
>> + list_del_init(&worker->entry);
>> + unbind_worker(worker);
>> + wake_up_process(worker->task);
>
>
> Since WORKER_DIE is set, the worker can be possible freed now
> if there is another source to wake it up.
>
My understanding of why reap_worker() is "safe" to use outside of
raw_spin_lock_irq(&pool->lock) is that pool->idle_list is never accessed
without pool->lock held, and wake_up_worker() only wakes workers that are
on that list. So with destroy_worker() detaching the worker from
pool->idle_list under pool->lock, I'm not aware of any codepath other than
reap_worker() that could wake it up.
The only wake_up_process() I see that doesn't involve the pool->idle_list
is in send_mayday(), but AFAIA rescuers can never end up in the idle_list
and are specifically destroyed in destroy_workqueue().
> I think reverting a part of the commit 60f5a4bcf852("workqueue:
> async worker destruction") to make use of kthread_stop()
> in destroy_worker() should be a good idea.
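To make sure I understand the suggestion, the synchronous variant would
look something like the below -- a rough sketch for discussion, not the
actual pre-60f5a4bcf852 code, with the lock dance abbreviated:

```c
/*
 * Sketch: synchronous worker destruction. kthread_stop() both wakes
 * the worker and waits for it to exit, so there is no window where a
 * stray wakeup can race with the task being freed.
 */
static void destroy_worker(struct worker *worker)
{
	struct worker_pool *pool = worker->pool;

	lockdep_assert_held(&pool->lock);

	pool->nr_workers--;
	pool->nr_idle--;
	list_del_init(&worker->entry);
	worker->flags |= WORKER_DIE;

	raw_spin_unlock_irq(&pool->lock);
	kthread_stop(worker->task);	/* wake + wait for exit */
	raw_spin_lock_irq(&pool->lock);
}
```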