Message-ID: <YWEK1iNVXcJmDQVP@boqun-archlinux>
Date: Sat, 9 Oct 2021 11:21:58 +0800
From: Boqun Feng <boqun.feng@...il.com>
To: Lai Jiangshan <jiangshanlai@...il.com>
Cc: LKML <linux-kernel@...r.kernel.org>, Tejun Heo <tj@...nel.org>,
"Paul E . McKenney" <paulmck@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Peter Zijlstra <peterz@...radead.org>,
Frederic Weisbecker <frederic@...nel.org>
Subject: Re: [RFC 2/2] workqueue: Fix work re-entrance when requeue to a
different workqueue
On Sat, Oct 09, 2021 at 10:06:23AM +0800, Lai Jiangshan wrote:
> On Fri, Oct 8, 2021 at 6:06 PM Boqun Feng <boqun.feng@...il.com> wrote:
> >
> > When requeuing a work item to a different workqueue while it is still
> > being processed, re-entrance can happen as follows:
> >
> > { both WQ1 and WQ2 are bound workqueues, and a work item W has been
> > queued on CPU 0 for WQ1 }
> >
> >   CPU 0                         CPU 1
> >   =====                         =====
> >   <In worker on CPU 0>
> >   process_one_work():
> >     ...
> >     // pick up W
> >     worker->current_work = W;
> >     worker->current_func = W->func;
> >     ...
> >     set_work_pool_and_clear_pending(...);
> >     // W can be requeued afterwards
> >                                 queue_work_on(1, WQ2, W):
> >                                   if (!test_and_set_bit(...)) {
> >                                     // this branch is taken, as CPU 0
> >                                     // just cleared the pending bit.
> >                                     __queue_work(...):
> >                                       pwq = <pool for CPU 1 of WQ2>;
> >                                       last_pool = <pool for CPU 0 of WQ1>;
> >                                       if (last_pool != pwq->pool) { // true
> >                                         if (.. && worker->current_pwq->wq == wq) {
> >                                           // false, since @worker is
> >                                           // a worker of @last_pool (for
> >                                           // WQ1), and @wq is WQ2.
> >                                         }
> >                                         ...
> >                                         insert_work(pwq, W, ...);
> >                                   }
> >                                 // W queued.
> >                                 <schedule to worker on CPU 1>
> >                                 process_one_work():
> >                                   collision = find_worker_executing_work(..);
> >                                   // NULL, because we're searching the
> >                                   // worker pool of CPU 1, while W is
> >                                   // the current work on worker pool of
> >                                   // CPU 0.
> >                                   worker->current_work = W;
> >                                   worker->current_func = W->func;
> >     worker->current_func(...);
> >                                   ...
> >                                   worker->current_func(...); // Re-entrance
>
> Concurrent or parallel executions on the same work item aren't
> considered as "Re-entrance" if the workqueue is changed.
>
Well, then Documentation/core-api/workqueue.rst can use some help:
"Note that the flag ``WQ_NON_REENTRANT`` no longer exists as all
workqueues are now non-reentrant - any work item is guaranteed to be
executed by at most one worker system-wide at any given time."
Clearly, in the above case a work item is executed by two workers at
the same time.
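
A minimal userspace sketch of why the per-pool collision check misses this case. Every name here (`struct pool`, `find_executing()`, `try_start()`, `executing_count()`) is hypothetical and only models the idea; the kernel's find_worker_executing_work() also compares the work function, which is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; these are not the kernel's types. */
struct work { int id; };

struct worker {
    const struct work *current_work;   /* item being executed, or NULL */
};

#define WORKERS_PER_POOL 2

struct pool {
    struct worker workers[WORKERS_PER_POOL];
};

/* Like the kernel's collision lookup, this searches ONE pool only. */
static struct worker *find_executing(struct pool *pool, const struct work *w)
{
    for (size_t i = 0; i < WORKERS_PER_POOL; i++)
        if (pool->workers[i].current_work == w)
            return &pool->workers[i];
    return NULL;
}

/*
 * Start executing w on an idle worker of pool, unless a worker of that
 * same pool already runs it. Returns 1 if w was started, 0 otherwise.
 */
static int try_start(struct pool *pool, const struct work *w)
{
    assert(pool != NULL && w != NULL);
    if (find_executing(pool, w))
        return 0;                      /* collision detected: defer */
    for (size_t i = 0; i < WORKERS_PER_POOL; i++) {
        if (pool->workers[i].current_work == NULL) {
            pool->workers[i].current_work = w;
            return 1;
        }
    }
    return 0;                          /* no idle worker in this pool */
}

/* Count the workers, across all pools, that are executing w right now. */
static int executing_count(struct pool *pools, size_t n, const struct work *w)
{
    int count = 0;

    for (size_t p = 0; p < n; p++)
        for (size_t i = 0; i < WORKERS_PER_POOL; i++)
            if (pools[p].workers[i].current_work == w)
                count++;
    return count;
}
```

In this model, starting W twice on the same pool is refused, but starting it on pool 0 and then on pool 1 succeeds both times, so executing_count() reports two workers for one item.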
> It allows the work function to free itself (the item) and another
> subsystem allocates the same item and reuses it.
>
So you're saying that in process_one_work(), ->current_work can point to
a work item which gets freed and reallocated before the worker actually
executes it? And users should guarantee it's safe to do so? I mean, is
this something that the workqueue subsystem allows/expects users to do?
> "Re-entrance" is defined as:
>   the work function has not been changed
>   the wq has not been changed
>   the item has not been reinitialized.
> (To reduce the complexity of the check, the workqueue subsystem often
> considers it "Re-entrance" if a condition has changed and then changed
> back. But wq users should not depend on this behavior and should avoid
> it.)
>
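
A rough sketch of the three conditions quoted above, written as a predicate. This is only a model under assumptions: `struct exec_id`, its `init_gen` counter (bumped on every re-initialization), `sample_fn()` and `same_execution_identity()` are hypothetical names, and the workqueue code does not track state in this form.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*work_fn)(void *);

/* Hypothetical snapshot of what identifies one "execution" of an item. */
struct exec_id {
    work_fn       func;      /* work function at queue time */
    const void   *wq;        /* workqueue the item was queued on */
    unsigned long init_gen;  /* assumed counter, bumped on each re-init */
};

/* Hypothetical work function, only used to populate examples. */
static void sample_fn(void *arg)
{
    (void)arg;
}

/*
 * True when the second execution counts as "re-entrance" of the first
 * under the quoted definition: same function, same wq, not re-initialized.
 */
static bool same_execution_identity(const struct exec_id *a,
                                    const struct exec_id *b)
{
    assert(a != NULL && b != NULL);
    return a->func == b->func &&
           a->wq == b->wq &&
           a->init_gen == b->init_gen;
}
```

Under this reading, the WQ1 to WQ2 requeue in the diagram changes `wq`, so the predicate is false and the second run would not be called re-entrance.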
Thanks for the clarification. Could you also update the documentation to
avoid future confusion? Thanks!
Regards,
Boqun
>
> >
> > This issue is already partially fixed because in queue_work_on(),
> > last_pool can be used to queue the work, as a result the requeued work
> > processing will find the collision and wait for the existing one to
> > finish. However, currently the last_pool is only used when two
> > workqueues are the same one, which causes the issue. Therefore extend
> > the behavior to allow last_pool to requeue the work W even if the
> > workqueues are different. It's safe to do this since the work W has been
> > proved safe to queue and run on the last_pool.
> >
> > Signed-off-by: Boqun Feng <boqun.feng@...il.com>
> > ---
> > kernel/workqueue.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> > index 1418710bffcd..410141cc5f88 100644
> > --- a/kernel/workqueue.c
> > +++ b/kernel/workqueue.c
> > @@ -1465,7 +1465,7 @@ static void __queue_work(int cpu, struct workqueue_struct *wq,
> >
> >  		worker = find_worker_executing_work(last_pool, work);
> >
> > -		if (worker && worker->current_pwq->wq == wq) {
> > +		if (worker) {
> >  			pwq = worker->current_pwq;
> >  		} else {
> >  			/* meh... not running there, queue here */
> > --
> > 2.32.0
> >
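
The effect of the one-line change quoted above can be modelled in userspace roughly as follows. This is a sketch under assumptions, not the kernel code: the struct layouts, `select_pwq()` and its `require_same_wq` flag are hypothetical, with the flag set to 1 standing for the current check and 0 for the proposed one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct wq   { int id; };
struct pool { int cpu; };
struct pwq  { struct pool *pool; struct wq *wq; };
struct worker { struct pwq *current_pwq; };

/*
 * Pick the pwq a requeued item should be inserted into.
 *
 * req:             pwq for the requested (cpu, wq) pair
 * last_pool:       pool the item last ran on, or NULL
 * running:         worker of last_pool still executing the item, or NULL
 * require_same_wq: 1 models the check before the patch, 0 the check after
 */
static struct pwq *select_pwq(struct pwq *req, struct pool *last_pool,
                              struct worker *running, int require_same_wq)
{
    assert(req != NULL);
    if (last_pool && last_pool != req->pool) {
        if (running &&
            (!require_same_wq || running->current_pwq->wq == req->wq))
            return running->current_pwq;  /* queue behind the running copy */
    }
    return req;                           /* not running there, queue here */
}
```

With W still running on CPU 0 for WQ1 and a requeue to WQ2 on CPU 1, the model returns the CPU 1 pwq when `require_same_wq` is 1 and the CPU 0 pwq of WQ1 when it is 0, which is the case where the collision check in process_one_work() can see the running worker.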