Message-ID: <CAJhGHyBR6up3o9Svxn=uL2a0rRcu-q3BR8TgdpLykR6iTZ3Aew@mail.gmail.com>
Date: Tue, 20 Feb 2024 15:33:34 +0800
From: Lai Jiangshan <jiangshanlai@...il.com>
To: Tejun Heo <tj@...nel.org>
Cc: torvalds@...ux-foundation.org, linux-kernel@...r.kernel.org,
allen.lkml@...il.com, kernel-team@...a.com
Subject: Re: [PATCH 16/17] workqueue: Allow cancel_work_sync() and
disable_work() from atomic contexts on BH work items
Hello, Tejun
On Sat, Feb 17, 2024 at 2:06 AM Tejun Heo <tj@...nel.org> wrote:
> @@ -4072,7 +4070,32 @@ static bool __flush_work(struct work_struct *work, bool from_cancel)
> if (!pool)
> return false;
>
> - wait_for_completion(&barr.done);
> + if ((pool->flags & POOL_BH) && from_cancel) {
The pool pointer might be invalid here. Please check POOL_BH before
rcu_read_unlock(), or move rcu_read_unlock() here, or use
"*work_data_bits(work) & WORK_OFFQ_BH".
> + /*
> + * We're flushing a BH work item which is being canceled. It
> + * must have been executing during start_flush_work() and can't
> + * currently be queued. If @work is still executing, we know it
> + * is running in the BH context and thus can be busy-waited.
> + *
> + * On RT, prevent a live lock when current preempted soft
> + * interrupt processing or prevents ksoftirqd from running by
> + * keeping flipping BH. If the tasklet runs on a different CPU
> + * then this has no effect other than doing the BH
> + * disable/enable dance for nothing. This is copied from
> + * kernel/softirq.c::tasklet_unlock_spin_wait().
> + */
s/tasklet/BH work/g
Although the comment is copied from kernel/softirq.c, I can't
envision a scenario in which the current task
"prevents ksoftirqd from running by keeping flipping BH",
given that @work is still executing (or the tasklet is running).
> + while (!try_wait_for_completion(&barr.done)) {
> + if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
> + local_bh_disable();
> + local_bh_enable();
> + } else {
> + cpu_relax();
> + }
> + }
> + } else {
> + wait_for_completion(&barr.done);
> + }
> +
> destroy_work_on_stack(&barr.work);
> return true;
> }
> @@ -4090,6 +4113,7 @@ static bool __flush_work(struct work_struct *work, bool from_cancel)
> */
> bool flush_work(struct work_struct *work)
> {
> + might_sleep();
> return __flush_work(work, false);
> }
> EXPORT_SYMBOL_GPL(flush_work);
> @@ -4179,6 +4203,11 @@ static bool __cancel_work_sync(struct work_struct *work, u32 cflags)
>
> ret = __cancel_work(work, cflags | WORK_CANCEL_DISABLE);
>
> + if (*work_data_bits(work) & WORK_OFFQ_BH)
> + WARN_ON_ONCE(in_hardirq());
When !PREEMPT_RT, this check is sufficient.
With PREEMPT_RT, however, syncing a BH work item should be allowed
only in contexts that permit local_bh_disable(), although I'm not
sure what the proper check would be.
On PREEMPT_RT, local_bh_disable() is disallowed not only in hardirq
context but also in !preemptible() context (I'm not sure about this).
The PREEMPT_RT version of __local_bh_disable_ip() doesn't contain a
full check either; it only has "WARN_ON_ONCE(in_hardirq())".
Since the check is just for debugging, I'm OK with the current check.
Thanks
Lai