linux-kernel - Re: [PATCHSET v2 wq/6.10] workqueue: Implement disable/enable

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <ZdcQ1jfOKNTlbB6C@slm.duckdns.org>
Date: Wed, 21 Feb 2024 23:16:06 -1000
From: Tejun Heo <tj@...nel.org>
To: Lai Jiangshan <jiangshanlai@...il.com>
Cc: torvalds@...ux-foundation.org, linux-kernel@...r.kernel.org,
	allen.lkml@...il.com, kernel-team@...a.com,
	Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCHSET v2 wq/6.10] workqueue: Implement disable/enable_work()

Hello,

On Thu, Feb 22, 2024 at 11:33:24AM +0800, Lai Jiangshan wrote:
> From the last patch:
> > - tasklet_disable_nosync()      -> disable_work()
> > - tasklet_disable[_in_atomic]() -> disable_work_sync()
> 
> I think it is a misuse-prone conversion.
> 
> A developer familiar with tasklet_disable() might happily use disable_work()
> and, to her/his surprise, leave the running works unsynced.
> 
> And tasklet_disable_nosync() is used at only 3 places while tasklet_disable()
> is used a lot.  I think the shorter name for the most common cases is better.

While I agree that this can be argued either way, keeping the interface
congruent with the existing cancel_work_sync() and friends seems a lot more
important to me. It can be a bit more confusing for users who are used to
tasklet interface but then again we aren't gonna rename cancel_work_sync()
to kill_work() and the conversion overhead isn't all that significant or
lasting. However, if we break the consnistency within workqueue API, that's
a source of lasting confusion.

> Moreover, IMHO the unsynchronized variants of tasklet/work disabling functions
> never have a strong scenario. I think it should be discouraged.
>
> Although it will be inconsistent with the name of cancel_work[_sync](),
> I still suggest:
> tasklet_disable_nosync() -> disable_work_nosync()
> tasklet_disable() -> disable_work().
>
> Even cancel_work_sync() is used a lot more than cancel_work(), so I
> also suggest rename cancel_work() to cancel_work_nosync() and leave
> cancel_work_sync() unchanged (at least for a while).

Maybe but I'm not sure this would bring meaingful benefits at this point.
Besides, even if we do something like this, we should still keep cancel and
disable names in sync. Let's leave this topic for some other day.

> [changed topic:]
> 
> I feel uncomfortable with tasklet_disable_in_atomic() implicitly
> being implemented in disable_work_sync().
> 
> I think it is a revert of the commit ca5f62511895 ("tasklets: Provide
> tasklet_disable_in_atomic()") in which tglx discouraged the usage of
> tasklet_disable_in_atomic() and marked it "error prone".
> 
> And even tasklet_disable_in_atomic() is implemented in disable_work_sync(),
> I prefer to sleepable-wait than spinning-wait when disable_work_sync() is
> called in a sleepable context for BH work item.
> 
> All the above is just my feeling, not reasoning, nor rejection of the patches.

So, tasklet atomic wait isn't great in that it busy-spins softirq on the CPU
that the disabled tasklet is queued on. The only busy waiting that happens
in the workqueue implementation is the waiter waiting for the target work
item which is currently executing to finish. This shouldn't be error-prone
as long as the calling context is checked.

I'm not strongly against putting atomic disable in a separate interface. I
just couldn't think of strong enough justifications for doing so. If it's
safe and the busy wait is bound by the target work item's execution time,
might as well make the interface simpler and give users one less thing to
think about.

Thanks.

-- 
tejun