lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date: Tue, 27 Feb 2024 07:28:11 -1000
From: Tejun Heo <tj@...nel.org>
To: jiangshanlai@...il.com
Cc: torvalds@...ux-foundation.org,
	linux-kernel@...r.kernel.org,
	allen.lkml@...il.com,
	kernel-team@...a.com,
	boqun.feng@...il.com,
	tglx@...utronix.de,
	peterz@...radead.org,
	romain.perier@...il.com,
	mingo@...nel.org
Subject: [PATCHSET v3 wq/6.10] workqueue: Implement disable/enable_work()

Hello,

Changes since v2 (http://lkml.kernel.org/r/20240221174333.700197-1-tj@kernel.org):

- enable_work() was incorrectly using local_irq_enable() instead of
  local_irq_restore(). Fixed. (Lai, Boqun).

- In __flush_work(), instead of usring the pool returned from
  start_flush_work() and testing it to tell whether the work item being
  flushed is BH, use the work item's WORK_OFFQ_BH bit. This is safe because
  the test is only need when @from_cancel is %true and we know that the work
  item must be off queue and thus its OFFQ bits are valid at that point.
  This makes the code a bit simpler and makes
  0005-workqueue-Update-how-start_flush_work-is-called.patch unnecessary.
  (Lai)

- There were discussions on whether we want to name the syncing variant
  disable_work() and async one dsiable_work_no_sync() instead of naming them
  disable_work_sync() and disable_work() respectively. While there are clear
  benefits in doing so (continuity from tasklet interface, shorter name for
  something used more widely), there is also the clear downside of breaking
  consistency with cancel_work_sync() and cancel_work(), which the disable
  functionality directly builds upon. I'm still leaning towards keeping
  things consistent with existing workqueue interface but open to listening
  to arguments, so if anyone has strong opinions, please pipe up.

Changes since v1 (http://lkml.kernel.org/20240216180559.208276-1-tj@kernel.org):

- Added missing disabled checks to queue_work_node() and
  mod_delayed_work_on(). (Lai)

- __flush_work() was derefing RCU protected pool pointer outside
  rcu_read_lock() section. Deref and test inside and remember the result.
  (Lai)

- queue_rcu_work() doesn't need a disabled check as rcu_work doesn't support
  canceling and disabling; however, in case users reach into rcu_work and
  invoke disable on the embedded work item, add disabled check to
  queue_rcu_work() and trigger warning if disabled. (Lai)

- The first ten cleanup patches have been applied to wq/for-6.9 and dropped
  from this series.

4cb1ef64609f ("workqueue: Implement BH workqueues to eventually replace
tasklets") implemented workqueues that execute work items in the BH context
with the goal of eventually replacing tasklet.

While the existing workqueue API covers the basic queueing and canceling
operations, tasklet also has tasklet_disable*() which blocks the execution
of the tasklet until it's re-enabled with tasklet_enable(). The interface if
fairly widely used and workqueue currently doesn't have a counterpart.

This patchset implements disable/enable_work() and the delayed_work
counterparts to address the gap. The ability to block future executions is
something which some users asked for in the past, and, while not essential
as the caller can and often has to shutdown the queuer anyway, it's a nice
convenience to have. Also, timer grew a similar feature recently with
timer_shutdown().

- tasklet_disable() keeps disable depth so that a tasklet can be disabled
  and re-enabled by multiple parties at the same time. Workqueue is updated
  to carry 16bit disable count in work->data while the work item is not
  queued. When non-zero, attempts to queue the work item fail.

- The cancel_work_sync() path used WORK_OFFQ_CANCELING to synchronize
  multiple cancel_sync attempts. This added a completely separate wait
  mechanism which wasn't very congruent with the rest of workqueue. This was
  because the canceling state was carried in a way which couldn't
  accommodate multiple concurrent uses. This mechanism is replaced by
  disable - cancel_sync now simply disables the work item, flushes and
  re-enables it.

- There is a wart in how tasklet_disable/enable() works. When a tasklet is
  disabled, if the tasklet is queued, it keeps the softirq constantly raised
  until the tasklet is re-enabled and executed. This makes disabling
  unnecessarily expensive and brittle. The only busy looping workqueue's
  implementation does is on the party that's trying to cancel_sync or
  disable_sync to wait for the completion of the currently executing
  instance, which should be safe as long as it's from process and BH
  contexts.

- A disabled tasklet remembers whether it was queued while disabled and
  starts executing when re-enabled. It turns out doing this with work items
  is challenging as there are a lot more to remember and the synchronization
  becomes more complicated too. Looking at the use cases and given the
  continuity from how cancel_work_sync() works, it seems better to just
  ignore queueings which happen while a work item is disabled and require
  the users to explicitly queue the work item after re-enabling as
  necessary. Most users should be able to re-enqueue unconditionally after
  enabling.

After the changes in this patchset, BH workqueues cover most of the
functionalities provided by tasklets. The following is how the two APIs map:

- tasklet_setup/init()		-> INIT_WORK()
- tasklet_schedule()		-> queue_work(system_bh_wq, ...)
- tasklet_hi_schedule()		-> queue_work(system_bh_highpri_wq, ...)
- tasklet_disable_nosync()	-> disable_work()
- tasklet_disable[_in_atomic]()	-> disable_work_sync()
- tasklet_enable()		-> enable_work() + queue_work()
- tasklet_kill()		-> cancel_work_sync()

Note that unlike tasklet_enable(), enable_work() doesn't queue the work item
automatically according to whether the work item was queued while disabled.
While the caller can track this separately, unconditionally scheduling the
work item after enable_work() returns %true should work for most users.

This patchset contains the following 6 patches:

 0001-workqueue-Preserve-OFFQ-bits-in-cancel-_sync-paths.patch
 0002-workqueue-Implement-disable-enable-for-delayed-work-.patch
 0003-workqueue-Remove-WORK_OFFQ_CANCELING.patch
 0004-workqueue-Remember-whether-a-work-item-was-on-a-BH-w.patch
 0005-workqueue-Allow-cancel_work_sync-and-disable_work-fr.patch
 0006-r8152-Convert-from-tasklet-to-BH-workqueue.patch

Given how many invasive workqueue changes are already queued for v6.9 and
the subtle nature of these patches, I think it'd be best to defer this
patchset to v6.10 so that we can use v6.9 as an intermediate verification
point.

0001-0002 implement disable_work() and enable_work(). At this stage, all
disable[_sync]_work and enable_work operations might_sleep(). disable_work()
and enable_work() due to CANCELING synchronization described above.
disable_work_sync() also needs to wait for the in-flight work item to finish
which requires blocking.

0003 replaces CANCELING with internal use of disble/enable. This removes one
ugliness from workqueue code and allows disable_work() and enable_work() to
be used from atomic contexts.

0004-0005 implement busy-wait for BH work items when they're being canceled
thus allowing cancel_work_sync() and disable_work_sync() to be called from
atomic contexts for them. This makes workqueue interface a superset of
tasklet and also makes BH workqueues easier to live with.

0006 converts drivers/net/r8152.c from tasklet to BH workqueue as a
demonstration. It seems to work fine.

The patchset is on top of wq/for-6.9 (ccdec92198df ("workqueue: Control
intensive warning threshold through cmdline")) and also available in the
following git branch:

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git disable_work-v3

diffstat follows. Thanks.

 drivers/net/usb/r8152.c   |   44 ++++----
 include/linux/workqueue.h |   23 +++-
 kernel/workqueue.c        |  384 ++++++++++++++++++++++++++++++++++++++++++++++---------------------------

--
tejun

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ