[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <1344908986-18152-1-git-send-email-tj@kernel.org>
Date: Mon, 13 Aug 2012 18:49:40 -0700
From: Tejun Heo <tj@...nel.org>
To: linux-kernel@...r.kernel.org
Cc: torvalds@...ux-foundation.org, akpm@...ux-foundation.org,
mingo@...hat.com, lauro.venancio@...nbossa.org, jak@...-linux.org
Subject: [PATCHSET] workqueue: make all workqueues non-reentrant
Hello,
Unless explicitly specified, a workqueued is reentrant, which might
not even be the correct term. If a work item is executing on a CPU
but not pending anywhere, the work item can be queued on a different
CPU, and that CPU might start executing the work item whether the
previous execution on the first CPU is finished or not. In short,
while a work item can be queued on only one CPU, it may be executing
concurrently on any subset of the CPUs in the system.
This behavior in itself is already confusing and error-prone; however,
this causes much worse complication in flush operations. Because a
given work item could be executing on any number of CPUs, even if no
further queueing happens after invocation, flush_work() can't
guarantee that the work item is idle on return without checking all
CPUs. Being a relatively common operation, checking all CPUs on each
invocation was deemed too costly. It only verifies the instance
queued last and there's a separate version which checks all CPUs -
flush_work_sync().
Also, due to the way last CPU is tracked, if there are parallel
queueing operations, the following sequence
schedule_work(A);
flush_work(A);
can't guarantee that the instance queued right above is finished. If
someone else queues A on another CPU inbetween, flush_work() either
becomes noop or flushes that instance which may finish earlier than
the one from the above.
In the end, workqueue is a pain in the ass to get completely correct
and the breakages are very subtle. Depending on queueing pattern,
assuming non-reentrancy might work fine most of the time. The effect
of using flush_work() where flush_work_sync() should be used could be
a lot more subtle. flush_work() becoming noop would happen extremely
rarely for most users but it definitely is there.
A tool which is used as widely as workqueue shouldn't be this
difficult and error-prone. This is just silly.
Timer does the right thing. It *always* checks whether a timer is
still executing on the previous CPU and queues it there if so much
like how WQ_NON_REENTRANT makes workqueue behave. WQ_NON_REENTRANT
guarantees that any given work item is executing on only one CPU at
maximum and if both pending and executing, both are on the same CPU.
This makes the behaviors of work item execution and flush_work() much
more sane and flush_work_sync() unnecessary.
This patchset enforces WQ_NON_REENTRANT behavior on all workqueues and
simplifies workqueue API accordingly.
This adds an additional busy_hash lookup if the work item was
previously queued on a different CPU. This shouldn't be noticeable
under any sane workload. Work item queueing isn't a very
high-frequency operation and they don't jump across CPUs all the time.
In a micro benchmark to exaggerate this difference - measuring the
time it takes for two work items to repeatedly jump between two CPUs a
number (10M) of times with busy_hash table densely populated, the
difference was around 3%.
While the overhead is measureable, it is only visible in pathological
cases and the difference isn't huge. This change brings much needed
sanity to workqueue and makes its behavior consistent with timer. I
think this is the right tradeoff to make.
This patchset contains the following six patches.
0001-workqueue-make-all-workqueues-non-reentrant.patch
0002-workqueue-gut-flush-_delayed-_work_sync.patch
0003-workqueue-gut-system_nrt-_freezable-_wq.patch
0004-workqueue-deprecate-flush-_delayed-_work_sync.patch
0005-workqueue-deprecate-system_nrt-_freezable-_wq.patch
0006-workqueue-deprecate-WQ_NON_REENTRANT.patch
0001 makes all workqueues non-reentrant.
0002-0003 simplify workqueue API accordingly.
0004-0006 deprecate now unnecessary parts of workqueue API and convert
their users. Most conversions are straight-forward but 0006 contains
some non-trivial ones. If you work on NFC and drivers/staging/nvec,
please review them.
This patchset is on top of
wq/for-3.7 1265057fa02c7bed3b6d9ddc8a2048065a370364
+ [1] "timer: clean up initializers and implement irqsafe" patchset
+ [2] "workqueue: use irqsafe timer in delayed_work" patchset
and available in the following git branch.
git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git review-wq-always-nrt
Thanks. diffstat follows.
Documentation/workqueue.txt | 14 --
arch/arm/mach-pxa/sharpsl_pm.c | 4
arch/arm/plat-omap/mailbox.c | 2
arch/sh/drivers/push-switch.c | 2
block/blk-throttle.c | 7 -
block/genhd.c | 10 -
drivers/block/xen-blkfront.c | 4
drivers/cdrom/gdrom.c | 2
drivers/char/sonypi.c | 2
drivers/char/tpm/tpm.c | 4
drivers/firewire/core-transaction.c | 3
drivers/gpu/drm/drm_crtc_helper.c | 6
drivers/gpu/drm/exynos/exynos_drm_g2d.c | 2
drivers/gpu/drm/i915/i915_dma.c | 6
drivers/gpu/drm/nouveau/nouveau_gpio.c | 2
drivers/gpu/drm/radeon/radeon_irq_kms.c | 2
drivers/gpu/drm/vmwgfx/vmwgfx_fb.c | 2
drivers/hid/hid-picolcd.c | 2
drivers/hid/hid-wiimote-ext.c | 2
drivers/input/touchscreen/wm831x-ts.c | 2
drivers/isdn/mISDN/hwchannel.c | 4
drivers/leds/leds-lm3533.c | 6
drivers/leds/leds-lp8788.c | 2
drivers/leds/leds-wm8350.c | 2
drivers/macintosh/ams/ams-core.c | 2
drivers/md/dm-crypt.c | 10 -
drivers/md/dm-kcopyd.c | 3
drivers/md/dm-mpath.c | 2
drivers/md/dm-raid1.c | 5
drivers/md/dm-stripe.c | 2
drivers/md/dm.c | 3
drivers/media/dvb/dvb-core/dvb_net.c | 4
drivers/media/dvb/mantis/mantis_evm.c | 2
drivers/media/dvb/mantis/mantis_uart.c | 2
drivers/media/video/bt8xx/bttv-driver.c | 2
drivers/media/video/cx18/cx18-driver.c | 2
drivers/media/video/cx231xx/cx231xx-cards.c | 2
drivers/media/video/cx23885/cx23885-input.c | 6
drivers/media/video/cx88/cx88-mpeg.c | 2
drivers/media/video/em28xx/em28xx-cards.c | 2
drivers/media/video/omap24xxcam.c | 6
drivers/media/video/saa7134/saa7134-core.c | 2
drivers/media/video/saa7134/saa7134-empress.c | 2
drivers/media/video/tm6000/tm6000-cards.c | 2
drivers/mfd/menelaus.c | 4
drivers/misc/ioc4.c | 2
drivers/mmc/core/host.c | 4
drivers/mmc/host/dw_mmc.c | 3
drivers/mtd/mtdoops.c | 4
drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c | 2
drivers/net/ethernet/neterion/vxge/vxge-main.c | 2
drivers/net/ethernet/sun/cassini.c | 2
drivers/net/ethernet/sun/niu.c | 2
drivers/net/virtio_net.c | 12 -
drivers/net/wireless/hostap/hostap_ap.c | 4
drivers/net/wireless/hostap/hostap_hw.c | 10 -
drivers/nfc/pn533.c | 5
drivers/power/collie_battery.c | 2
drivers/power/tosa_battery.c | 2
drivers/power/wm97xx_battery.c | 2
drivers/power/z2_battery.c | 2
drivers/regulator/core.c | 2
drivers/scsi/arcmsr/arcmsr_hba.c | 4
drivers/scsi/ipr.c | 2
drivers/scsi/pmcraid.c | 2
drivers/scsi/qla2xxx/qla_target.c | 2
drivers/staging/nvec/nvec.c | 10 -
drivers/staging/nvec/nvec.h | 2
drivers/staging/omapdrm/omap_drv.c | 3
drivers/tty/hvc/hvsi.c | 2
drivers/tty/ipwireless/hardware.c | 2
drivers/tty/ipwireless/network.c | 4
drivers/tty/serial/kgdboc.c | 2
drivers/tty/serial/omap-serial.c | 2
drivers/tty/tty_ldisc.c | 6
drivers/usb/atm/speedtch.c | 2
drivers/usb/atm/ueagle-atm.c | 2
drivers/usb/gadget/u_ether.c | 2
drivers/usb/host/ohci-hcd.c | 2
drivers/usb/otg/isp1301_omap.c | 2
fs/affs/super.c | 2
fs/dlm/ast.c | 5
fs/gfs2/lock_dlm.c | 2
fs/gfs2/main.c | 2
fs/gfs2/super.c | 2
fs/hfs/inode.c | 2
fs/ncpfs/inode.c | 6
fs/ocfs2/cluster/quorum.c | 2
fs/xfs/xfs_super.c | 4
fs/xfs/xfs_sync.c | 2
include/linux/workqueue.h | 37 ++++-
kernel/srcu.c | 4
kernel/workqueue.c | 150 +++---------------------
net/9p/trans_fd.c | 2
net/ceph/messenger.c | 2
net/dsa/dsa.c | 2
net/nfc/core.c | 6
net/nfc/hci/core.c | 8 -
net/nfc/hci/shdlc.c | 4
net/nfc/llcp/llcp.c | 18 --
security/keys/gc.c | 8 -
security/keys/key.c | 2
sound/i2c/other/ak4113.c | 2
sound/i2c/other/ak4114.c | 2
sound/pci/oxygen/oxygen_lib.c | 8 -
sound/soc/codecs/wm8350.c | 2
sound/soc/codecs/wm8753.c | 2
sound/soc/soc-core.c | 6
virt/kvm/eventfd.c | 2
109 files changed, 221 insertions(+), 351 deletions(-)
--
tejun
[1] http://thread.gmane.org/gmane.linux.kernel/1340224
[2] http://thread.gmane.org/gmane.linux.kernel/1340333
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists