lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <1344908986-18152-1-git-send-email-tj@kernel.org>
Date:	Mon, 13 Aug 2012 18:49:40 -0700
From:	Tejun Heo <tj@...nel.org>
To:	linux-kernel@...r.kernel.org
Cc:	torvalds@...ux-foundation.org, akpm@...ux-foundation.org,
	mingo@...hat.com, lauro.venancio@...nbossa.org, jak@...-linux.org
Subject: [PATCHSET] workqueue: make all workqueues non-reentrant

Hello,

Unless explicitly specified, a workqueued is reentrant, which might
not even be the correct term.  If a work item is executing on a CPU
but not pending anywhere, the work item can be queued on a different
CPU, and that CPU might start executing the work item whether the
previous execution on the first CPU is finished or not.  In short,
while a work item can be queued on only one CPU, it may be executing
concurrently on any subset of the CPUs in the system.

This behavior in itself is already confusing and error-prone; however,
this causes much worse complication in flush operations.  Because a
given work item could be executing on any number of CPUs, even if no
further queueing happens after invocation, flush_work() can't
guarantee that the work item is idle on return without checking all
CPUs.  Being a relatively common operation, checking all CPUs on each
invocation was deemed too costly.  It only verifies the instance
queued last and there's a separate version which checks all CPUs -
flush_work_sync().

Also, due to the way last CPU is tracked, if there are parallel
queueing operations, the following sequence

	schedule_work(A);
	flush_work(A);

can't guarantee that the instance queued right above is finished.  If
someone else queues A on another CPU inbetween, flush_work() either
becomes noop or flushes that instance which may finish earlier than
the one from the above.

In the end, workqueue is a pain in the ass to get completely correct
and the breakages are very subtle.  Depending on queueing pattern,
assuming non-reentrancy might work fine most of the time.  The effect
of using flush_work() where flush_work_sync() should be used could be
a lot more subtle.  flush_work() becoming noop would happen extremely
rarely for most users but it definitely is there.

A tool which is used as widely as workqueue shouldn't be this
difficult and error-prone.  This is just silly.

Timer does the right thing.  It *always* checks whether a timer is
still executing on the previous CPU and queues it there if so much
like how WQ_NON_REENTRANT makes workqueue behave.  WQ_NON_REENTRANT
guarantees that any given work item is executing on only one CPU at
maximum and if both pending and executing, both are on the same CPU.
This makes the behaviors of work item execution and flush_work() much
more sane and flush_work_sync() unnecessary.

This patchset enforces WQ_NON_REENTRANT behavior on all workqueues and
simplifies workqueue API accordingly.

This adds an additional busy_hash lookup if the work item was
previously queued on a different CPU.  This shouldn't be noticeable
under any sane workload.  Work item queueing isn't a very
high-frequency operation and they don't jump across CPUs all the time.
In a micro benchmark to exaggerate this difference - measuring the
time it takes for two work items to repeatedly jump between two CPUs a
number (10M) of times with busy_hash table densely populated, the
difference was around 3%.

While the overhead is measureable, it is only visible in pathological
cases and the difference isn't huge.  This change brings much needed
sanity to workqueue and makes its behavior consistent with timer.  I
think this is the right tradeoff to make.

This patchset contains the following six patches.

 0001-workqueue-make-all-workqueues-non-reentrant.patch
 0002-workqueue-gut-flush-_delayed-_work_sync.patch
 0003-workqueue-gut-system_nrt-_freezable-_wq.patch
 0004-workqueue-deprecate-flush-_delayed-_work_sync.patch
 0005-workqueue-deprecate-system_nrt-_freezable-_wq.patch
 0006-workqueue-deprecate-WQ_NON_REENTRANT.patch

0001 makes all workqueues non-reentrant.

0002-0003 simplify workqueue API accordingly.

0004-0006 deprecate now unnecessary parts of workqueue API and convert
their users.  Most conversions are straight-forward but 0006 contains
some non-trivial ones.  If you work on NFC and drivers/staging/nvec,
please review them.

This patchset is on top of

  wq/for-3.7 1265057fa02c7bed3b6d9ddc8a2048065a370364
+ [1] "timer: clean up initializers and implement irqsafe" patchset
+ [2] "workqueue: use irqsafe timer in delayed_work" patchset

and available in the following git branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git review-wq-always-nrt

Thanks.  diffstat follows.

 Documentation/workqueue.txt                     |   14 --
 arch/arm/mach-pxa/sharpsl_pm.c                  |    4 
 arch/arm/plat-omap/mailbox.c                    |    2 
 arch/sh/drivers/push-switch.c                   |    2 
 block/blk-throttle.c                            |    7 -
 block/genhd.c                                   |   10 -
 drivers/block/xen-blkfront.c                    |    4 
 drivers/cdrom/gdrom.c                           |    2 
 drivers/char/sonypi.c                           |    2 
 drivers/char/tpm/tpm.c                          |    4 
 drivers/firewire/core-transaction.c             |    3 
 drivers/gpu/drm/drm_crtc_helper.c               |    6 
 drivers/gpu/drm/exynos/exynos_drm_g2d.c         |    2 
 drivers/gpu/drm/i915/i915_dma.c                 |    6 
 drivers/gpu/drm/nouveau/nouveau_gpio.c          |    2 
 drivers/gpu/drm/radeon/radeon_irq_kms.c         |    2 
 drivers/gpu/drm/vmwgfx/vmwgfx_fb.c              |    2 
 drivers/hid/hid-picolcd.c                       |    2 
 drivers/hid/hid-wiimote-ext.c                   |    2 
 drivers/input/touchscreen/wm831x-ts.c           |    2 
 drivers/isdn/mISDN/hwchannel.c                  |    4 
 drivers/leds/leds-lm3533.c                      |    6 
 drivers/leds/leds-lp8788.c                      |    2 
 drivers/leds/leds-wm8350.c                      |    2 
 drivers/macintosh/ams/ams-core.c                |    2 
 drivers/md/dm-crypt.c                           |   10 -
 drivers/md/dm-kcopyd.c                          |    3 
 drivers/md/dm-mpath.c                           |    2 
 drivers/md/dm-raid1.c                           |    5 
 drivers/md/dm-stripe.c                          |    2 
 drivers/md/dm.c                                 |    3 
 drivers/media/dvb/dvb-core/dvb_net.c            |    4 
 drivers/media/dvb/mantis/mantis_evm.c           |    2 
 drivers/media/dvb/mantis/mantis_uart.c          |    2 
 drivers/media/video/bt8xx/bttv-driver.c         |    2 
 drivers/media/video/cx18/cx18-driver.c          |    2 
 drivers/media/video/cx231xx/cx231xx-cards.c     |    2 
 drivers/media/video/cx23885/cx23885-input.c     |    6 
 drivers/media/video/cx88/cx88-mpeg.c            |    2 
 drivers/media/video/em28xx/em28xx-cards.c       |    2 
 drivers/media/video/omap24xxcam.c               |    6 
 drivers/media/video/saa7134/saa7134-core.c      |    2 
 drivers/media/video/saa7134/saa7134-empress.c   |    2 
 drivers/media/video/tm6000/tm6000-cards.c       |    2 
 drivers/mfd/menelaus.c                          |    4 
 drivers/misc/ioc4.c                             |    2 
 drivers/mmc/core/host.c                         |    4 
 drivers/mmc/host/dw_mmc.c                       |    3 
 drivers/mtd/mtdoops.c                           |    4 
 drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c |    2 
 drivers/net/ethernet/neterion/vxge/vxge-main.c  |    2 
 drivers/net/ethernet/sun/cassini.c              |    2 
 drivers/net/ethernet/sun/niu.c                  |    2 
 drivers/net/virtio_net.c                        |   12 -
 drivers/net/wireless/hostap/hostap_ap.c         |    4 
 drivers/net/wireless/hostap/hostap_hw.c         |   10 -
 drivers/nfc/pn533.c                             |    5 
 drivers/power/collie_battery.c                  |    2 
 drivers/power/tosa_battery.c                    |    2 
 drivers/power/wm97xx_battery.c                  |    2 
 drivers/power/z2_battery.c                      |    2 
 drivers/regulator/core.c                        |    2 
 drivers/scsi/arcmsr/arcmsr_hba.c                |    4 
 drivers/scsi/ipr.c                              |    2 
 drivers/scsi/pmcraid.c                          |    2 
 drivers/scsi/qla2xxx/qla_target.c               |    2 
 drivers/staging/nvec/nvec.c                     |   10 -
 drivers/staging/nvec/nvec.h                     |    2 
 drivers/staging/omapdrm/omap_drv.c              |    3 
 drivers/tty/hvc/hvsi.c                          |    2 
 drivers/tty/ipwireless/hardware.c               |    2 
 drivers/tty/ipwireless/network.c                |    4 
 drivers/tty/serial/kgdboc.c                     |    2 
 drivers/tty/serial/omap-serial.c                |    2 
 drivers/tty/tty_ldisc.c                         |    6 
 drivers/usb/atm/speedtch.c                      |    2 
 drivers/usb/atm/ueagle-atm.c                    |    2 
 drivers/usb/gadget/u_ether.c                    |    2 
 drivers/usb/host/ohci-hcd.c                     |    2 
 drivers/usb/otg/isp1301_omap.c                  |    2 
 fs/affs/super.c                                 |    2 
 fs/dlm/ast.c                                    |    5 
 fs/gfs2/lock_dlm.c                              |    2 
 fs/gfs2/main.c                                  |    2 
 fs/gfs2/super.c                                 |    2 
 fs/hfs/inode.c                                  |    2 
 fs/ncpfs/inode.c                                |    6 
 fs/ocfs2/cluster/quorum.c                       |    2 
 fs/xfs/xfs_super.c                              |    4 
 fs/xfs/xfs_sync.c                               |    2 
 include/linux/workqueue.h                       |   37 ++++-
 kernel/srcu.c                                   |    4 
 kernel/workqueue.c                              |  150 +++---------------------
 net/9p/trans_fd.c                               |    2 
 net/ceph/messenger.c                            |    2 
 net/dsa/dsa.c                                   |    2 
 net/nfc/core.c                                  |    6 
 net/nfc/hci/core.c                              |    8 -
 net/nfc/hci/shdlc.c                             |    4 
 net/nfc/llcp/llcp.c                             |   18 --
 security/keys/gc.c                              |    8 -
 security/keys/key.c                             |    2 
 sound/i2c/other/ak4113.c                        |    2 
 sound/i2c/other/ak4114.c                        |    2 
 sound/pci/oxygen/oxygen_lib.c                   |    8 -
 sound/soc/codecs/wm8350.c                       |    2 
 sound/soc/codecs/wm8753.c                       |    2 
 sound/soc/soc-core.c                            |    6 
 virt/kvm/eventfd.c                              |    2 
 109 files changed, 221 insertions(+), 351 deletions(-)

--
tejun

[1] http://thread.gmane.org/gmane.linux.kernel/1340224 
[2] http://thread.gmane.org/gmane.linux.kernel/1340333
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ