linux-kernel - [PATCHSET wq/for-6.5] workqueue: Implement automatic CPU intensive detection and add monitoring

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Message-Id: <20230418205159.724789-1-tj@kernel.org>
Date:   Tue, 18 Apr 2023 10:51:54 -1000
From:   Tejun Heo <tj@...nel.org>
To:     jiangshanlai@...il.com
Cc:     torvalds@...ux-foundation.org, peterz@...radead.org,
        linux-kernel@...r.kernel.org, kernel-team@...a.com
Subject: [PATCHSET wq/for-6.5] workqueue: Implement automatic CPU intensive detection and add monitoring

Hello,

To reduce the number of concurrent worker threads, workqueue holds back
starting per-cpu work items while the previous work item stays in the
RUNNING state. As such a per-cpu work item which consumes a lot of CPU
cycles, even if it has cond_resched()'s in the right places, can stall other
per-cpu work items.

To support per-cpu work items that may occupy the CPU for a substantial
period of time, workqueue has WQ_CPU_INTENSIVE flag which exempts work items
issued through the marked workqueue from concurrency management - they're
started immediately and don't block other work items. While this works, it's
error-prone in that a workqueue user can easily forget to set the flag or
set it unnecessarily. Furthermore, the impacts of the wrong flag setting can
be rather indirect and challenging to root-cause.

This patchset makes workqueue auto-detect CPU intensive work items based on
CPU consumption. If a work item consumes more than the threshold (5ms by
default) of CPU time, it's automatically marked as CPU intensive when it
gets scheduled out which unblocks starting of pending per-cpu work items.

The mechanism isn't foolproof in that the detection delays can add up if
many CPU-hogging work items are queued at the same time. However, in such
situations, the bigger problem likely is the CPU being saturated with
per-cpu work items and the solution would be making them UNBOUND. Future
changes will make UNBOUND workqueues more attractive by improving their
locality behaviors and eventually remove the explicit WQ_CPU_INTENSIVE flag.

While at it, add statistics and a monitoring script. Lack of visibility has
always been a bit of pain point when debugging workqueue related issues and
with this change and more drastic ones planned for workqueue, this is a good
time to address the shortcoming.

This patchset was born out of the discussion in the following thread:

 https://lkml.kernel.org/r/CAHk-=wgE9kORADrDJ4nEsHHLirqPCZ1tGaEPAZejHdZ03qCOGg@mail.gmail.com

and contains the following five patches:

 0001-workqueue-sched-Notify-workqueue-of-scheduling-of-RU.patch
 0002-workqueue-Re-order-struct-worker-fields.patch
 0003-workqueue-Move-worker_set-clr_flags-upwards.patch
 0004-workqueue-Automatically-mark-CPU-hogging-work-items-.patch
 0005-workqueue-Add-pwq-stats-and-a-monitoring-script.patch

and also available in the following git branch:

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git auto-cpu-intensive

diffstat follows. Thanks.

 Documentation/core-api/workqueue.rst |   30 ++++++++
 kernel/sched/core.c                  |   18 +----
 kernel/workqueue.c                   |  221 +++++++++++++++++++++++++++++++++++++++-----------------------
 kernel/workqueue_internal.h          |   14 +--
 tools/workqueue/wq_monitor.py        |  148 +++++++++++++++++++++++++++++++++++++++++
 5 files changed, 333 insertions(+), 98 deletions(-)

--
tejun