linux-kernel - [PATCH v2 0/5] workqueue: Debugging improvements

Message-Id: <20230307125335.28805-1-pmladek@suse.com>
Date:   Tue,  7 Mar 2023 13:53:30 +0100
From:   Petr Mladek <pmladek@...e.com>
To:     Tejun Heo <tj@...nel.org>
Cc:     Lai Jiangshan <jiangshanlai@...il.com>,
        Michal Koutny <mkoutny@...e.com>, linux-kernel@...r.kernel.org,
        Petr Mladek <pmladek@...e.com>
Subject: [PATCH v2 0/5] workqueue: Debugging improvements

The workqueue watchdog provides a lot of information when a stall is
detected. The report says a lot about what workqueues and worker pools
are active and what is being blocked. Unfortunately, it does not provide
much information about what caused the stall.

In particular, it did not help me to get to the root of the following
problems:

    + New workers were not created because the system reached PID limit.
      Admins limited it too much in a cloud.

    + A networking driver was not loaded because systemd killed modprobe
      when switching the root from initrd to the booted system.

      It was surprisingly quite reproducible. Interrupts are not handled
      immediately in kernel code. The wait in kthread_create_on_node()
      was one of the few locations where they are. So the race window
      evidently was not trivial.

The 1st patch fixes a misleading "hung" time report.

The 2nd, 3rd, and 4th patches add warnings into create_worker() and
create_rescuer().

The 5th patch prints backtraces of CPU-bound workers that might
block CPU-bound workqueues. The candidate is well defined to keep
the number of backtraces small. It always printed only the right one
during my testing.

The first 4 patches would have helped me to debug the real problems
that I met.

The 5th patch is theoretical. I did not see this case in practice.
But it looks realistic enough. And it worked very well when I
simulated the problem. IMHO, it should be pretty useful.

Changes against v1:

  + Used pr_err_once() instead of the complicated code synchronizing
    the error messages with the watchdog interval.

    I also tried the standard ratelimit API, but it was not really
    usable. The synchronization with the watchdog was bad and the error
    messages touched/restarted the watchdog timestamp in an unreliable
    way. In fact, we wanted something like a reset-able pr_once().

  + Added "cpu_stall" into struct worker_pool.

  + Renamed the functions for printing backtraces of hogging CPU-bound
    workers and cleaned up the code.
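
To illustrate the "reset-able pr_once()" idea mentioned in the first
item above: pr_err_once() prints a message only the first time its call
site is reached and offers no way to re-arm it. The following is only a
userspace sketch of what a re-armable variant might look like. It is not
code from this series; struct once_flag, report_once() and
report_once_reset() are hypothetical names invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace model; not kernel code from this series. */
struct once_flag {
	bool printed;			/* message emitted since last reset */
	unsigned int suppressed;	/* calls swallowed since last reset */
};

/*
 * Print @msg only on the first call since the last reset.
 * Returns true when the message was actually printed.
 */
static bool report_once(struct once_flag *f, const char *msg)
{
	assert(f && msg);

	if (f->printed) {
		f->suppressed++;
		return false;
	}

	f->printed = true;
	fprintf(stderr, "%s\n", msg);
	return true;
}

/*
 * Re-arm the flag, e.g. when a new reporting period starts.
 * Returns how many messages were suppressed in the meantime.
 */
static unsigned int report_once_reset(struct once_flag *f)
{
	unsigned int n;

	assert(f);

	n = f->suppressed;
	f->printed = false;
	f->suppressed = 0;
	return n;
}
```

In this model the caller decides when a new reporting period starts and
calls report_once_reset() at that point. The sketch is single-threaded
and ignores the locking that a real kernel helper would need.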

Petr Mladek (5):
  workqueue: Fix hung time report of worker pools
  workqueue: Warn when a new worker could not be created
  workqueue: Interrupted create_worker() is not a repeated event
  workqueue: Warn when a rescuer could not be created
  workqueue: Print backtraces from CPUs with hung CPU bound workqueues

 kernel/workqueue.c | 102 +++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 95 insertions(+), 7 deletions(-)

-- 
2.35.3