linux-kernel - Re: [PATCH 1/2] kernel/watchdog: add /sys/kernel/{hard,soft}lockup


Message-ID: <CAKPOu+_zurvzehn+Wp=gbQxafHP9jBEPM4NcrDzb6Kd2C0MmaA@mail.gmail.com>
Date: Sun, 4 May 2025 08:36:23 +0200
From: Max Kellermann <max.kellermann@...os.com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: song@...nel.org, joel.granados@...nel.org, dianders@...omium.org, 
	cminyard@...sta.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/2] kernel/watchdog: add /sys/kernel/{hard,soft}lockup_count

On Sun, May 4, 2025 at 4:47 AM Andrew Morton <akpm@...ux-foundation.org> wrote:
> Documentation/, please?

Do you mean Documentation/ABI/testing/ ? (like
Documentation/ABI/testing/sysfs-kernel-oops_count)
I'll add that; I was confused by the directory name "testing" and
didn't expect to find actual documentation there.

> >  Having this is useful for monitoring tools.
>
> Useful how?  Use cases?  Examples?

To detect whether the machine is healthy. If the kernel has
experienced a soft lockup, it's probably due to a kernel bug, and I'd
like to detect that quickly and easily. There is currently no way to
detect that other than parsing dmesg or observing indirect effects,
such as certain tasks not responding, and the latter means I would
need to observe all tasks. I'd rather be able to detect the primary
cause easily, just like some people decided that they want to observe
an oops counter and a warning counter.
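
For illustration, here is a minimal sketch of the kind of check a
monitoring tool could do. It assumes the counters appear as plain
decimal integers in /sys/kernel/softlockup_count and
/sys/kernel/hardlockup_count (the files proposed by this patch, so
they won't exist on kernels without it), next to the existing
oops_count and warn_count. The function names are hypothetical.

```python
from pathlib import Path

# oops_count and warn_count already exist in mainline; the two lockup
# counters are the files proposed by this patch (assumed to hold a
# single decimal integer each).
COUNTERS = ("oops_count", "warn_count", "softlockup_count", "hardlockup_count")


def read_counters(base="/sys/kernel"):
    """Return {name: value} for every counter file that exists and parses."""
    values = {}
    for name in COUNTERS:
        try:
            values[name] = int((Path(base) / name).read_text().strip())
        except (OSError, ValueError):
            # File missing (kernel without the counter) or unreadable: skip.
            continue
    return values


def increased(before, after):
    """Return {name: delta} for counters that grew between two snapshots."""
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] > before[name]
    }
```

A monitoring agent would call read_counters() on a timer, keep the
previous snapshot, and raise an alert whenever increased() returns a
non-empty dict. No dmesg parsing is involved.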

We always run the latest stable kernel on our production servers, and
this has brought great sorrow for the last year (I think the big netfs
drama began in 6.9 or so when the pgpriv2 refactoring began). There
have been numerous netfs/NFS/Ceph regressions, we had just as many
production outages, and the maintainers wouldn't respond to my bug
reports, so I had to figure it all out myself.
The latest regression that quickly took down our servers was a
"stable" backport of a performance optimization for epoll in 6.14.4,
leading to soft lockups in ep_poll(), see
https://lore.kernel.org/lkml/20250429185827.3564438-1-max.kellermann@ionos.com/
- but we observed it only after everything had already fallen apart.
Since our main process has switched from epoll to io_uring, only
second-order processes were falling apart. Had we had a soft lockup
counter, we could have noticed it earlier.

> A proposal to permanently extend Linux's userspace API requires better
> justification than an unsubstantiated assertion, surely?

The commits that added warn_count/oops_count literally only said "is a
fairly interesting signal". See commits 9db89b411170 ("exit: Expose
"oops_count" to sysfs") and 8b05aa263361 ("panic: Expose "warn_count"
to sysfs"). That's quite an unsubstantiated assertion, too, isn't it?

I agree with you, but I thought the case for a soft lockup counter
was obvious enough, and I didn't think it needed more justification
than the other counters.

Max