[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAD=FV=VOjzdBvyx9-g=Mvv6SJvur5ogtUbRfR3oCFpQoQ4VK3w@mail.gmail.com>
Date: Wed, 8 Jan 2025 09:27:17 -0800
From: Doug Anderson <dianders@...omium.org>
To: wzs <wangzhengshu39@...il.com>
Cc: tglx@...utronix.de, liusong@...ux.alibaba.com, akpm@...ux-foundation.org,
pmladek@...e.com, kernelfans@...il.com, deller@....de, npiggin@...il.com,
tsbogend@...ha.franken.de, James.Bottomley@...senpartnership.com,
jan.kiszka@...mens.com, linux-kernel@...r.kernel.org,
linux-mips@...r.kernel.org, linux-parisc@...r.kernel.org,
linuxppc-dev@...ts.ozlabs.org, yaoma@...ux.alibaba.com
Subject: Re: watchdog: BUG: soft lockup
Hi,
On Sun, Dec 22, 2024 at 10:32 PM wzs <wangzhengshu39@...il.com> wrote:
>
> Hello,
> when fuzzing the Linux kernel,
> I triggered many "watch: BUG: soft lockup" warnings.
> I am not sure whether this is an issue with the kernel or with the
> fuzzing program I ran.
> (The same fuzzing program, when tested on kernel versions from
> Linux-6.7.0 to 6.12.0, triggers the 'watchdog: BUG: soft lockup'
> warning on some versions, while others do not. Linux 6.12.0 is the
> latest stable release where this error occurs.)
>
> The bug information I provided below is from the Linux-6.12.0 kernel.
> If you need bug information from other versions, I would be happy to provide it.
>
> kernel config :https://pastebin.com/i4LPXNAN
> console output :https://pastebin.com/uKVpvJ78
IMO it's nearly always a bug if userspace can cause the kernel to soft
lockup. I'd expect this isn't a bug in the soft lockup detector but a
problem in whatever part of the kernel you're fuzzing. For some
details of the soft lockup detector, see
`Documentation/admin-guide/lockup-watchdogs.rst`.
Presumably you're fuzzing the kernel in a way that causes it to enter
a big loop while preemption is disabled, or something like that.
Presumably the kernel should be detecting something invalid that
userspace did and that would keep it from looping so long.
I tried looking at your pastebin and probably what's going on is
somewhere hidden in there, but unfortunately the beginning of the logs
are a bit jumbled since it looks like the RCU warning and the soft
lockup warning happened at about the same time and their stuff is
jumbled. There's also a lot of tasks to go through. Honestly, it's
probably less work just to look at whatever you were trying to fuzz to
help you pinpoint the problem.
I'll also note that you seem to be using KASAN and are running in a
virtual machine. It's not inconceivable that's contributing to your
problems. KASAN makes things _a lot_ slower and a VM may be getting
its time stolen by the host.
-Doug
Powered by blists - more mailing lists