Message-ID: <87tt3trlhw.ffs@tglx>
Date: Thu, 03 Jul 2025 23:51:39 +0200
From: Thomas Gleixner <tglx@...utronix.de>
To: Himanshu Madhani <himanshu.madhani@...cle.com>
Cc: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: System hang with latest kernel v6.16.0-rc1 (rc2 & rc3)
On Thu, Jul 03 2025 at 20:34, Himanshu Madhani wrote:
>> On Jul 3, 2025, at 13:21, Thomas Gleixner <tglx@...utronix.de> wrote:
>> The problem is obvious and if you would have enabled
>> CONFIG_PROVE_LOCKING then you would have got the reason presented on a
>> silver platter in dmesg. I encourage you to do so nevertheless.
>>
> Great tip on this. I’ll keep that in mind for future debugging efforts.
Actually the very first step in testing a new kernel should be to
run it with a copious amount of debug options enabled. That avoids all
the headaches of painfully chasing, later on, the fallout which those
options would have caught right away.
I'm truly surprised that this is not done already and that testing blindly
assumes that rc1 has already been completely subjected to such tests.
It's bloody obvious that with a code base of the complexity of the
kernel and its gazillion drivers, the CI coverage is far from
complete and only best effort.
Obviously the companies which have access to and care about their
specialized hardware should run CI against linux-next to begin
with. Then such problems would be caught way before they hit Linus' tree.
I know there is no budget for this kind of effort. Companies rather
waste their budget on chasing problems, which could have been avoided
upfront. That's a huge cost saving, which is proven by applying magic to
the relevant Excel-sh*ts.
Not your decision, I know.
Shrug,
tglx