linux-kernel - Re: [PATCH v3 0/2] Reduce CPU consumption after panic

Message-ID: <20250430084852.GN4198@noisy.programming.kicks-ass.net>
Date: Wed, 30 Apr 2025 10:48:52 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Carlos Bilbao <bilbao@...edu>
Cc: Andrew Morton <akpm@...ux-foundation.org>, carlos.bilbao@...nel.org,
	tglx@...utronix.de, seanjc@...gle.com, jan.glauber@...il.com,
	pmladek@...e.com, jani.nikula@...el.com,
	linux-kernel@...r.kernel.org, gregkh@...uxfoundation.org,
	takakura@...inux.co.jp, john.ogness@...utronix.de, x86@...nel.org
Subject: Re: [PATCH v3 0/2] Reduce CPU consumption after panic

On Tue, Apr 29, 2025 at 03:52:05PM -0500, Carlos Bilbao wrote:
> Hello,
> 
> On 4/29/25 17:10, Peter Zijlstra wrote:
> > On Tue, Apr 29, 2025 at 03:32:56PM -0500, Carlos Bilbao wrote:
> > 
> >> Yes, the machine is effectively dead, but as things stand today,
> >> it's still drawing resources unnecessarily.
> >>
> >> Who cares? An example, as mentioned in the cover letter, is Linux running
> > 
> > Ah, see, I didn't have no cover letter, only akpm's reply.
> > 
> >> in VMs. Imagine a scenario where customers are billed based on CPU usage --
> >> having panicked VMs spinning in useless loops wastes their money. In shared
> >> envs, those wasted cycles could be used by other processes/VMs. But this
> >> is as much about the cloud as it is for laptops/embedded/anywhere -- Linux
> >> should avoid wasting resources wherever possible.
> > 
> > So I don't really buy the laptop and embedded case, people tend to look
> > at laptops when open, and get very impatient when they don't respond.
> > Embedded things really should have a watchdog.
> > 
> > Also, should you not be using panic_timeout to auto reboot your machine
> > in all these cases?
> > 
> > In any case, the VM nonsense, do they not have a virtual watchdog to
> > 'reap' crashed VMs or something?
> 
> The key word here is "should." Should embedded systems have a watchdog?
> Maybe. Should I have auto reboot set? Maybe. Perhaps I don’t want to reboot
> until I’ve root-caused the crash.

Install a kdump kernel, or log your serial line :-)

> But my patch set isn’t about “shoulds.”
> What I’m discussing here is (1) the default Linux behavior, 

Well, the default behaviour works for the 'your own physical machine'
thing just fine -- and that has always been the default use-case.

Nobody is going to be sitting there staring at a panic screen for ages.

All the other weirdo cases like embedded and VMs, they're just that,
weirdos and they can keep their pieces :-)

> and (2)
> providing people with the flexibility to do what THEY think they should do,
> not what you think they should do.

Well, there are a ton of options already. As I said, we have watchdogs,
reboots, crash kernels and all sorts. Why do we need more?

All that said... the default more or less does for(;;) { mdelay(100) };
if you have a modern chip, that should not end up using much power at
all. That should end up in delay_halt_tpause() or delay_halt_mwaitx()
(depending on whether you are on Intel or AMD) and spend most of its
time in deep idle states.
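
To make the shape of that loop concrete, here is a minimal userspace sketch in C. It is an analogy only, not kernel code: the names panic_wait_loop(), spin_delay(), sleep_delay(), noop_delay() and PANIC_STEP_MS are hypothetical, the loop is bounded so it terminates, and nanosleep() merely stands in for a delay that lets the CPU rest (the kernel's mdelay() does not sleep). The point it illustrates is that the loop stays the same while the delay implementation decides how busy the CPU is.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Step length taken from the mail's for(;;) { mdelay(100) }, in milliseconds. */
#define PANIC_STEP_MS 100u

/* Hypothetical hook: how one step of waiting is implemented. */
typedef void (*delay_fn)(unsigned int ms);

/* Busy-wait: polls the clock and keeps the CPU occupied until the deadline. */
void spin_delay(unsigned int ms)
{
	struct timespec start, now;
	long long elapsed_ns;

	clock_gettime(CLOCK_MONOTONIC, &start);
	do {
		clock_gettime(CLOCK_MONOTONIC, &now);
		elapsed_ns = (long long)(now.tv_sec - start.tv_sec) * 1000000000LL
			   + (now.tv_nsec - start.tv_nsec);
	} while (elapsed_ns < (long long)ms * 1000000LL);
}

/* Yielding wait: userspace stand-in for a delay that lets the CPU idle. */
void sleep_delay(unsigned int ms)
{
	struct timespec ts = {
		.tv_sec = ms / 1000,
		.tv_nsec = (long)(ms % 1000) * 1000000L,
	};

	nanosleep(&ts, NULL);
}

/* No-op delay, so the loop structure can be exercised without waiting. */
void noop_delay(unsigned int ms)
{
	(void)ms;
}

/*
 * Bounded stand-in for the endless post-panic loop.
 * Returns the nominal time waited, in milliseconds.
 */
unsigned int panic_wait_loop(delay_fn delay, unsigned int max_iters)
{
	unsigned int waited_ms = 0;

	assert(delay != NULL);
	for (unsigned int i = 0; i < max_iters; i++) {
		delay(PANIC_STEP_MS);
		waited_ms += PANIC_STEP_MS;
	}
	return waited_ms;
}
```

Running panic_wait_loop(spin_delay, 50) and panic_wait_loop(sleep_delay, 50) from a small main() under time(1) should show roughly the same wall-clock time but very different user CPU time, which is the distinction being argued about here.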

Is something not working?