lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <k2t3dkh3acoenhxtsd3ekvpnwl5yir6qaun52h5prdfwcx5lsb@h3ieoj7jfu6t>
Date: Thu, 3 Jul 2025 12:12:58 +0200
From: Thierry Reding <thierry.reding@...il.com>
To: Jon Hunter <jonathanh@...dia.com>
Cc: Kartik Rajput <kkartik@...dia.com>, daniel.lezcano@...aro.org, 
	tglx@...utronix.de, linux-kernel@...r.kernel.org, linux-tegra@...r.kernel.org
Subject: Re: [PATCH] clocksource: timer-tegra186: Enable WDT at probe

On Thu, Jul 03, 2025 at 08:55:04AM +0100, Jon Hunter wrote:
> 
> 
> On 03/07/2025 07:55, Thierry Reding wrote:
> > On Mon, Jun 30, 2025 at 04:31:35PM +0530, Kartik Rajput wrote:
> > > Currently, if the system crashes or hangs during kernel boot before
> > > userspace initializes and configures the watchdog timer, then the
> > > watchdog won’t be able to recover the system as it’s not running. This
> > > becomes crucial during an over-the-air update, where if the newly
> > > updated kernel crashes on boot, the watchdog is needed to reset the
> > > device and boot into an alternative system partition. If the watchdog
> > > is disabled in such scenarios, it can lead to the system getting
> > > bricked.
> > > 
> > > Enable the WDT during driver probe to allow recovery from any crash/hang
> > > seen during early kernel boot. Also, disable interrupts once userspace
> > > starts pinging the watchdog.
> > > 
> > > Signed-off-by: Kartik Rajput <kkartik@...dia.com>
> > > ---
> > >   drivers/clocksource/timer-tegra186.c | 42 ++++++++++++++++++++++++++++
> > >   1 file changed, 42 insertions(+)
> > 
> > This seems dangerous to me. It means that if the operating system
> > doesn't start some sort of watchdog service in userspace that pings the
> > watchdog, the system will reboot 120 seconds after the watchdog probe.
> 
> 
> I don't believe that will happen with this change. The kernel will continue
> to pet the watchdog until userspace takes over with this change. At least
> that is my understanding.

Ah yes... I skipped over that IRQ handling bit. However, I think this
still violates the assumptions because the driver will keep petting the
watchdog no matter what, which means that we now have no way of forcing
a reset of the system when userspace hangs. As long as just a tiny part
of the kernel keeps running, the watchdog would keep getting petted and
prevent it from resetting the system.

Using a second watchdog still seems like a more robust alternative. Or
maybe we can find a way to remove the kernel petting once userspace
starts the watchdog.

Thierry

Download attachment "signature.asc" of type "application/pgp-signature" (834 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ