linux-kernel - Re: [PATCH] watchdog: wdat_wdt: Set the min and max timeout values properly

Message-ID: <5ee31cd1-76af-dae7-0902-3808a2696754@roeck-us.net>
Date:   Mon, 19 Sep 2022 05:54:33 -0700
From:   Guenter Roeck <linux@...ck-us.net>
To:     Jean Delvare <jdelvare@...e.de>
Cc:     linux-watchdog@...r.kernel.org,
        LKML <linux-kernel@...r.kernel.org>,
        Wim Van Sebroeck <wim@...ux-watchdog.org>,
        Mika Westerberg <mika.westerberg@...ux.intel.com>,
        "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>
Subject: Re: [PATCH] watchdog: wdat_wdt: Set the min and max timeout values
 properly

On 9/19/22 02:33, Jean Delvare wrote:
> Hi Guenter,
> 
> A few questions from an old discussion:
> 
> On Mon, 8 Aug 2022 04:36:42 -0700, Guenter Roeck wrote:
>> On 8/5/22 15:07, Jean Delvare wrote:
>>> To be honest, I'm not sold on the idea of a software-emulated
>>> maximum timeout value above what the hardware can do, but if doing
>>> that makes sense in certain situations, then I believe it should be
>>> implemented as a boolean flag (named emulate_large_timeout, for
>>> example) to complement max_timeout instead of a separate time value.
>>> Is there a reason I'm missing, why it was not done that way?
>>
>> There are watchdogs with very low maximum timeout values, sometimes less than
>> 3 seconds. gpio-wdt is one example - some have a maximum value of 2.5 seconds.
>> rzn1_wd is even more extreme with a maximum of 1 second. With such low values,
>> accuracy is important, second-based limits are insufficient, and there is an
>> actual need for software timeout handling on top of hardware.
> 
> Out of curiosity, what prevents user-space itself from pinging
> /dev/watchdog every 0.5 second? I assume hardware using such watchdog
> devices is "special" and would be running finely tuned user-space, so
> the process pinging /dev/watchdog could be given higher priority or
> even real-time status to ensure it runs without delays. Is that not
> sufficient?
> 

It took us forever to get the in-kernel support stable, using the right timers
and making sure that the kernel actually executes the code fast enough. Maybe
that would work nowadays from a userspace process with the right permissions,
but I would not trust it. Then there is watchdog support in, for example,
systemd. I would not want to force users to run systemd as a high-priority
real-time process just to make an odd watchdog work. I also would not want to
tell people that they must not use the systemd watchdog timer to make their
watchdog work.
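
For reference, the userspace approach being asked about can be sketched roughly
as below. This is only an illustration, not code from any existing daemon:
ping_period() and ping_loop() are hypothetical names, the real-time priority
value is arbitrary, and pinging at half the hardware timeout is an assumption
that matches the "1 second watchdog, ping every 0.5 second" case above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <linux/watchdog.h>
#include <sched.h>
#include <sys/ioctl.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: sleep time between pings, half the hardware timeout. */
static struct timespec ping_period(unsigned int hw_timeout_ms)
{
	unsigned int half_ms = hw_timeout_ms / 2;
	struct timespec ts;

	ts.tv_sec = half_ms / 1000;
	ts.tv_nsec = (long)(half_ms % 1000) * 1000000L;
	return ts;
}

/*
 * Hypothetical ping loop. Returns -1 if the device cannot be opened or a
 * ping fails. Note that closing the device without the "magic close"
 * sequence usually leaves the watchdog running.
 */
int ping_loop(const char *path, unsigned int hw_timeout_ms)
{
	struct sched_param sp = { .sched_priority = 10 }; /* arbitrary value */
	struct timespec period = ping_period(hw_timeout_ms);
	int fd = open(path, O_WRONLY);

	if (fd < 0)
		return -1;

	/* Best effort: needs privileges, failure is ignored in this sketch. */
	(void)sched_setscheduler(0, SCHED_FIFO, &sp);

	for (;;) {
		if (ioctl(fd, WDIOC_KEEPALIVE, 0) < 0)
			break;
		nanosleep(&period, NULL);
	}

	close(fd);
	return -1;
}
```

Every process that wants to own the watchdog would need a loop like this with
suitable scheduling, which is the burden the in-kernel pinging avoids.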

Also, there is no guarantee that odd hardware with such a watchdog is always
used in an application where such a fast timeout is needed or even wanted.

On top of that, the code in the kernel also now supports "ping until opened"
for systems where the watchdog is already running when the system boots.

Overall, I don't think it would be a good idea to revert the in-kernel support
of pinging watchdogs.
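
To make the idea concrete, here is a simplified model of a software timeout
layered on top of a short hardware timeout. It is a sketch under stated
assumptions, not the watchdog core's actual code: struct soft_wdt and all
function names are hypothetical, and times are plain millisecond counters.
The kernel keeps pinging the hardware on behalf of userspace, and stops once
another hardware period would no longer fit before the software deadline.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a software timeout on top of a hardware watchdog. */
struct soft_wdt {
	uint32_t timeout_ms;        /* timeout requested by userspace */
	uint32_t max_hw_ms;         /* longest timeout the hardware supports */
	uint64_t last_user_ping_ms; /* time of the last userspace ping */
};

/* Hardware timeout actually programmed: the smaller of the two limits. */
static uint32_t hw_period_ms(const struct soft_wdt *w)
{
	assert(w->timeout_ms > 0 && w->max_hw_ms > 0);
	return w->timeout_ms < w->max_hw_ms ? w->timeout_ms : w->max_hw_ms;
}

/* Assumed policy: ping the hardware at half its programmed timeout. */
static uint32_t hw_ping_interval_ms(const struct soft_wdt *w)
{
	return hw_period_ms(w) / 2;
}

/*
 * The hardware fires one hardware period after its last ping, so the last
 * allowed ping is at (software deadline - hardware period). After that
 * point the hardware is left alone and resets the system at the deadline.
 */
static bool may_ping_hw(const struct soft_wdt *w, uint64_t now_ms)
{
	uint64_t deadline = w->last_user_ping_ms + w->timeout_ms;

	return now_ms + hw_period_ms(w) <= deadline;
}
```

With a 1 second hardware limit and a 60 second userspace timeout, this model
pings the hardware every 500 ms and stops pinging 59 seconds after the last
userspace ping.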

>> At the same time, there is actually a need to make timeouts millisecond-based
>> instead of second-based, for uses such as medical devices where timeouts need
>> to be short and accurate. The only reason for not implementing this is that
>> the proposals I have seen so far (including mine) were too messy for my liking,
>> and I never had the time to clean them up. Reverting millisecond support would
>> be completely the wrong direction.
> 
> I might look into this at some point (for example as a SUSE Hackweek
> project). Did you post your work somewhere? I'd like to take a look.
> 
There was one submission from someone else if I recall correctly, but mine never
got to the point where it was submittable.

Guenter