Open Source and information security mailing list archives
 
Date:   Tue, 15 Jan 2019 19:27:31 -0700
From:   Jerry Hoemann <jerry.hoemann@....com>
To:     Ivan Mironov <mironov.ivan@...il.com>
Cc:     linux-watchdog@...r.kernel.org, linux-kernel@...r.kernel.org,
        Wim Van Sebroeck <wim@...ux-watchdog.org>,
        Guenter Roeck <linux@...ck-us.net>
Subject: Re: [RFC PATCH 1/4] watchdog: hpwdt: Don't disable watchdog on NMI

On Mon, Jan 14, 2019 at 07:36:14AM +0500, Ivan Mironov wrote:
> Existing code disables watchdog on NMI right before completely hanging
> the system.
> 
> There are two problems here:
> 
>  * First, watchdog is expected to reset the system in a case of such
>    failure, no matter what.

Documentation/watchdog/watchdog-api.txt

explicitly allows for pretimeout NMI and generation of kernel crash dumps.

With hpwdt_stop removed, the system will likely fail to produce a crash
dump, as there are only 9 seconds between receipt of an NMI and the iLO
resetting the system.

Unfortunately, kdump is not without issues and can also be difficult
to configure properly; either can result in a failure to dump
and reset.

For customers who value availability over kdump collection, the
pretimeout NMI can be disabled; the hardware will then not issue the
pretimeout NMI and will only reset the system.

A middle ground for those who want tombstones but not kdump would
be to leave the pretimeout NMI enabled and add "panic=N" to the
Linux command line.  That way, after the panic, the tombstone is
printed and the system resets after N seconds.
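
The same timeout can also be set at run time through the kernel.panic
sysctl (/proc/sys/kernel/panic) instead of the command line.  A minimal
sketch follows; the helper name set_panic_timeout is hypothetical, and
the path is taken as a parameter only so the function can be exercised
against an ordinary file.

```c
#include <stdio.h>

/*
 * Write a panic timeout, in seconds, to a sysctl-style file.
 *
 * path is normally "/proc/sys/kernel/panic", which requires root.
 * A positive value makes the kernel reboot that many seconds after
 * a panic.
 *
 * Returns 0 on success, -1 if the file could not be opened or written.
 */
static int set_panic_timeout(const char *path, int seconds)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;

	if (fprintf(f, "%d\n", seconds) < 0) {
		fclose(f);
		return -1;
	}

	/* fclose() flushes; a failure here means the write did not land. */
	return fclose(f) == 0 ? 0 : -1;
}
```

For example, set_panic_timeout("/proc/sys/kernel/panic", 10) would
request a reset 10 seconds after a panic.  N should be chosen short
enough that the reset is not simply preempted by the iLO.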



>  * Second, this code has no effect if there are more than one watchdog.

That is correct.  Hpwdt will not turn off any other WDT.

I don't see a current method of notifying other watchdogs
that a given watchdog is going to take the system down.

The closest hook I see is watchdog_notify_pretimeout, but I don't
see that notifying other WDTs.  It's not clear to me that it should.
(e.g. the second WDT could be of longer duration and protect against
kdump hanging. This would need to be thought through.)



> 
> Signed-off-by: Ivan Mironov <mironov.ivan@...il.com>
> ---
>  drivers/watchdog/hpwdt.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/drivers/watchdog/hpwdt.c b/drivers/watchdog/hpwdt.c
> index ef30c7e9728d..2467e6bc25c2 100644
> --- a/drivers/watchdog/hpwdt.c
> +++ b/drivers/watchdog/hpwdt.c
> @@ -170,8 +170,6 @@ static int hpwdt_pretimeout(unsigned int ulReason, struct pt_regs *regs)
>  	if (ilo5 && !pretimeout && !mynmi)
>  		return NMI_DONE;
>  
> -	hpwdt_stop();
> -
>  	hex_byte_pack(panic_msg, mynmi);
>  	nmi_panic(regs, panic_msg);
>  
> -- 
> 2.20.1

-- 

-----------------------------------------------------------------------------
Jerry Hoemann                  Software Engineer   Hewlett Packard Enterprise
-----------------------------------------------------------------------------
