linux-kernel - Re: [PATCH RFC] Watchdog: sbsa


Message-ID: <20160504155932.GH13045@dhcppc6.redhat.com>
Date:	Wed, 4 May 2016 21:29:32 +0530
From:	Pratyush Anand <panand@...hat.com>
To:	Timur Tabi <timur@...eaurora.org>
Cc:	Guenter Roeck <linux@...ck-us.net>, fu.wei@...aro.org,
	Suravee.Suthikulpanit@....com, wim@...ana.be,
	linux-arm-kernel@...ts.infradead.org,
	linux-watchdog@...r.kernel.org,
	open list <linux-kernel@...r.kernel.org>,
	Dave Young <dyoung@...hat.com>
Subject: Re: [PATCH RFC] Watchdog: sbsa_gwdt: Enhance timeout range

+Dave

Hi Timur,

On 04/05/2016:09:21:43 AM, Timur Tabi wrote:
> Pratyush Anand wrote:
> >static irqreturn_t sbsa_gwdt_interrupt(int irq, void *dev_id)
> >{
> >+    struct sbsa_gwdt *gwdt = (struct sbsa_gwdt *)dev_id;
> >+    struct watchdog_device *wdd = &gwdt->wdd;
> >+    u64 timeout = (u64)gwdt->clk * wdd->timeout;
> >+
> >+    writeq(timeout + arch_counter_get_cntvct(),
> >+                    gwdt->control_base + SBSA_GWDT_WCV);
> >+
> >      panic(WATCHDOG_NAME " timeout");
> 
> I'm on the fence about this.
> 
> On one hand, I have always opposed the idea that the interrupt handler needs
> to function properly in order for the timeout to be correct.  Fu's original
> patch required this for every timeout.
> 
> The current code, however, only uses the interrupt when action=1.  In this
> case, WCV is only reprogrammed in order to prevent the system from resetting
> during the kexec.  Technically, the watchdog timeout has already been
> handled.

Yes.
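
For readers following the quoted hunk, the value written to WCV is the current
counter value plus the timeout converted to clock ticks. Below is a minimal
user-space sketch of that arithmetic only. The helper name `wcv_deadline` and
its parameters are hypothetical stand-ins for `arch_counter_get_cntvct()`,
`gwdt->clk` and `wdd->timeout`; it is not the driver code.

```c
#include <stdint.h>

/*
 * Hypothetical helper mirroring the arithmetic in the quoted hunk:
 *   compare value = current counter + (clock rate in Hz * timeout in seconds)
 * The cast keeps the multiplication in 64 bits, as the (u64) cast does in
 * the patch.
 */
static uint64_t wcv_deadline(uint64_t now_ticks, uint32_t clk_hz,
                             uint32_t timeout_s)
{
    return now_ticks + (uint64_t)clk_hz * timeout_s;
}
```

With an assumed 100 MHz counter and a 10 second timeout, the deadline lands
1,000,000,000 ticks after the current counter value.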

> 
> However, this should be unnecessary, because it can't be a problem that's
> unique to the SBSA watchdog.  Every system that kexecs another kernel needs
> to be able to handle a watchdog timeout.  Shouldn't the kexec code already
> ping or disable the watchdog?  We need a cross-platform solution.  Drivers
> should not need to do this.

It's unique to SBSA because the timeout here is very short. Upstream kexec-tools
has no mechanism for handling a watchdog timeout. Even if we implemented a
framework there, the best it could do is ping the watchdog again. Disabling
should not be an option in kexec-tools, because then if kexec-tools or the
secondary kernel gets stuck, we won't have a way out.
Now, even if we ping it once in kexec-tools, we will have to make sure that the
watchdog driver's probe is called before the timeout. Therefore, the user must
have a way to specify this timeout, so that if a particular kernel takes more
time to boot, they can increase it. Given these variable conditions, I do not
see much advantage in implementing it in kexec-tools.
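
To put a number on "very short": a rough sketch, assuming the limit comes from
a 32-bit watchdog offset register that counts ticks of the system counter. The
helper name `max_timeout_s` and the clock rates used with it are illustrative
assumptions, not values taken from this thread.

```c
#include <stdint.h>

/*
 * Hypothetical helper: the longest timeout, in whole seconds, that fits in a
 * 32-bit tick count at the given clock rate. Assumes clk_hz is non-zero.
 */
static uint32_t max_timeout_s(uint32_t clk_hz)
{
    return UINT32_MAX / clk_hz;
}
```

Under these assumptions a 100 MHz counter allows at most 42 seconds and a
20 MHz counter at most 214 seconds, which is the window in which the second
kernel would have to reach the watchdog driver's probe after a single ping.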

However, the Fedora/RHEL kdumpctl mechanism does some best-effort correction. It
makes sure that the watchdog module is loaded in the second kernel if the
watchdog was active in the first kernel, and that it is loaded as early as
possible [1].

~Pratyush

[1] https://github.com/pratyushanand/kexec-tools/commits/watchdog_fmaster