Message-ID: <564E602B.6060606@linaro.org>
Date:	Thu, 19 Nov 2015 16:50:03 -0700
From:	Al Stone <al.stone@...aro.org>
To:	Timur Tabi <timur@...eaurora.org>,
	Guenter Roeck <linux@...ck-us.net>, Fu Wei <fu.wei@...aro.org>
Cc:	Pratyush Anand <panand@...hat.com>, devicetree@...r.kernel.org,
	linux-watchdog@...r.kernel.org, Arnd Bergmann <arnd@...db.de>,
	linux-doc@...r.kernel.org, Jon Masters <jcm@...hat.com>,
	Linaro ACPI Mailman List <linaro-acpi@...ts.linaro.org>,
	"Rafael J. Wysocki" <rjw@...ysocki.net>,
	lkml <linux-kernel@...r.kernel.org>,
	Will Deacon <will.deacon@....com>,
	Wim Van Sebroeck <wim@...ana.be>,
	Rob Herring <robherring2@...il.com>,
	Catalin Marinas <catalin.marinas@....com>,
	Wei Fu <tekkamanninja@...il.com>,
	Jonathan Corbet <corbet@....net>,
	Dave Young <dyoung@...hat.com>,
	Vipul Gandhi <vgandhi@...eaurora.org>
Subject: Re: [Linaro-acpi] [PATCH v8 5/5] Watchdog: introduce ARM SBSA
 watchdog driver

Sorry for the delayed response; I've got some difficult family things to work
on IRL that are taking priority.

On 11/12/2015 05:23 PM, Timur Tabi wrote:
> On 11/12/2015 06:06 PM, Al Stone wrote:
>> If it is a NAK, that's fine, but I also want to be sure I understand what the
>> objections are.  Based on my understanding of the discussion so far over the
>> multiple versions, I think the primary objection is that the use of pretimeout
>> makes this driver too complex, and indeed complex enough that there is some
>> concern that it could destabilize a running system.  Do I have that right?
> 
> I don't have a problem with the concept of pre-timeout per se.  My primary
> objection is this code:
> 
>> +static irqreturn_t sbsa_gwdt_interrupt(int irq, void *dev_id)
>> +{
>> +       struct sbsa_gwdt *gwdt = (struct sbsa_gwdt *)dev_id;
>> +       struct watchdog_device *wdd = &gwdt->wdd;
>> +
>> +       /* We don't use pretimeout, trigger WS1 now */
>> +       if (!wdd->pretimeout)
>> +               sbsa_gwdt_set_wcv(wdd, 0);
> 
> This driver depends on an interrupt handler in order to properly program the
> hardware.  Unlike some other devices, the SBSA watchdog does not need assistance
> to reset on a timeout -- it is a "fire and forget" device.  What happens if
> there is a hard lockup, and interrupts no longer work?

Aha.  I see now.  That helps clarify a lot.  Thanks.

> The reason why Fu does this is because he wants to support a pre-timeout value
> that's independent of the timeout value.  The SBSA watchdog is normally
> programmed where real timeout equals twice the pre-timeout.  I would prefer that
> the driver adhere to this limitation.  That would eliminate the need to
> pre-program the hardware in the interrupt handler.

The "normally programmed" limitation described is interesting; forgive my
ignorance, but where is that specified?  I couldn't find anything that specific
in the SBSA, or the ARM ARM, but I could have missed it.  That being said,
keeping them independent at least seems like a good idea; if I think about
kdump/kexec or some other recovery mechanism wanting to perhaps copy part of
RAM or flush a filesystem/database, or maybe do some other magic to recover
enough to be able to reset the timer, that may be a really long interval on a
large server.  I could easily see that being very different from a watchdog
timer that's meant to just make sure the platform is still making progress.
Conversely, I could see that recovery interval being very small or zero on
a guest OS, for example, while the watchdog timeout itself is still different.
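
To make the "real timeout equals twice the pre-timeout" relationship concrete,
here is a minimal sketch in plain C.  It assumes the behaviour Timur describes:
a refresh arms a single offset, the first signal (WS0) fires when that offset
expires, and the second signal (WS1) fires one more offset later.  The struct
and function names are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a two-stage watchdog; illustrative names only. */
struct wdt_stages {
	uint64_t ws0_tick;	/* first signal: pre-timeout interrupt */
	uint64_t ws1_tick;	/* second signal: reset */
};

/*
 * Assumed behaviour: a refresh at "now" arms the compare value at
 * now + offset.  When it expires, WS0 is raised and the same offset is
 * armed again, so WS1 follows exactly one offset after WS0.
 */
static inline struct wdt_stages wdt_stages_after_refresh(uint64_t now,
							 uint32_t offset)
{
	struct wdt_stages s;

	assert(offset > 0);	/* a zero offset would fire immediately */
	s.ws0_tick = now + offset;
	s.ws1_tick = s.ws0_tick + offset;
	return s;
}
```

Under that assumption the two intervals cannot be chosen independently unless
something rewrites the compare value between the two stages, which is the job
the quoted interrupt handler takes on.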

>> And finally, a simpler, single stage timeout watchdog driver would be a
>> reasonable thing to accept, yes?  I can see where that would make sense.
> 
> I would be okay with merging such a driver, and then enhancing it later to add
> pre-timeout support.
> 
>> The issue for me in that case is that the SBSA requires a two stage timeout,
>> so a single stage driver has no real value for me.
> 
> There are plenty of existing watchdog devices that have a two-stage timeout but
> the driver treats it as a single stage.  The PowerPC watchdog driver is like
> that.  The hardware is programmed for the second stage to cause a hardware
> reset, and the interrupt handler is typically a no-op or just a printk().
> 

Hrm.  Thanks for the pointer.  I _think_ I see a way to do that with arm64, and
perhaps combine this driver's functionality with what Timur did originally, but
still have it reasonably straightforward.  I need to do the experiments, though,
and see if it actually works first.
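
For reference, the single-stage treatment Timur describes can be pictured with
the sketch below.  It is an assumption-laden model in plain C, not driver code:
the helper names are hypothetical, and it assumes the second stage fires one
programmed offset after the first.  The point it illustrates is that when the
handler does nothing, the reset time does not depend on whether the interrupt
was ever delivered.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: the offset to program so that the second stage
 * (the reset) lands timeout_s seconds after the last refresh.  Integer
 * division rounds down for odd tick counts.
 */
static inline uint64_t single_stage_offset_ticks(uint32_t timeout_s,
						 uint32_t clk_hz)
{
	return (uint64_t)timeout_s * clk_hz / 2;
}

/*
 * Ticks from the last refresh until the reset.  irq_delivered is ignored
 * on purpose: with a no-op handler the hardware reaches the second stage
 * by itself, even during a hard lockup.
 */
static inline uint64_t reset_after_ticks(uint64_t offset_ticks,
					 bool irq_delivered)
{
	(void)irq_delivered;
	return 2 * offset_ticks;
}
```

With a 10 second timeout and a 100 Hz clock the programmed offset is 500 ticks,
and the reset arrives after 1000 ticks either way.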

-- 
ciao,
al
-----------------------------------
Al Stone
Software Engineer
Linaro Enterprise Group
al.stone@...aro.org
-----------------------------------
