Message-ID: <CAErSpo51_q9nB+eU8VFmz-LJf16KFeKxKR_TWRWDQ4P9gOnP6g@mail.gmail.com>
Date:	Tue, 12 Jul 2011 11:22:33 -0600
From:	Bjorn Helgaas <bhelgaas@...gle.com>
To:	Pádraig Brady <P@...igbrady.com>
Cc:	Valdis.Kletnieks@...edu, Wim Van Sebroeck <wim@...ana.be>,
	linux-kernel@...r.kernel.org, linux-watchdog@...r.kernel.org
Subject: Re: [PATCH 1/2] watchdog: iTCO_wdt: optionally leave watchdog enabled
 during restart

On Thu, Jul 7, 2011 at 9:53 AM, Pádraig Brady <P@...igbrady.com> wrote:
> On 07/07/11 16:39, Valdis.Kletnieks@...edu wrote:
>> On Wed, 06 Jul 2011 10:09:36 MDT, Bjorn Helgaas said:
>>
>>> If we reboot via BIOS, BIOS should disable the watchdog itself, so this
>>> shouldn't cause unintended resets, even if the user interrupts the boot.
>>
>> Yes, but didn't Linus say something about BIOS code authors being
>> crack-addicted monkeys? :)

I shouldn't have written anything about what BIOS "should" do.  That's
not very useful because, as you suggest, there is room for variation
there.

The risk I was alluding to was this:
  - User boots with "reboot_timeout=X"
  - User reboots normally (non-kexec)
  - BIOS does some reinitialization
  - Machine doesn't autoboot, e.g., because user interrupted boot
  - Watchdog resets machine -- this may be unexpected by the user

On the machines I tested, the unexpected reset doesn't happen because
the BIOS reinit includes disabling the watchdog.  But obviously, that
depends on BIOS details, so there's no guarantee.

I should have just written something along the lines of:

  The reboot_timeout option is intended for kexec reboots, which do not
  involve BIOS.  In this case, the reboot_timeout covers the interval
  between shutdown of the watchdog driver in the old kernel and startup
  of the driver in the new kernel.

  For normal reboots (via the BIOS), the behavior depends on the BIOS
  implementation.  Some BIOSes disable the watchdog timer, so the
  reboot_timeout only covers the interval until the BIOS disable.  Others
  leave the timer running, so the reboot_timeout may cause a reset if the
  machine doesn't autoboot, e.g., if the user interrupts the boot.
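
As a rough illustration of the cases above, here is a small user-space C
model (not driver code).  The names reboot_path, reboot_scenario and
watchdog_resets are made up for this sketch.  It assumes the watchdog is
left armed with reboot_timeout seconds at driver shutdown, and that a
BIOS which disables the timer does so before the timer expires.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; none of them come from the patch. */
enum reboot_path { REBOOT_KEXEC, REBOOT_BIOS };

struct reboot_scenario {
	enum reboot_path path;
	bool bios_disables_wdt;           /* only meaningful for REBOOT_BIOS */
	unsigned int reboot_timeout;      /* seconds left armed at driver shutdown */
	unsigned int secs_until_driver;   /* time until a driver takes over again */
};

/*
 * Returns true if the watchdog would reset the machine before a watchdog
 * driver starts in the next kernel.
 *
 * Simplifying assumption: when the BIOS disables the timer, it does so
 * before reboot_timeout has elapsed.
 */
static bool watchdog_resets(const struct reboot_scenario *s)
{
	/* BIOS reinit turned the timer off, so nothing can fire afterwards. */
	if (s->path == REBOOT_BIOS && s->bios_disables_wdt)
		return false;

	/*
	 * kexec (no BIOS involved), or a BIOS that leaves the timer running:
	 * the timer fires unless a driver takes over in time.
	 */
	return s->secs_until_driver >= s->reboot_timeout;
}
```

For example, a BIOS reboot with bios_disables_wdt == false, a
reboot_timeout of 120 and a user sitting at the boot menu for 600 seconds
comes out as a reset, which is the surprise described above.  The same
scenario with bios_disables_wdt == true does not.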

I think the *option* of using a reboot_timeout is still useful,
especially in clusters of unattended machines where it's expensive to
deal with boot failures.

> Yes as I said in a round about way in another mail,
> one can't depend on that at all.
> Some reset, some don't, some behave weirdly,
> iTCO is unusual as kernel resets early at boot, ...

You mentioned an unexplained iTCO reset in your other mail.  That
sounds like a kernel or iTCO_wdt bug, but I think it's unrelated to
this patch.

> If using this, one would have to set the timeout large enough,
> to encompass a full reboot

Right.  In the case of iTCO, I think the range is up to about 10
minutes, which is enough in my case (things like fsck may take longer,
but that's OK as long as the watchdog driver is built in statically).
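
The "about 10 minutes" figure can be reproduced with a little arithmetic.
The sketch below assumes a TCO v2 style timer with a 10-bit counter and a
0.6 second tick.  Neither number is stated in this thread, so treat both
as assumptions to check against the chipset documentation.  The macro and
function names are made up for this example.

```c
#include <stdbool.h>

/* Assumed hardware parameters; not taken from this thread. */
#define TCO_TICK_TENTHS   6      /* one tick = 0.6 s, in tenths of a second */
#define TCO_V2_MAX_TICKS  0x3ff  /* 10-bit counter: 1023 ticks */

/* Convert seconds to timer ticks, rounding down. */
static unsigned int seconds_to_ticks(unsigned int secs)
{
	return secs * 10 / TCO_TICK_TENTHS;
}

/* Convert timer ticks to whole seconds, rounding down. */
static unsigned int ticks_to_seconds(unsigned int ticks)
{
	return ticks * TCO_TICK_TENTHS / 10;
}

/* True if the requested timeout fits in the assumed 10-bit counter. */
static bool timeout_fits(unsigned int secs)
{
	return seconds_to_ticks(secs) <= TCO_V2_MAX_TICKS;
}
```

Under these assumptions the largest counter value, 1023 ticks, is about
613 seconds.  A 600 second reboot_timeout fits, while a 900 second one
does not.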

Bjorn