Message-ID: <20160115141217.GF5783@n2100.arm.linux.org.uk>
Date:	Fri, 15 Jan 2016 14:12:17 +0000
From:	Russell King - ARM Linux <linux@....linux.org.uk>
To:	Grygorii Strashko <grygorii.strashko@...com>
Cc:	One Thousand Gnomes <gnomes@...rguk.ukuu.org.uk>, nm@...com,
	Keerthy <a0393675@...com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	linux-pm@...r.kernel.org, peterz@...radead.org,
	Keerthy <j-keerthy@...com>, linux-kernel@...r.kernel.org,
	josh@...htriplett.org, edubezval@...il.com, joel@....id.au,
	mpe@...erman.id.au, akpm@...ux-foundation.org,
	linux-omap@...r.kernel.org, dyoung@...hat.com,
	Thomas Gleixner <tglx@...utronix.de>,
	linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH v2] reboot: Backup orderly_poweroff

On Fri, Jan 15, 2016 at 03:29:04PM +0200, Grygorii Strashko wrote:
> It seems ARM doesn't implement an endless loop in machine_power_off(), so
> there is not much chance for the watchdog to fire.
> void machine_power_off(void)
> {
> 	local_irq_disable();
> 	smp_send_stop();
> 
> 	if (pm_power_off)
> 		pm_power_off();
> 
> 	--- endless loop ?
> 	--- or restart ?
> }
> [and even if it were there, 20-30 sec is the usual watchdog timeout,
> and that is enough time to burn the system in case of a thermal
> emergency poweroff :(]

I covered this in my reply to Ingo yesterday.  The result is that a
failed or unimplemented call drops through to do_exit(0) on behalf of
the calling process, terminating that process.  However, as I said
in that same email, I don't think you're getting anywhere near this
code.

> That's true - original log [1] has 
> Nov 30 11:19:22 [    5.942769] thermal thermal_zone3: critical temperature reached(108 C),shutting down
> [...]
> Nov 30 11:19:24 [    7.387900] ahci 4a140000.sata: flags: 64bit ncq sntf stag pm led clo only pmp pio slum part ccc apst 
> Nov 30 11:19:24 INIT: Switching to runlevel: 0
> Nov 30 11:19:24 INIT: Sending processes the TERM signal
> 
> and there is no
> [  220.004522] reboot: Power down

Right, so things are stuck in userspace, which means the system is still
in an active runnable state.

As I mentioned (again) in my email, the issue appears to be that the 'rc'
script is stuck waiting on a FIFO.
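
For readers unfamiliar with why a FIFO can stall a script: opening a FIFO
for reading blocks until some other process opens it for writing.  The
following is a minimal sketch of that behaviour only, not a reproduction
of the reported hang.  Which FIFO the 'rc' script is waiting on is not
known from the log.  The helper name fifo_read_status is made up for
illustration, and timeout(1) from GNU coreutils is assumed to be available.

```shell
#!/bin/sh
# Sketch: a reader on a FIFO with no writer blocks indefinitely.
# fifo_read_status is a hypothetical helper, not part of sysvinit.
fifo_read_status() {
    dir=$(mktemp -d) || return 1
    mkfifo "$dir/fifo"
    status=0
    # No writer ever opens the FIFO, so cat blocks in open().
    # timeout(1) kills it after one second and exits with status 124.
    timeout 1 cat "$dir/fifo" || status=$?
    rm -rf "$dir"
    echo "$status"
}
```

A script in the shutdown sequence that does the same read without a
timeout would simply never return.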

The init daemon is trying to do an orderly shutdown.  As part of that,
it's executing the 'rc' script which, in the systems I've seen, runs
through the scripts in the /etc/rc?.d directory in order; these normally
bring up or take down services and perform other sequenced actions.

If this script hangs (as it seems to be doing) it won't get to running
/sbin/poweroff or similar, and that means machine_power_off() won't be
called.
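
The sequencing described above can be sketched roughly as follows.  This
is a simplified, hypothetical model of an 'rc' runner, not the actual
sysvinit script; the function name run_rc_dir and its arguments are
assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical, simplified model of a sysvinit-style 'rc' runner.
# run_rc_dir DIR ACTION: run each K*/S* script in DIR, in lexical order,
# passing ACTION (for example "stop") as its only argument.
run_rc_dir() {
    dir=$1
    action=$2
    for script in "$dir"/[KS]*; do
        [ -x "$script" ] || continue
        # Each script runs in the foreground.  If one of them blocks,
        # the loop never reaches the scripts that sort after it,
        # including whichever one would finally call poweroff.
        "$script" "$action"
    done
}
```

Because the scripts run strictly one after another, a single blocked
script early in the ordering is enough to keep the later steps from
ever running.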

> In general, this kind of use case can be simulated using SysRq on any arch
> - [3.290034] Freeing unused kernel memory: 492K (c0a67000 - c0ae2000)
>   INIT: version 2.88 booting
>   Starting udev
> ^^ The issue most probably happens while the system is in the process of
> loading modules.
> So, once the module loading process has started, fire SysRq "poweroff(o)"

This suggests it could be a udev issue, but without knowing what's
happening inside sysvinit's scripts, it's hard to know for certain.
Adding some debug to the 'rc' script so that it's possible to see what
it's doing may be a good idea.  Before relying on it, make sure the
modified script works without rebooting or changing the run level, or
have a way of restoring the file if the system fails to boot.  The
simplest approach may be to just add

set -x

towards the top of the file - which will make it very noisy.
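
As a small illustration of what that tracing looks like, here is a
sketch that can be run safely outside of init.  The function name
trace_demo and the placeholder step are invented for the example; the
real 'rc' script's contents will differ.  The trace is written to
stderr, so where it ends up during shutdown depends on how init has set
up the script's output.

```shell
#!/bin/sh
# Sketch: what 'set -x' does to a script's output.
# trace_demo is a hypothetical stand-in for a few lines of an rc script.
trace_demo() {
    (
        set -x            # print each command to stderr before running it
        : starting udev   # placeholder for a real rc step (a no-op here)
        echo done
    )
}
```

Each traced command is printed with a leading "+", so the last traced
line before a hang shows which command the script is stuck in.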

-- 
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
