Date:	Fri, 24 Aug 2012 16:23:39 +0800
From:	Bill Huang <bilhuang@...dia.com>
To:	"'linux-tegra@...r.kernel.org'" <linux-tegra@...r.kernel.org>
CC:	"'linux-arm-kernel@...ts.infradead.org'" 
	<linux-arm-kernel@...ts.infradead.org>,
	"'linux-kernel@...r.kernel.org'" <linux-kernel@...r.kernel.org>
Subject: Shutdown problem in SMP system happened on Tegra20

Hi,

When shutting down Tegra20/Tegra30, we need to read and write PMIC registers over I2C
to perform the power-off sequence. Unfortunately, shutdown sometimes fails on Tegra20
because of an I2C timeout. The cause is that the CPU the I2C controller IRQ is affined
to can be taken offline without the IRQs affined to it being migrated, so the following
I2C transactions fail (no CPU handles that interrupt from then on).

Here are the relevant snippets of the shutdown code:

void kernel_power_off(void)
{
	kernel_shutdown_prepare(SYSTEM_POWER_OFF);
	/* ... */
	disable_nonboot_cpus();
	/* ... */
	machine_power_off();
}

void machine_power_off(void)
{
	machine_shutdown();
	if (pm_power_off)
		pm_power_off(); /* this is where we send I2C write to shutdown */
}

void machine_shutdown(void)
{
#ifdef CONFIG_SMP
	smp_send_stop();
#endif
}

"smp_send_stop()" sends "IPI_CPU_STOP" to take every CPU except the current one
(smp_processor_id()) offline. However, the current CPU is not always cpu0, at least
on Tegra20. For example, cpu1 might be the current CPU, in which case cpu0 is taken
offline, and that is why the I2C transaction times out.
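
To make the failure concrete, here is a minimal userspace model of it. This is not
kernel code: "model_smp_send_stop()", "model_irq_serviceable()" and "MODEL_NR_CPUS"
are hypothetical names made up for illustration, and the model assumes the I2C IRQ
stays affined to a single CPU and is never migrated.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 2u	/* assumption: dual-core, as on Tegra20 */

/*
 * Toy stand-in for smp_send_stop(): every online CPU except the caller
 * goes offline. Bit n of the mask represents cpu n. Returns the new mask.
 */
unsigned int model_smp_send_stop(unsigned int online_mask,
				 unsigned int current_cpu)
{
	assert(current_cpu < MODEL_NR_CPUS);
	return online_mask & (1u << current_cpu);
}

/* An IRQ can only be serviced while the CPU it is affined to is online. */
bool model_irq_serviceable(unsigned int irq_cpu, unsigned int online_mask)
{
	assert(irq_cpu < MODEL_NR_CPUS);
	return (online_mask & (1u << irq_cpu)) != 0;
}
```

With both CPUs online (mask 0x3) and the I2C IRQ affined to cpu0, calling
model_smp_send_stop(0x3, 0) leaves the IRQ serviceable, while calling
model_smp_send_stop(0x3, 1) does not, which matches the timeout we see when cpu1
happens to be the current CPU.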

In the normal case, the call to "disable_nonboot_cpus()" takes all CPUs except cpu0
offline, so we won't hit the problem described here: cpu0 will always be the current
CPU in "smp_send_stop()". However, "disable_nonboot_cpus()" only does this when
"CONFIG_PM_SLEEP_SMP" is enabled, which is not the case for Tegra20/Tegra30. We
don't support suspend yet, so it can't be enabled.

There are two known fixes for this. The first is to enable suspend
(ARCH_SUSPEND_POSSIBLE) so that cpu0 is the only online CPU by the time
"machine_shutdown()" runs. The second is to add a call to "migrate_irqs()" in
"ipi_cpu_stop()" so that all IRQs are migrated to the CPU that stays active.
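
The idea behind the second fix can be sketched in userspace as follows. This is only
a model, not the kernel's "migrate_irqs()": the name "model_migrate_irqs()", its
parameters, and the "pick the lowest-numbered remaining CPU" policy are all
assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Re-target every IRQ affined to dying_cpu onto the lowest-numbered CPU
 * that stays online. irq_cpu[i] holds the CPU that IRQ i is affined to;
 * bit n of online_mask represents cpu n. Returns the number of IRQs moved.
 */
unsigned int model_migrate_irqs(unsigned int *irq_cpu, size_t nr_irqs,
				unsigned int dying_cpu,
				unsigned int online_mask)
{
	unsigned int remaining, target = 0, moved = 0;
	size_t i;

	assert(dying_cpu < 32);
	remaining = online_mask & ~(1u << dying_cpu);
	assert(remaining != 0);	/* at least one CPU must survive */

	/* Find the first CPU that is still online. */
	while (!(remaining & (1u << target)))
		target++;

	for (i = 0; i < nr_irqs; i++) {
		if (irq_cpu[i] == dying_cpu) {
			irq_cpu[i] = target;
			moved++;
		}
	}
	return moved;
}
```

In this model, if cpu0 is the one being stopped and the I2C IRQ is affined to it, the
IRQ ends up on cpu1 before cpu0 goes away, so the later PMIC write still has a CPU
to take its interrupt.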

Could someone familiar with the ARM SMP design help to answer my two questions?

1. Does it make sense that "smp_processor_id()" can be something other than cpu0
   when "smp_send_stop()" is called? On Tegra30 it is always cpu0, but on Tegra20
   it is about 50-50, and I just can't find what makes the difference.

2. If the current CPU is not necessarily cpu0 when "smp_send_stop()" is called,
   does it make sense to add "migrate_irqs()" to "ipi_cpu_stop()"? Or is there
   another fix that makes more sense?

Thanks,
Bill
nvpublic