linux-kernel - Re: [PATCH 1/3] ARM: rockchip: fix the CPU soft reset

Message-ID: <55716512.9050603@rock-chips.com>
Date:	Fri, 05 Jun 2015 17:00:02 +0800
From:	Caesar Wang <wxt@...k-chips.com>
To:	Heiko Stübner <heiko@...ech.de>
CC:	dianders@...omium.org, Dmitry Torokhov <dmitry.torokhov@...il.com>,
	linux-rockchip@...ts.infradead.org,
	Russell King <linux@....linux.org.uk>,
	linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/3] ARM: rockchip: fix the CPU soft reset

Heiko,

On 2015-06-05 16:45, Heiko Stübner wrote:
> Hi Caesar,
>
>
> thanks for investigating this.
>
> On Friday, 5 June 2015, 12:47:55, Caesar Wang wrote:
>> In general, the correct flow is:
>>
>> cpu off:
>>      reset_control_assert
>>      regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), BIT(pd))
>>
>> cpu on:
>>      regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), 0)
>>      reset_control_deassert
>>
>> You can reproduce it by repeatedly bringing the CPUs up and down,
>> for example with this test script:
>>
>>      cd /sys/devices/system/cpu/
>>      for i in $(seq 1000); do
>>      echo "================= $i ============"
>>          for j in $(seq 100); do
>>              while [[ "$(cat cpu1/online)$(cat cpu2/online)$(cat cpu3/online)" != "000" ]]; do
>>                  echo 0 > cpu1/online
>>                  echo 0 > cpu2/online
>>                  echo 0 > cpu3/online
>>              done
>>              while [[ "$(cat cpu1/online)$(cat cpu2/online)$(cat cpu3/online)" != "111" ]]; do
>>                  echo 1 > cpu1/online
>>                  echo 1 > cpu2/online
>>                  echo 1 > cpu3/online
>>              done
>>          done
>>      done
>>
>> The following is the log from a reproduction:
>> [34466.186812] PM: noirq suspend of devices complete after 0.669 msecs
>> [34466.186824] Disabling non-boot CPUs ...
>> [34466.187509] CPU1: shutdown
>> [34466.188672] CPU2: shutdown
>> [34473.736627] Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 0
>> [34473.736646] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 3.14.0 #1
>> [34473.736687] [<c010dfb0>] (unwind_backtrace) from [<c010a460>] (show_stack+0x20/0x24)
>> [34473.736711] [<c010a460>] (show_stack) from [<c0630670>] (dump_stack+0x70/0x8c)
>> [34473.736731] [<c0630670>] (dump_stack) from [<c062fa0c>] (panic+0xa8/0x1fc)
>> [34473.736754] [<c062fa0c>] (panic) from [<c0191efc>] (watchdog_timer_fn+0x234/0x26c)
>> [34473.736777] [<c0191efc>] (watchdog_timer_fn) from [<c014103c>] (__run_hrtimer+0x118/0x1e0)
>> [34473.736797] [<c014103c>] (__run_hrtimer) from [<c0141c04>] (hrtimer_interrupt+0x148/0x2a0)
>> [34473.736820] [<c0141c04>] (hrtimer_interrupt) from [<c04d9994>] (arch_timer_handler_phys+0x38/0x48)
>> [34473.736844] [<c04d9994>] (arch_timer_handler_phys) from [<c016acc4>] (handle_percpu_devid_irq+0xb8/0x124)
>> [34473.736867] [<c016acc4>] (handle_percpu_devid_irq) from [<c0166a30>] (generic_handle_irq+0x30/0x40)
>> [34473.736887] [<c0166a30>] (generic_handle_irq) from [<c0166d28>] (__handle_domain_irq+0x8c/0xb0)
>> [34473.736905] [<c0166d28>] (__handle_domain_irq) from [<c01003a0>] (gic_handle_irq+0x48/0x6c)
>> [34473.736922] [<c01003a0>] (gic_handle_irq) from [<c010b040>] (__irq_svc+0x40/0x50)
>> [34473.736936] Exception stack(0xee127f70 to 0xee127fb8)
>> [34473.736948] 7f60:                                     ffffffed 00000000 2dd6d000 00000000
>> [34473.736964] 7f80: ee126000 00000015 c0b46bac c0b46bac 0000406a 410fc0d1 00000000 ee127fc4
>> [34473.736979] 7fa0: ee127fb8 ee127fb8 c0107038 c010703c 600f0013 ffffffff
>> [34473.736995] [<c010b040>] (__irq_svc) from [<c010703c>] (arch_cpu_idle+0x40/0x48)
>> [34473.737013] [<c010703c>] (arch_cpu_idle) from [<c0166978>] (cpu_startup_entry+0x170/0x1d0)
>> [34473.737031] [<c0166978>] (cpu_startup_entry) from [<c010c254>] (secondary_start_kernel+0x138/0x160)
>> [34473.737059] [<c010c254>] (secondary_start_kernel) from [<00100464>] (0x100464)
>> [34474.903740] SMP: failed to stop secondary CPUs
>> [34476.099964] SMP: failed to stop secondary CPUs
>> ...
>>
>> Signed-off-by: Caesar Wang <wxt@...k-chips.com>
>> ---
>>
>>   arch/arm/mach-rockchip/platsmp.c | 13 +++++--------
>>   1 file changed, 5 insertions(+), 8 deletions(-)
>>
>> diff --git a/arch/arm/mach-rockchip/platsmp.c b/arch/arm/mach-rockchip/platsmp.c
>> index 5b4ca3c..1230d3d 100644
>> --- a/arch/arm/mach-rockchip/platsmp.c
>> +++ b/arch/arm/mach-rockchip/platsmp.c
>> @@ -88,20 +88,17 @@ static int pmu_set_power_domain(int pd, bool on)
>>   			return PTR_ERR(rstc);
>>   		}
>>
>> -		if (on)
>> +		if (on) {
>> +			regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
>>   			reset_control_deassert(rstc);
>> -		else
>> +		} else {
>>   			reset_control_assert(rstc);
>> +			regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
>> +		}
> you're losing the return value of regmap_update_bits here; I guess it should
> look like below?
>
> 		if (on) {
> 			ret = regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);

I searched the kernel, and nobody else seems to check that return value.
Do we need to?
I will fix it in the next patch if it is needed.
> 			if (ret < 0) {
> 				pr_err("%s: could not update power domain\n", __func__);
> 		  		reset_control_put(rstc);
> 				return ret;
> 			}
>
> 			reset_control_deassert(rstc);
> 		} else {
> 			reset_control_assert(rstc);
>
> 			ret = regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
> 			if (ret < 0) {
> 				pr_err("%s: could not update power domain\n", __func__);
> 		  		reset_control_put(rstc);
> 				return ret;
> 			}
> 		}
>
>
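To make the point about the lost return value concrete, here is a minimal user-space sketch of the pattern suggested above. It is not the kernel code: `fake_regmap_update_bits`, `set_power_domain`, `fake_regmap_fail`, `reset_asserted` and `rstc_refs` are hypothetical stand-ins invented for illustration, and the failure is simulated with `-EIO`. It only shows that checking the return value in both branches keeps the reset asserted and still drops the reset-control reference when the register write fails.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel objects; none of this is the real API. */
static int fake_regmap_fail;   /* non-zero makes the fake register write fail */
static int reset_asserted = 1; /* 1 = core is held in reset */
static int rstc_refs;          /* outstanding reset-control references */

/* Models regmap_update_bits(): 0 on success, negative errno on failure. */
static int fake_regmap_update_bits(unsigned int mask, unsigned int val)
{
	(void)mask;
	(void)val;
	return fake_regmap_fail ? -EIO : 0;
}

/* Models pmu_set_power_domain() with the return value checked in both branches. */
static int set_power_domain(int pd, int on)
{
	unsigned int val = on ? 0 : (1u << pd);
	int ret;

	assert(pd >= 0 && pd < 32);
	rstc_refs++;                /* models reset_control_get() */

	if (!on)
		reset_asserted = 1; /* power off: assert reset first */

	ret = fake_regmap_update_bits(1u << pd, val);
	if (ret < 0) {
		fprintf(stderr, "%s: could not update power domain\n", __func__);
		rstc_refs--;        /* models reset_control_put() on the error path */
		return ret;
	}

	if (on)
		reset_asserted = 0; /* power on: release reset only after the write succeeded */

	rstc_refs--;                /* models reset_control_put() */
	return 0;
}
```

With `fake_regmap_fail` set, `set_power_domain(1, 1)` returns the error, leaves the simulated core in reset, and leaves `rstc_refs` at zero, which is the behaviour the error path in the suggestion is meant to guarantee.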
>>   		reset_control_put(rstc);
>>   	}
>>
>> -	ret = regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
>> -	if (ret < 0) {
>> -		pr_err("%s: could not update power domain\n", __func__);
>> -		return ret;
>> -	}
>> -
>>   	ret = -1;
>>   	while (ret != on) {
>>   		ret = pmu_power_domain_is_on(pd);
> Second question: with this patch, what actually happens is
>
>   cpu off:
>       reset_control_assert
>       regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), BIT(pd))
>       wait_for_power_domain_to_turn_off
>   
>   cpu on:
>       regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), 0)
>       reset_control_deassert
>       wait_for_power_domain_to_turn_on
>
> So shouldn't the deassertion of the reset happen after the power domain
> has successfully turned on? Like:
>
>   cpu on:
>       regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), 0)
>       wait_for_power_domain_to_turn_on
>       reset_control_deassert

You are right.
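
As a rough illustration of the agreed ordering, here is a small user-space simulation. It is only a sketch under stated assumptions: `struct sim_cpu`, the `sim_*` helpers and `PD_SETTLE_POLLS` are hypothetical names made up for this example, and the "power domain" simply flips its status after a fixed number of polls. The flag `deassert_while_off` records whether reset was ever released while the domain was still unpowered.

```c
#include <assert.h>
#include <stdbool.h>

enum { PD_SETTLE_POLLS = 3 }; /* hypothetical: polls before the status follows the request */

struct sim_cpu {
	bool pwrdn_req;          /* models the PMU_PWRDN_CON bit: true = power-down requested */
	bool powered;            /* models the power domain status */
	bool in_reset;           /* models the reset line */
	int settle;              /* remaining polls before the status changes */
	bool deassert_while_off; /* set if reset was released before power was up */
};

/* Models the regmap_update_bits() write: request power on or off. */
static void sim_request_power(struct sim_cpu *c, bool on)
{
	c->pwrdn_req = !on;
	c->settle = PD_SETTLE_POLLS;
}

/* Models polling the power domain status; it lags behind the request. */
static bool sim_power_is_on(struct sim_cpu *c)
{
	if (c->settle > 0 && --c->settle == 0)
		c->powered = !c->pwrdn_req;
	return c->powered;
}

/* Models reset_control_assert()/deassert() and flags a bad ordering. */
static void sim_reset(struct sim_cpu *c, bool assert_it)
{
	if (!assert_it && !c->powered)
		c->deassert_while_off = true;
	c->in_reset = assert_it;
}

/* cpu on: request power, wait for the domain, then release reset. */
static void sim_cpu_on(struct sim_cpu *c)
{
	sim_request_power(c, true);
	while (!sim_power_is_on(c))
		; /* busy-wait; the simulated status always settles */
	sim_reset(c, false);
}

/* cpu off: assert reset, request power-down, wait for the domain. */
static void sim_cpu_off(struct sim_cpu *c)
{
	sim_reset(c, true);
	sim_request_power(c, false);
	while (sim_power_is_on(c))
		;
}
```

Starting from a powered-down core held in reset, `sim_cpu_on()` never sets `deassert_while_off`. Moving the `sim_reset(c, false)` call above the wait loop would set it, which corresponds to the ordering problem raised in the review.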
>
>
>

-- 
Thanks,
- Caesar

