linux-kernel - Re: [PATCH 1/3] ARM: rockchip: fix the CPU soft reset

Date:	Fri, 05 Jun 2015 11:46:21 +0200
From:	Heiko Stübner <heiko@...ech.de>
To:	Caesar Wang <wxt@...k-chips.com>
Cc:	dianders@...omium.org, Dmitry Torokhov <dmitry.torokhov@...il.com>,
	linux-rockchip@...ts.infradead.org,
	Russell King <linux@....linux.org.uk>,
	linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/3] ARM: rockchip: fix the CPU soft reset

On Friday, 5 June 2015, 17:00:02, Caesar Wang wrote:
> Heiko,
> 
> On 2015-06-05 16:45, Heiko Stübner wrote:
> > Hi Caesar,
> > 
> > 
> > thanks for investigating this.
> > 
> > On Friday, 5 June 2015, 12:47:55, Caesar Wang wrote:
> >> In general, the correct flow is:
> >> 
> >> cpu off:
> >>      reset_control_assert
> >>      regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), BIT(pd))
> >> 
> >> cpu on:
> >>      regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), 0)
> >>      reset_control_deassert
> >> 
> >> You can reproduce it by repeatedly bringing the CPUs up and down,
> >> e.g. with this test script:
> >> 
> >>      cd /sys/devices/system/cpu/
> >>      for i in $(seq 1000); do
> >>          echo "================= $i ============"
> >>          for j in $(seq 100); do
> >>              while [[ "$(cat cpu1/online)$(cat cpu2/online)$(cat cpu3/online)" != "000" ]]; do
> >>                  echo 0 > cpu1/online
> >>                  echo 0 > cpu2/online
> >>                  echo 0 > cpu3/online
> >>              done
> >>              while [[ "$(cat cpu1/online)$(cat cpu2/online)$(cat cpu3/online)" != "111" ]]; do
> >>                  echo 1 > cpu1/online
> >>                  echo 1 > cpu2/online
> >>                  echo 1 > cpu3/online
> >>              done
> >>          done
> >>      done
> >> 
> >> The following log reproduces the problem:
> >> [34466.186812] PM: noirq suspend of devices complete after 0.669 msecs
> >> [34466.186824] Disabling non-boot CPUs ...
> >> [34466.187509] CPU1: shutdown
> >> [34466.188672] CPU2: shutdown
> >> [34473.736627] Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 0
> >> [34473.736646] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 3.14.0 #1
> >> [34473.736687] [<c010dfb0>] (unwind_backtrace) from [<c010a460>] (show_stack+0x20/0x24)
> >> [34473.736711] [<c010a460>] (show_stack) from [<c0630670>] (dump_stack+0x70/0x8c)
> >> [34473.736731] [<c0630670>] (dump_stack) from [<c062fa0c>] (panic+0xa8/0x1fc)
> >> [34473.736754] [<c062fa0c>] (panic) from [<c0191efc>] (watchdog_timer_fn+0x234/0x26c)
> >> [34473.736777] [<c0191efc>] (watchdog_timer_fn) from [<c014103c>] (__run_hrtimer+0x118/0x1e0)
> >> [34473.736797] [<c014103c>] (__run_hrtimer) from [<c0141c04>] (hrtimer_interrupt+0x148/0x2a0)
> >> [34473.736820] [<c0141c04>] (hrtimer_interrupt) from [<c04d9994>] (arch_timer_handler_phys+0x38/0x48)
> >> [34473.736844] [<c04d9994>] (arch_timer_handler_phys) from [<c016acc4>] (handle_percpu_devid_irq+0xb8/0x124)
> >> [34473.736867] [<c016acc4>] (handle_percpu_devid_irq) from [<c0166a30>] (generic_handle_irq+0x30/0x40)
> >> [34473.736887] [<c0166a30>] (generic_handle_irq) from [<c0166d28>] (__handle_domain_irq+0x8c/0xb0)
> >> [34473.736905] [<c0166d28>] (__handle_domain_irq) from [<c01003a0>] (gic_handle_irq+0x48/0x6c)
> >> [34473.736922] [<c01003a0>] (gic_handle_irq) from [<c010b040>] (__irq_svc+0x40/0x50)
> >> [34473.736936] Exception stack(0xee127f70 to 0xee127fb8)
> >> [34473.736948] 7f60:                                     ffffffed 00000000 2dd6d000 00000000
> >> [34473.736964] 7f80: ee126000 00000015 c0b46bac c0b46bac 0000406a 410fc0d1 00000000 ee127fc4
> >> [34473.736979] 7fa0: ee127fb8 ee127fb8 c0107038 c010703c 600f0013 ffffffff
> >> [34473.736995] [<c010b040>] (__irq_svc) from [<c010703c>] (arch_cpu_idle+0x40/0x48)
> >> [34473.737013] [<c010703c>] (arch_cpu_idle) from [<c0166978>] (cpu_startup_entry+0x170/0x1d0)
> >> [34473.737031] [<c0166978>] (cpu_startup_entry) from [<c010c254>] (secondary_start_kernel+0x138/0x160)
> >> [34473.737059] [<c010c254>] (secondary_start_kernel) from [<00100464>] (0x100464)
> >> [34474.903740] SMP: failed to stop secondary CPUs
> >> [34476.099964] SMP: failed to stop secondary CPUs
> >> ...
> >> 
> >> Signed-off-by: Caesar Wang <wxt@...k-chips.com>
> >> ---
> >> 
> >>   arch/arm/mach-rockchip/platsmp.c | 13 +++++--------
> >>   1 file changed, 5 insertions(+), 8 deletions(-)
> >> 
> >> diff --git a/arch/arm/mach-rockchip/platsmp.c b/arch/arm/mach-rockchip/platsmp.c
> >> index 5b4ca3c..1230d3d 100644
> >> --- a/arch/arm/mach-rockchip/platsmp.c
> >> +++ b/arch/arm/mach-rockchip/platsmp.c
> >> @@ -88,20 +88,17 @@ static int pmu_set_power_domain(int pd, bool on)
> >>   			return PTR_ERR(rstc);
> >>   		}
> >> 
> >> -		if (on)
> >> +		if (on) {
> >> +			regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
> >>   			reset_control_deassert(rstc);
> >> -		else
> >> +		} else {
> >>   			reset_control_assert(rstc);
> >> +			regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
> >> +		}
> > 
> > you're losing the return value of regmap_update_bits here; I guess it
> > should look like below?
> > 
> > 		if (on) {
> > 			ret = regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
> 
> I searched the kernel; nobody checks that return value.

counter-example:
http://lxr.free-electrons.com/source/drivers/phy/phy-exynos5250-sata.c#L76
;-)

> Do we need that?
> I will fix it in the next patch if it's needed.

Yes, while a failure is not very probable, it is still good practice to
handle it. But when changing the logic as we observed below, the regmap
update part can stay as it is now, since you can simply do something like:

if (!on)
	reset_control_assert(rstc);

regmap_update_bits
wait_for_pd

if (on)
	reset_control_deassert(rstc);
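
To spell that ordering out, here is a rough, untested sketch that builds outside the kernel. Everything prefixed stub_ or sketch_, plus the trace buffer and fake_regmap_ret, is a hypothetical stand-in for illustration only; the real code would call reset_control_assert/deassert, regmap_update_bits and the existing power-domain polling loop, and would still need the reset_control_get/put handling that is left out here.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins so the sketch builds outside the kernel. */
static char trace[128];       /* records the order of operations */
static bool domain_on = true; /* simulated power-domain state */
static int fake_regmap_ret;   /* set negative to simulate a regmap error */

static void log_step(const char *s)
{
	strncat(trace, s, sizeof(trace) - strlen(trace) - 1);
}

/* Stand-ins for reset_control_assert() / reset_control_deassert(). */
static void stub_reset_assert(void)   { log_step("assert;"); }
static void stub_reset_deassert(void) { log_step("deassert;"); }

/* Stand-in for regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val). */
static int stub_regmap_update_bits(bool on)
{
	if (fake_regmap_ret < 0)
		return fake_regmap_ret;
	log_step(on ? "pd_on;" : "pd_off;");
	domain_on = on;
	return 0;
}

/* Stand-in for the loop that polls pmu_power_domain_is_on(pd). */
static int stub_wait_for_pd(bool on)
{
	log_step("wait;");
	return domain_on == on ? 0 : -1;
}

/*
 * Ordering under discussion:
 *   off: assert reset -> power domain down -> wait
 *   on:  power domain up -> wait -> deassert reset
 */
static int sketch_set_power_domain(bool on)
{
	int ret;

	if (!on)
		stub_reset_assert();

	ret = stub_regmap_update_bits(on);
	if (ret < 0) {
		fprintf(stderr, "%s: could not update power domain\n", __func__);
		return ret;
	}

	ret = stub_wait_for_pd(on);
	if (ret < 0)
		return ret;

	if (on)
		stub_reset_deassert();

	return 0;
}
```

With a single regmap update in the middle, the return value only has to be checked in one place, and the reset is released only after the power domain has been confirmed on.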


Heiko

> > 			if (ret < 0) {
> > 				pr_err("%s: could not update power domain\n", __func__);
> > 				reset_control_put(rstc);
> > 				return ret;
> > 			}
> > 
> > 			reset_control_deassert(rstc);
> > 		} else {
> > 			reset_control_assert(rstc);
> > 
> > 			ret = regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
> > 			if (ret < 0) {
> > 				pr_err("%s: could not update power domain\n", __func__);
> > 				reset_control_put(rstc);
> > 				return ret;
> > 			}
> > 		}
> > 		
> >>   		reset_control_put(rstc);
> >>   	}
> >> 
> >> -	ret = regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
> >> -	if (ret < 0) {
> >> -		pr_err("%s: could not update power domain\n", __func__);
> >> -		return ret;
> >> -	}
> >> -
> >>   	ret = -1;
> >>   	while (ret != on) {
> >>   		ret = pmu_power_domain_is_on(pd);
> > 
> > second question - with this patch, what actually happens is
> > 
> >   cpu off:
> >       reset_control_assert
> >       regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), BIT(pd))
> >       wait_for_power_domain_to_turn_off
> >   
> >   cpu on:
> >       regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), 0)
> >       reset_control_deassert
> >       wait_for_power_domain_to_turn_on
> > 
> > So shouldn't the deassertion of the reset happen after the power domain
> > has successfully turned on? Like
> > 
> >   cpu on:
> >       regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), 0)
> >       wait_for_power_domain_to_turn_on
> >       reset_control_deassert
> 
> You are right.
