lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:	Fri, 05 Jun 2015 11:46:21 +0200
From:	Heiko Stübner <heiko@...ech.de>
To:	Caesar Wang <wxt@...k-chips.com>
Cc:	dianders@...omium.org, Dmitry Torokhov <dmitry.torokhov@...il.com>,
	linux-rockchip@...ts.infradead.org,
	Russell King <linux@....linux.org.uk>,
	linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/3] ARM: rockchip: fix the CPU soft reset

Am Freitag, 5. Juni 2015, 17:00:02 schrieb Caesar Wang:
> Heiko,
> 
> 在 2015年06月05日 16:45, Heiko Stübner 写道:
> > Hi Caesar,
> > 
> > 
> > thanks for investigating this.
> > 
> > Am Freitag, 5. Juni 2015, 12:47:55 schrieb Caesar Wang:
> >> In general, the correct flow is:
> >> 
> >> cpu off:
> >>      reset_control_assert
> >>      regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), BIT(pd))
> >> 
> >> cpu on:
> >>      regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), 0)
> >>      reset_control_deassert
> >> 
> >> You can repro it with bringing CPU up and down.
> >> Says:(test scripts)
> >> 
> >>      cd /sys/devices/system/cpu/
> >>      for i in $(seq 1000); do
> >>      echo "================= $i ============"
> >>      
> >>          for j in $(seq 100); do
> >>          
> >>              while [[ "$(cat cpu1/online)$(cat cpu2/online)$(cat
> >> 
> >> cpu3/online)" != "000" ]]; do echo 0 > cpu1/online
> >> 
> >>                  echo 0 > cpu2/online
> >>                  echo 0 > cpu3/online
> >>              
> >>              done
> >>              while [[ "$(cat cpu1/online)$(cat cpu2/online)$(cat
> >> 
> >> cpu3/online)" != "111" ]]; do echo 1 > cpu1/online
> >> 
> >>                  echo 1 > cpu2/online
> >>                  echo 1 > cpu3/online
> >>              
> >>              done
> >>          
> >>          done
> >>      
> >>      done
> >> 
> >> The following is reproducile log:
> >> [34466.186812] PM: noirq suspend of devices complete after 0.669 msecs
> >> [34466.186824] Disabling non-boot CPUs ...
> >> [34466.187509] CPU1: shutdown
> >> [34466.188672] CPU2: shutdown
> >> [34473.736627] Kernel panic - not syncing: Watchdog detected hard LOCKUP
> >> on
> >> cpu 0 [34473.736646] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 3.14.0 #1
> >> [34473.736687] [<c010dfb0>] (unwind_backtrace) from [<c010a460>]
> >> (show_stack+0x20/0x24) [34473.736711] [<c010a460>] (show_stack) from
> >> [<c0630670>] (dump_stack+0x70/0x8c) [34473.736731] [<c0630670>]
> >> (dump_stack) from [<c062fa0c>] (panic+0xa8/0x1fc) [34473.736754]
> >> [<c062fa0c>] (panic) from [<c0191efc>] (watchdog_timer_fn+0x234/0x26c)
> >> [34473.736777] [<c0191efc>] (watchdog_timer_fn) from [<c014103c>]
> >> (__run_hrtimer+0x118/0x1e0) [34473.736797] [<c014103c>] (__run_hrtimer)
> >> from [<c0141c04>] (hrtimer_interrupt+0x148/0x2a0) [34473.736820]
> >> [<c0141c04>] (hrtimer_interrupt) from [<c04d9994>]
> >> (arch_timer_handler_phys+0x38/0x48) [34473.736844] [<c04d9994>]
> >> (arch_timer_handler_phys) from [<c016acc4>]
> >> (handle_percpu_devid_irq+0xb8/0x124) [34473.736867] [<c016acc4>]
> >> (handle_percpu_devid_irq) from [<c0166a30>]
> >> (generic_handle_irq+0x30/0x40)
> >> [34473.736887] [<c0166a30>] (generic_handle_irq) from [<c0166d28>]
> >> (__handle_domain_irq+0x8c/0xb0) [34473.736905] [<c0166d28>]
> >> (__handle_domain_irq) from [<c01003a0>] (gic_handle_irq+0x48/0x6c)
> >> [34473.736922] [<c01003a0>] (gic_handle_irq) from [<c010b040>]
> >> (__irq_svc+0x40/0x50) [34473.736936] Exception stack(0xee127f70 to
> >> 0xee127fb8)
> >> [34473.736948] 7f60:                                     ffffffed
> >> 00000000
> >> 2dd6d000 00000000 [34473.736964] 7f80: ee126000 00000015 c0b46bac
> >> c0b46bac
> >> 0000406a 410fc0d1 00000000 ee127fc4 [34473.736979] 7fa0: ee127fb8
> >> ee127fb8
> >> c0107038 c010703c 600f0013 ffffffff [34473.736995] [<c010b040>]
> >> (__irq_svc)
> >> from [<c010703c>] (arch_cpu_idle+0x40/0x48) [34473.737013] [<c010703c>]
> >> (arch_cpu_idle) from [<c0166978>] (cpu_startup_entry+0x170/0x1d0)
> >> [34473.737031] [<c0166978>] (cpu_startup_entry) from [<c010c254>]
> >> (secondary_start_kernel+0x138/0x160) [34473.737059] [<c010c254>]
> >> (secondary_start_kernel) from [<00100464>] (0x100464) [34474.903740] SMP:
> >> failed to stop secondary CPUs
> >> [34476.099964] SMP: failed to stop secondary CPUs
> >> ...
> >> 
> >> Signed-off-by: Caesar Wang <wxt@...k-chips.com>
> >> ---
> >> 
> >>   arch/arm/mach-rockchip/platsmp.c | 13 +++++--------
> >>   1 file changed, 5 insertions(+), 8 deletions(-)
> >> 
> >> diff --git a/arch/arm/mach-rockchip/platsmp.c
> >> b/arch/arm/mach-rockchip/platsmp.c index 5b4ca3c..1230d3d 100644
> >> --- a/arch/arm/mach-rockchip/platsmp.c
> >> +++ b/arch/arm/mach-rockchip/platsmp.c
> >> @@ -88,20 +88,17 @@ static int pmu_set_power_domain(int pd, bool on)
> >> 
> >>   			return PTR_ERR(rstc);
> >>   		
> >>   		}
> >> 
> >> -		if (on)
> >> +		if (on) {
> >> +			regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
> >> 
> >>   			reset_control_deassert(rstc);
> >> 
> >> -		else
> >> +		} else {
> >> 
> >>   			reset_control_assert(rstc);
> >> 
> >> +			regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
> >> +		}
> > 
> > you're loosing the return value of regmap_update_bits here, I guess it
> > should look like below?
> > 
> > 		if (on) {
> > 		
> > 			ret = regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
> 
> I search the kernel, Nobody do that return value.

counter-example:
http://lxr.free-electrons.com/source/drivers/phy/phy-exynos5250-sata.c#L76
;-)

> Need we are that?
> I will fix it on next patch if it's need.

yes, while not very probable, it is nevertheless still good practice to handle 
it. But when chaning the logic as we observed below, the regmap update part 
can already stay how it is now, as you can simply do something like:

if (!on)
	reset_control_assert(rstc);

regmap_update_bits
wait_for_pd

if (on)
	reset_control_deassert(rstc);


Heiko

> > 			if (ret < 0) {
> > 			
> > 				pr_err("%s: could not update power domain\n", __func__);
> > 				
> > 		  		reset_control_put(rstc);
> > 				
> > 				return ret;
> > 			
> > 			}
> > 			
> > 			reset_control_deassert(rstc);
> > 		
> > 		} else {
> > 		
> > 			reset_control_assert(rstc);
> > 			
> > 			ret = regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
> > 			if (ret < 0) {
> > 			
> > 				pr_err("%s: could not update power domain\n", __func__);
> > 				
> > 		  		reset_control_put(rstc);
> > 				
> > 				return ret;
> > 			
> > 			}
> > 		
> > 		}
> > 		
> >>   		reset_control_put(rstc);
> >>   	
> >>   	}
> >> 
> >> -	ret = regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
> >> -	if (ret < 0) {
> >> -		pr_err("%s: could not update power domain\n", __func__);
> >> -		return ret;
> >> -	}
> >> -
> >> 
> >>   	ret = -1;
> >>   	while (ret != on) {
> >>   	
> >>   		ret = pmu_power_domain_is_on(pd);
> > 
> > second question - with this patch, what happens actually is
> > 
> >   cpu off:
> >       reset_control_assert
> >       regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), BIT(pd))
> >       wait_for_power_domain_to_turn_off
> >   
> >   cpu on:
> >       regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), 0)
> >       reset_control_deassert
> >       wait_for_power_domain_to_turn_on
> > 
> > So shouldn't the deassertion of the reset happen after the powerdomain
> > sucessfull turned on? Like
> > 
> >   cpu on:
> >       regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), 0)
> >       wait_for_power_domain_to_turn_on
> >       reset_control_deassert
> 
> You are right.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ