Date:	Thu, 24 Mar 2016 21:18:53 +0800
From:	Jisheng Zhang <jszhang@...vell.com>
To:	Will Deacon <will.deacon@....com>, <catalin.marinas@....com>,
	<lorenzo.pieralisi@....com>, <daniel.lezcano@...aro.org>
CC:	<linux-arm-kernel@...ts.infradead.org>,
	<linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 0/2] arm64: cpuidle: make arm_cpuidle_suspend() more
 efficient

Hi Will,

On Thu, 24 Mar 2016 11:15:07 +0000 Will Deacon wrote:

> On Thu, Mar 24, 2016 at 01:08:48PM +0800, Jisheng Zhang wrote:
> > This series is to improve the arm_cpuidle_suspend() a bit by removing/moving
> > out checks from this hot path.
> > 
> > Jisheng Zhang (2):
> >   arm64: cpuidle: remove cpu_ops check from arm_cpuidle_suspend()
> >   arm64: cpuidle: make arm_cpuidle_suspend() a bit more efficient
> > 
> >  arch/arm64/kernel/cpuidle.c | 9 ++-------
> >  1 file changed, 2 insertions(+), 7 deletions(-)  
> 
> These look fine to me, but do you have any rough numbers showing what
> sort of improvement we get from this change?

Good question. Here it is:

I measured the time from the arm_cpuidle_suspend() entry point to the
cpu_psci_cpu_suspend() entry point, summed over 4096 calls. The HW platform
is a Marvell BG4CT STB board.

1. Only one shell, no other processes, secondary CPUs hot-unplugged, running
the following command:

while true
do
	sleep 0.2
done

before the patch: 1581220ns
after the patch: 1579630ns

reduced by 0.1%

2. Only one shell, no other processes, secondary CPUs hot-unplugged, running
the following command:

while true
do
	md5sum /tmp/testfile
	sleep 0.2
done

NOTE: the testfile should be larger than the combined L1 + L2 cache size.

before the patch: 1961960ns
after the patch: 1912500ns

reduced by 2.5%

So the more complex the system load, the bigger the improvement.
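
The percentages above can be rechecked from the raw figures. The following is
only a minimal sketch, assuming each reported value is a total over 4096
calls; the helper names reduction_pct() and per_call_ns() are made up for
illustration and are not part of the patches.

```python
# Figures reported in this mail, in nanoseconds.
# Assumption: each value is the sum over 4096 measured calls.
SAMPLES = 4096

RESULTS = {
    "sleep only": (1581220, 1579630),
    "md5sum + sleep": (1961960, 1912500),
}


def reduction_pct(before_ns, after_ns):
    """Relative reduction of after_ns versus before_ns, in percent."""
    return (before_ns - after_ns) / before_ns * 100.0


def per_call_ns(total_ns, samples=SAMPLES):
    """Average cost of one call, given a total over `samples` calls."""
    return total_ns / samples


for name, (before, after) in RESULTS.items():
    print(
        f"{name}: {per_call_ns(before):.1f} ns -> "
        f"{per_call_ns(after):.1f} ns per call, "
        f"reduced by {reduction_pct(before, after):.1f}%"
    )
```

Under that assumption the saving is well under 1 ns per call in the first
test and roughly 12 ns per call in the second.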

Thanks,
Jisheng
