lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160518114721.GF3193@twins.programming.kicks-ass.net>
Date:	Wed, 18 May 2016 13:47:21 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	"Gaurav Jindal (Gaurav Jindal)" <Gaurav.Jindal@...eadtrum.com>
Cc:	"mingo@...hat.com" <mingo@...hat.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"Sanjeev Yadav (Sanjeev Kumar Yadav)" <Sanjeev.Yadav@...eadtrum.com>,
	Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [Patch]cpuidle: Save current cpu as local variable instead of
 calling smp_processor_id() in loop


Much better; but the below does not in fact apply.

> ---
> 
> diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
> index 1214f0a..82698e5 100644
> --- a/kernel/sched/idle.c
> +++ b/kernel/sched/idle.c
> @@ -185,6 +185,8 @@ exit_idle:
>   */
>  static void cpu_idle_loop(void)
>  {
> +	int cpu_id;
> 
> +	cpu_id = smp_processor_id();

This hunk assumes there's an empty line between the '{' and 'while
(1)', this isn't actually there.

>  	while (1) {
>  		/*
>  		 * If the arch has a polling bit, we maintain an invariant:
> @@ -202,7 +204,7 @@ static void cpu_idle_loop(void)
>  			check_pgt_cache();
>  			rmb();
> 
> -			if (cpu_is_offline(smp_processor_id()))
> +			if (cpu_is_offline(cpu_id))
>                                 arch_cpu_idle_dead();

And this hunk fails to apply because the recent cpu hotplug work added
another call here.

> 
>  			local_irq_disable();

I've taken your patch and modified it; see below.

Another optimization you could look at is removing that rmb(); I don't
actually think its needed, but you'd need to find why it was added and
then check it was in fact needed back then, and then check if it is in
fact still needed now.

Its a bit of a trek through git history, but lfence / dsb ld are
expensive instructions.

---
Subject: sched,idle: Optimize idle loop
From: "Gaurav Jindal (Gaurav Jindal)" <Gaurav.Jindal@...eadtrum.com>
Date: Thu, 12 May 2016 10:13:33 +0000

Currently, smp_processor_id() is used to fetch the current CPU in
cpu_idle_loop. Every time the idle thread runs, it fetches current CPU
using smp_processor_id().

Since the idle thread is per CPU, the current CPU is constant, so we
can lift the load out of the loop, saving execution cycles/time in the
loop.

x86-64:
Before patch (execution in loop):
	148:    0f ae e8                lfence
	14b:    65 8b 04 25 00 00 00 00 mov %gs:0x0,%eax
	152:    00
	153:    89 c0                   mov %eax,%eax
	155:    49 0f a3 04 24          bt %rax,(%r12)

After patch (execution in loop):
	150:    0f ae e8                lfence
	153:    4d 0f a3 34 24          bt %r14,(%r12)

ARM64:
Before patch (execution in loop):
	168:    d5033d9f        dsb     ld
	16c:    b9405661        ldr     w1,[x19,#84]
	170:    1100fc20        add     w0,w1,#0x3f
	174:    6b1f003f        cmp     w1,wzr
	178:    1a81b000        csel    w0,w0,w1,lt
	17c:    130c7000        asr     w0,w0,#6
	180:    937d7c00        sbfiz   x0,x0,#3,#32
	184:    f8606aa0        ldr     x0,[x21,x0]
	188:    9ac12401        lsr     x1,x0,x1
	18c:    36000e61        tbz     w1,#0,358

After patch (execution in loop):
	1a8:    d50339df        dsb     ld
	1ac:    f8776ac0        ldr     x0,[x22,x23]
	ab0:    ea18001f        tst     x0,x24
	1b4:    54000ea0        b.eq    388

Further observance on ARM64 for 4 seconds shows that cpu_idle_loop is
called 8672 times. Shifting the code will save instructions executed
in loop and eventually time as well.

Cc: "mingo@...hat.com" <mingo@...hat.com>
Signed-off-by: Gaurav Jindal <gaurav.jindal@...eadtrum.com>
Reviewed-by: Sanjeev Yadav <sanjeev.yadav@...eadtrum.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
Link: http://lkml.kernel.org/r/20160512101330.GA488@gauravjindalubtnb.del.spreadtrum.com
---
 kernel/sched/idle.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -201,6 +201,8 @@ static void cpuidle_idle_call(void)
  */
 static void cpu_idle_loop(void)
 {
+	int cpu = smp_processor_id();
+
 	while (1) {
 		/*
 		 * If the arch has a polling bit, we maintain an invariant:
@@ -219,7 +221,7 @@ static void cpu_idle_loop(void)
 			check_pgt_cache();
 			rmb();
 
-			if (cpu_is_offline(smp_processor_id())) {
+			if (cpu_is_offline(cpu)) {
 				cpuhp_report_idle_dead();
 				arch_cpu_idle_dead();
 			}

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ