Date:	Wed, 11 May 2016 09:32:57 +0000
From:	"Gaurav Jindal (Gaurav Jindal)" <Gaurav.Jindal@...eadtrum.com>
To:	"peterz@...radead.org" <peterz@...radead.org>,
	"mingo@...hat.com" <mingo@...hat.com>
CC:	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"Sanjeev Yadav (Sanjeev Kumar Yadav)" <Sanjeev.Yadav@...eadtrum.com>
Subject: [Patch]cpuidle: Save current cpu as local once instead of calling
 smp_processor_id() in loop

Hi

Currently, cpu_idle_loop() uses smp_processor_id() to fetch the current CPU.
Every time the idle thread runs, it fetches the current CPU again by calling
smp_processor_id().

The idle thread is per-CPU, so its CPU is constant and cannot change at
runtime. Moving the smp_processor_id() call in front of the loop therefore
saves execution cycles/time inside the loop.
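
To illustrate the idea outside the kernel, here is a minimal userspace sketch of the same transformation. All names in it (fake_processor_id(), fake_cpu_is_offline(), id_lookups, the two loop functions) are hypothetical stand-ins and not kernel APIs; the counter only makes the number of lookups visible.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration only; these are NOT kernel APIs. */
#define FAKE_NR_CPUS 4

static int id_lookups;                  /* counts calls to fake_processor_id() */
static const int fake_cpu = 2;          /* assumed: the CPU this thread is bound to */
static bool fake_offline[FAKE_NR_CPUS]; /* all CPUs start out online */

static int fake_processor_id(void)
{
	id_lookups++;
	return fake_cpu;
}

static bool fake_cpu_is_offline(int cpu)
{
	assert(cpu >= 0 && cpu < FAKE_NR_CPUS);
	return fake_offline[cpu];
}

/* Before: the CPU id is looked up on every pass through the loop. */
static int loop_lookup_each_time(int rounds)
{
	int offline_seen = 0;

	for (int i = 0; i < rounds; i++)
		if (fake_cpu_is_offline(fake_processor_id()))
			offline_seen++;
	return offline_seen;
}

/* After: the CPU id is looked up once, before the loop. */
static int loop_lookup_hoisted(int rounds)
{
	int cpu_id = fake_processor_id();
	int offline_seen = 0;

	for (int i = 0; i < rounds; i++)
		if (fake_cpu_is_offline(cpu_id))
			offline_seen++;
	return offline_seen;
}
```

Both functions return the same result; only the number of lookups differs (one per pass versus one in total). The hoisting is only valid because the thread cannot migrate to another CPU, which is the property of the idle thread that the patch relies on.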

Patch:
----------------------------------------------------------------------

diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index 1214f0a..82698e5 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -185,6 +185,8 @@ exit_idle:
*/
static void cpu_idle_loop(void)
{
+       int cpu_id;
+       cpu_id = smp_processor_id();

        while(1) {
            /*
            * If the arch has a polling bit, we maintain an invariant:
@@ -202,7 +204,7 @@ static void cpu_idle_loop(void)
                        check_pgt_cache();
                        rmb();

-                       if (cpu_is_offline(smp_processor_id()))
+                       if (cpu_is_offline(cpu_id))
                                arch_cpu_idle_dead();

                        local_irq_disable();

--------------------------------------------------------------------

With the patch applied, I inspected the generated assembly code (x86 and
ARM64). The instructions related to smp_processor_id() are no longer
executed in the loop.

For x86:

Before patch (executed in the loop):

148:   0f ae e8                lfence
14b:   65 8b 04 25 00 00 00    mov    %gs:0x0,%eax
152:   00
153:   89 c0                   mov    %eax,%eax
155:   49 0f a3 04 24          bt     %rax,(%r12)

After patch (executed in the loop):

150:   0f ae e8                lfence
153:   4d 0f a3 34 24          bt     %r14,(%r12)


For ARM64:

Before patch (executed in the loop):

168:   d5033d9f        dsb     ld
16c:   b9405661        ldr     w1, [x19,#84]
170:   1100fc20        add     w0, w1, #0x3f
174:   6b1f003f        cmp     w1, wzr
178:   1a81b000        csel    w0, w0, w1, lt
17c:   13067c00        asr     w0, w0, #6
180:   937d7c00        sbfiz   x0, x0, #3, #32
184:   f8606aa0        ldr     x0, [x21,x0]
188:   9ac12401        lsr     x1, x0, x1
18c:   36000e61        tbz     w1, #0, 358

After patch (executed in the loop):

1a8:   d5033d9f        dsb     ld
1ac:   f8776ac0        ldr     x0, [x22,x23]
1b0:   ea18001f        tst     x0, x24
1b4:   54000ea0        b.eq    388
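
My reading of the two ARM64 listings, as a rough C sketch: before the patch the loop derives the bitmap word index (a signed divide by 64, which is the add/cmp/csel/asr sequence) and the bit position from the CPU number on every pass; after the patch the word offset and the bit mask are loop-invariant, so only a load and a test remain. The helper names below (test_cpu_bit(), struct cpu_bit_ref, cpu_bit_ref_make(), test_cpu_bit_ref()) are hypothetical, and the sketch assumes 64-bit bitmap words and a non-negative CPU number.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_WORD 64

/* Per-pass form: word index and bit position are derived from cpu each time. */
static bool test_cpu_bit(const uint64_t *mask, int cpu)
{
	return (mask[cpu / BITS_PER_WORD] >> (cpu % BITS_PER_WORD)) & 1u;
}

/* Hoisted form: word index and single-bit mask are computed once up front. */
struct cpu_bit_ref {
	size_t word;   /* index of the bitmap word that holds the bit */
	uint64_t bit;  /* mask with only this CPU's bit set */
};

static struct cpu_bit_ref cpu_bit_ref_make(int cpu)
{
	struct cpu_bit_ref ref;

	assert(cpu >= 0);
	ref.word = (size_t)cpu / BITS_PER_WORD;
	ref.bit = (uint64_t)1 << (cpu % BITS_PER_WORD);
	return ref;
}

/* What is left inside the loop: one load and one AND against the mask. */
static bool test_cpu_bit_ref(const uint64_t *mask, struct cpu_bit_ref ref)
{
	return (mask[ref.word] & ref.bit) != 0;
}
```

Calling cpu_bit_ref_make() once before the loop and test_cpu_bit_ref() inside it gives the same answer as test_cpu_bit() for any valid CPU number.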

Further observation over 4 seconds on ARM64 shows that cpu_idle_loop is
hit 8672 times. Fetching the CPU id once before the loop saves these
instructions on every hit, and eventually time as well.
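
As a rough estimate only: counting the instructions shown in the ARM64 listings above (10 before, 4 after) and assuming each of the 8672 hits executes that sequence once, the arithmetic can be sketched as follows. The struct and function names are hypothetical, and instruction counts are not cycle counts.

```c
#include <assert.h>

/* Hypothetical helper for a back-of-the-envelope estimate. */
struct loop_savings {
	long per_pass;    /* instructions saved on each pass */
	long total;       /* instructions saved over the whole window */
	long per_second;  /* instructions saved per second */
};

static struct loop_savings estimate_savings(long insns_before, long insns_after,
					    long hits, long seconds)
{
	struct loop_savings s;

	assert(seconds > 0);
	s.per_pass = insns_before - insns_after;
	s.total = s.per_pass * hits;
	s.per_second = s.total / seconds;
	return s;
}
```

For the ARM64 numbers, estimate_savings(10, 4, 8672, 4) gives 6 instructions per pass, 52032 instructions over the 4 seconds, and 13008 instructions per second.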

Signed-off-by: Gaurav Jindal <gaurav.jindal@...eadtrum.com>
Reviewed-by: Sanjeev Yadav <sanjeev.yadav@...eadtrum.com>
