Message-Id: <1449606239-28602-1-git-send-email-jolsa@kernel.org>
Date:	Tue,  8 Dec 2015 21:23:59 +0100
From:	Jiri Olsa <jolsa@...nel.org>
To:	Ingo Molnar <mingo@...nel.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc:	lkml <linux-kernel@...r.kernel.org>,
	Arnaldo Carvalho de Melo <acme@...nel.org>,
	Don Zickus <dzickus@...hat.com>, Joe Mario <jmario@...hat.com>
Subject: [PATCH] sched: Move sched_entity::avg into separate cache line

From: root <root@...dl380gen9-01.khw.lab.eng.bos.redhat.com>

hi,
I tried Joe's and Don's c2c tool and it identified
a place with cache line contention. More of them
popped up, but this one was just too obvious ;-)

thanks
jirka


---
The sched_entity::avg field collides with read-mostly sched_entity data.

The perf c2c tool showed many read HITM accesses (reads that
hit a cache line held Modified in another CPU's cache) across
many CPUs for sched_entity's cfs_rq and my_q pointers, while
at the same time showing lots of stores to avg.
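
For orientation, an abridged sketch of how these fields sat
together in struct sched_entity before the change (surrounding
fields elided; see include/linux/sched.h for the full layout):

  struct sched_entity {
  	/* ... load, run_node, statistics, ... */
  #ifdef CONFIG_FAIR_GROUP_SCHED
  	int			depth;
  	struct sched_entity	*parent;
  	struct cfs_rq		*cfs_rq;	/* read-mostly */
  	struct cfs_rq		*my_q;		/* read-mostly */
  #endif

  #ifdef CONFIG_SMP
  	/* written frequently, yet it shares a cache line
  	 * with the read-mostly pointers above */
  	struct sched_avg	avg;
  #endif
  };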

After placing sched_entity::avg into a separate cache line,
perf bench sched pipe showed a speedup of around 20 seconds.
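
To see the effect in isolation, here is a hypothetical userspace
demo (not part of the patch): one thread keeps loading two
read-mostly pointers while the main thread keeps storing to a
counter in the same struct. The aligned attribute stands in for
____cacheline_aligned_in_smp; drop it and the run time goes up,
because every store invalidates the reader's cache line.

  /* false-sharing demo; build: gcc -O2 -pthread fs-demo.c */
  #include <pthread.h>
  #include <stdio.h>
  #include <time.h>

  #define CACHELINE 64	/* assumed cache line size */

  struct entity {
  	void * volatile cfs_rq;		/* read-mostly */
  	void * volatile my_q;		/* read-mostly */
  	/* the alignment pushes the hot counter onto its own line */
  	volatile unsigned long avg __attribute__((aligned(CACHELINE)));
  };

  static struct entity e;
  static volatile int stop;

  static void *reader(void *arg)
  {
  	unsigned long sum = 0;

  	while (!stop)	/* reads that HITM on the writer's line */
  		sum += (unsigned long)e.cfs_rq + (unsigned long)e.my_q;
  	return (void *)sum;
  }

  int main(void)
  {
  	struct timespec t0, t1;
  	unsigned long i;
  	pthread_t r;

  	pthread_create(&r, NULL, reader, NULL);
  	clock_gettime(CLOCK_MONOTONIC, &t0);
  	for (i = 0; i < 200000000UL; i++)	/* tons of stores */
  		e.avg = i;
  	clock_gettime(CLOCK_MONOTONIC, &t1);
  	stop = 1;
  	pthread_join(r, NULL);
  	printf("%.2f seconds\n", (t1.tv_sec - t0.tv_sec) +
  	       (t1.tv_nsec - t0.tv_nsec) / 1e9);
  	return 0;
  }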

NOTE: I cut all perf events other than cycles and
instructions out of the following output.

Before:
  $ perf stat -r 5 perf bench sched pipe -l 10000000
  # Running 'sched/pipe' benchmark:
  # Executed 10000000 pipe operations between two processes

       Total time: 270.348 [sec]

        27.034805 usecs/op
            36989 ops/sec
   ...

     245,537,074,035      cycles                    #    1.433 GHz
     187,264,548,519      instructions              #    0.77  insns per cycle

       272.653840535 seconds time elapsed           ( +-  1.31% )

After:
  $ perf stat -r 5 perf bench sched pipe -l 10000000
  # Running 'sched/pipe' benchmark:
  # Executed 10000000 pipe operations between two processes

       Total time: 251.076 [sec]

        25.107678 usecs/op
            39828 ops/sec
  ...

     244,573,513,928      cycles                    #    1.572 GHz
     187,409,641,157      instructions              #    0.76  insns per cycle

       251.679315188 seconds time elapsed           ( +-  0.31% )
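
For reference, ____cacheline_aligned_in_smp reduces to an
alignment attribute on SMP builds and compiles away on UP builds
(include/linux/cache.h, abridged):

  #ifndef ____cacheline_aligned
  #define ____cacheline_aligned __attribute__((__aligned__(SMP_CACHE_BYTES)))
  #endif

  #ifdef CONFIG_SMP
  #define ____cacheline_aligned_in_smp ____cacheline_aligned
  #else
  #define ____cacheline_aligned_in_smp
  #endif

so the annotated avg starts on an SMP_CACHE_BYTES boundary, at the
cost of some padding after the preceding fields.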

Signed-off-by: Jiri Olsa <jolsa@...nel.org>
---
 include/linux/sched.h | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 3b0de68bce41..80cc1432e6e3 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1268,8 +1268,13 @@ struct sched_entity {
 #endif
 
 #ifdef CONFIG_SMP
-	/* Per entity load average tracking */
-	struct sched_avg	avg;
+	/*
+	 * Per entity load average tracking.
+	 *
+	 * Put into separate cache line so it does not
+	 * collide with read-mostly values above.
+	 */
+	struct sched_avg	avg ____cacheline_aligned_in_smp;
 #endif
 };
 
-- 
2.4.3
