Message-ID: <20150624111004.GA5220@linux.vnet.ibm.com>
Date:	Wed, 24 Jun 2015 16:40:04 +0530
From:	Srikar Dronamraju <srikar@...ux.vnet.ibm.com>
To:	Arnaldo Carvalho de Melo <acme@...hat.com>,
	Jiri Olsa <jolsa@...nel.org>, Vinson Lee <vlee@...tter.com>,
	Ingo Molnar <mingo@...e.hu>
Cc:	LKML <linux-kernel@...r.kernel.org>,
	Namhyung Kim <namhyung@...nel.org>,
	Masami Hiramatsu <masami.hiramatsu.pt@...achi.com>
Subject: Regression in perf bench numa convergence stats


perf bench numa mem with the -c / -m options on v4.1 and the latest tip isn't
showing correct convergence statistics. I ran git bisect between v4.0 and
v4.1, and I have included a patch that fixes the problem for me.

After the bisect, git bisect visualize shows:

>From e1e455f4f4d35850c30235747620d0d078fe9f64 Mon Sep 17 00:00:00 2001
From: Vinson Lee <vlee@...tter.com>
Date: Mon, 23 Mar 2015 12:09:16 -0700
Subject: [PATCH] perf tools: Work around lack of sched_getcpu in glibc < 2.6.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This patch fixes this build error with glibc < 2.6.

  CC       util/cloexec.o
cc1: warnings being treated as errors
util/cloexec.c: In function ‘perf_flag_probe’:
util/cloexec.c:24: error: implicit declaration of function
‘sched_getcpu’
util/cloexec.c:24: error: nested extern declaration of ‘sched_getcpu’
make: *** [util/cloexec.o] Error 1

Signed-off-by: Vinson Lee <vlee@...tter.com>
Acked-by: Jiri Olsa <jolsa@...nel.org>
Acked-by: Namhyung Kim <namhyung@...nel.org>
Cc: Adrian Hunter <adrian.hunter@...el.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@...achi.com>
Cc: Paul Mackerras <paulus@...ba.org>
Cc: Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc: Yann Droneaud <ydroneaud@...eya.com>
Cc: stable@...r.kernel.org # 3.18+
Link: http://lkml.kernel.org/r/1427137761-16119-1-git-send-email-vlee@twopensource.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@...hat.com>


# git log --oneline e1e455f
e1e455f perf tools: Work around lack of sched_getcpu in glibc < 2.6.
77cfe38 perf kmem: Print big numbers using thousands' group
929a6bb tools lib traceevent: Factor out allocating and processing args
e6d7c91 perf probe: Fix to get ummapped symbol address on kernel
228f14f perf tools: Remove (null) value of "Sort order" for perf mem report
2c7da8c perf annotate: Allow annotation for decompressed kernel modules
bc84f46 perf tools: Try to lookup kernel module map before creating one
907fb50 perf tools: Remove is_kmodule_extension function
e746b3e perf tools: Remove compressed argument from is_kernel_module
8dee9ff perf tools: Use kmod_path__parse in is_kernel_module

To verify that commit e1e455f is the cause, I tested both e1e455f and its
parent 77cfe38. I see this problem on more than one system.

# rpm -qa | grep glibc-2
glibc-2.17-55.el7.x86_64
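
The glibc version matters here because the workaround targets glibc < 2.6, while the systems above run 2.17. As a minimal sketch (not part of perf), the same check can be made from C at run time. The helper names `version_at_least` and `running_glibc_has_sched_getcpu` are hypothetical; `gnu_get_libc_version()` is the glibc-specific call, so it is guarded by `__GLIBC__`.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#ifdef __GLIBC__
#include <gnu/libc-version.h>
#endif

/*
 * Hypothetical helper: returns 1 if a "major.minor" version string is
 * at least want_major.want_minor, 0 otherwise (or if it cannot be parsed).
 */
int version_at_least(const char *version, int want_major, int want_minor)
{
	int major = 0, minor = 0;

	assert(version != NULL);
	if (sscanf(version, "%d.%d", &major, &minor) != 2)
		return 0;
	return major > want_major ||
	       (major == want_major && minor >= want_minor);
}

/*
 * Returns 1 if the running glibc is 2.6 or newer (the version named in
 * the e1e455f subject line), 0 if it is older, -1 if this is not glibc.
 */
int running_glibc_has_sched_getcpu(void)
{
#ifdef __GLIBC__
	return version_at_least(gnu_get_libc_version(), 2, 6);
#else
	return -1;
#endif
}
```

On the glibc-2.17 system shown above, `running_glibc_has_sched_getcpu()` would be expected to return 1, meaning no fallback definition should be needed.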


# git reset --hard e1e455f

# Running 'numa/mem' benchmark:

# Running main, "perf bench numa numa-mem --no-data_rand_walk -p 1 -t 64 -G 0 -P 0 -T 32 -l 800 -zZ0c"
#
#

 ###
 # 64 tasks will execute (on 4 nodes, 64 CPUs):
 #        800x     0MB global  shared mem operations
 #        800x     0MB process shared mem operations
 #        800x    32MB thread  local  mem operations
 ###

 ###
 #
 # Startup synchronization: ... threads initialized in 0.512908 seconds.
 #
 #    0.1%  [0.0 mins]  0/0   0/0   0/0   0/0  [ 0/0 ] l: -1-0   (  1) {0-0}
 #    0.6%  [0.0 mins]  0/0   0/0   0/0   0/0  [ 0/0 ] l: -1-0   (  1) {0-0}
 #    5.1%  [0.0 mins]  0/0   0/0   0/0   0/0  [ 0/0 ] l: -1-0   (  1) {0-0}
 #    9.6%  [0.1 mins]  0/0   0/0   0/0   0/0  [ 0/0 ] l: -1-0   (  1) {0-0}
 #   14.0%  [0.1 mins]  0/0   0/0   0/0   0/0  [ 0/0 ] l: -1-0   (  1) {0-0}

 ###

          4.903 secs slowest (max) thread-runtime
          4.873 secs fastest (min) thread-runtime
          4.941 secs average thread-runtime
          0.301 % difference between max/avg runtime
          4.228 GB data processed, per thread
        270.583 GB data processed, total
          1.160 nsecs/byte/thread runtime
          0.862 GB/sec/thread speed
         55.193 GB/sec total speed

and its parent 77cfe38
# git reset --hard 77cfe38

# Running 'numa/mem' benchmark:


# Running main, "perf bench numa numa-mem --no-data_rand_walk -p 1 -t 64 -G 0 -P 0 -T 32 -l 800 -zZ0c"
#
#

 ###
 # 64 tasks will execute (on 4 nodes, 64 CPUs):
 #        800x     0MB global  shared mem operations
 #        800x     0MB process shared mem operations
 #        800x    32MB thread  local  mem operations
 ###

 ###
 #
 # Startup synchronization: ... threads initialized in 0.421336 seconds.
 #
 #    0.4%  [0.0 mins] 16/1  16/1  16/1  16/1  [ 0/4 ] l:  1-20  ( 19) [95.0%] {4-4}
 #    2.6%  [0.0 mins] 17/1  15/1  16/1  16/1  [ 2/4 ] l:  3-37  ( 34) [91.9%] {4-4}
 #    7.1%  [0.0 mins] 17/1  15/1  16/1  16/1  [ 2/4 ] l: 32-67  ( 35) [52.2%] {4-4}
 #   11.8%  [0.1 mins] 17/1  15/1  16/1  16/1  [ 2/4 ] l: 65-103 ( 38) [36.9%] {4-4}
 #   15.9%  [0.1 mins] 17/1  15/1  16/1  16/1  [ 2/4 ] l: 98-136 ( 38) [27.9%] {4-4}

 ###

          4.970 secs slowest (max) thread-runtime
          4.940 secs fastest (min) thread-runtime
          4.980 secs average thread-runtime
          0.300 % difference between max/avg runtime
          4.237 GB data processed, per thread
        271.187 GB data processed, total
          1.173 nsecs/byte/thread runtime
          0.853 GB/sec/thread speed
         54.562 GB/sec total speed
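
The all-zero columns in the e1e455f run are what one would expect if every per-thread CPU lookup failed. The following is only an illustrative sketch, not the perf bench code: it assumes the first number in each column is a per-node task count, and it assumes a hypothetical layout of 16 contiguously numbered CPUs per node (64 CPUs on 4 nodes). `node_of_cpu` and `count_tasks_per_node` are made-up names.

```c
#include <assert.h>
#include <stddef.h>

#define NR_NODES	4	/* from the benchmark header: 4 nodes */
#define CPUS_PER_NODE	16	/* assumption: 64 CPUs split evenly, contiguously */

/* Hypothetical CPU-to-node mapping for the assumed layout. */
int node_of_cpu(int cpu)
{
	return cpu / CPUS_PER_NODE;
}

/*
 * Count how many tasks sit on each node, given the CPU each task last
 * reported. Tasks whose CPU is unknown (negative) or out of range are
 * skipped. Returns the number of tasks that could be placed.
 */
size_t count_tasks_per_node(const int *task_cpu, size_t nr_tasks,
			    int counts[NR_NODES])
{
	size_t placed = 0;
	size_t i;
	int n;

	assert(task_cpu != NULL && counts != NULL);

	for (n = 0; n < NR_NODES; n++)
		counts[n] = 0;

	for (i = 0; i < nr_tasks; i++) {
		int cpu = task_cpu[i];

		if (cpu < 0 || cpu >= NR_NODES * CPUS_PER_NODE)
			continue;	/* e.g. sched_getcpu() returned -1 */
		counts[node_of_cpu(cpu)]++;
		placed++;
	}
	return placed;
}
```

If every entry of `task_cpu` is -1, no task is placed and all node counts stay at zero, which matches the shape of the broken output.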


Reverting e1e455f on top of tip/master also avoids the problem.
The patch below fixes the problem.
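
A likely explanation, offered here as an assumption rather than something established by the bisect: a definition in the program itself takes precedence over the one in the shared libc, even when the program's definition is weak. The standalone sketch below mimics the unconditional fallback from e1e455f; `cpu_lookup_is_stubbed` is a hypothetical helper name.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>

/* Unconditional weak fallback, same shape as the one added by e1e455f. */
int __attribute__((weak)) sched_getcpu(void)
{
	errno = ENOSYS;
	return -1;
}

/*
 * Returns 1 if the call was answered by the fallback above
 * (-1 with errno == ENOSYS), 0 if a real CPU number came back.
 */
int cpu_lookup_is_stubbed(void)
{
	int cpu;

	errno = 0;
	cpu = sched_getcpu();
	return cpu == -1 && errno == ENOSYS;
}
```

Built in the usual way against glibc, `cpu_lookup_is_stubbed()` is expected to return 1 even though libc provides a working sched_getcpu, which is why the fallback needs to be compiled only when libc really lacks the function.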

-- 
Thanks and Regards
Srikar Dronamraju

---->8--------------------------------------------

>From 88199ad8a3d6495080eaa016b87a612bc742b1c4 Mon Sep 17 00:00:00 2001
From: Srikar Dronamraju <srikar@...ux.vnet.ibm.com>
Date: Wed, 24 Jun 2015 16:23:22 +0530
Subject: [PATCH] perf tools: Fix perf_bench to show proper convergence

With commit: e1e455f (perf tools: Work around lack of sched_getcpu in
glibc < 2.6), perf_bench numa mem with -c or -m option is not able to
correctly calculate convergence. With the above commit, sched_getcpu
always seems to return -1. The intention of commit e1e455f was to add a
sched_getcpu in glibc < 2.6. Hence keep the sched_getcpu definition
under an ifdef.

This regression occurred between v4.0 and v4.1.

Signed-off-by: Srikar Dronamraju <srikar@...ux.vnet.ibm.com>
---
 tools/perf/util/cloexec.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tools/perf/util/cloexec.c b/tools/perf/util/cloexec.c
index 85b5238..2babdda 100644
--- a/tools/perf/util/cloexec.c
+++ b/tools/perf/util/cloexec.c
@@ -7,11 +7,15 @@
 
 static unsigned long flag = PERF_FLAG_FD_CLOEXEC;
 
+#ifdef __GLIBC_PREREQ
+#if !__GLIBC_PREREQ(2, 6)
 int __weak sched_getcpu(void)
 {
 	errno = ENOSYS;
 	return -1;
 }
+#endif
+#endif
 
 static int perf_flag_probe(void)
 {
-- 
1.8.3.1
