Date:	Thu, 14 Feb 2013 13:57:26 +0100
From:	Stephane Eranian <eranian@...gle.com>
To:	linux-kernel@...r.kernel.org
Cc:	peterz@...radead.org, mingo@...e.hu, ak@...ux.intel.com,
	acme@...hat.com, jolsa@...hat.com, namhyung.kim@....com
Subject: [PATCH v2 0/3] perf stat: add per-core count aggregation

This patch series contains improvements to the aggregation support
in perf stat.

First, the aggregation code is refactored and an aggr_mode enum
is defined (a sketch of the idea follows below). There is also an
important bug fix for the existing per-socket aggregation.
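
To illustrate the direction of the refactoring, here is a minimal,
self-contained sketch of enum-driven aggregation. The enum values mirror
the modes discussed in this series, but the topology table and helper
names (cpu_topo, aggr_id) are illustrative only, not the actual
builtin-stat.c code:

/* Illustrative sketch of enum-driven count aggregation (not the actual
 * builtin-stat.c code). Each CPU's count is folded into a bucket chosen
 * by the current aggregation mode. */
#include <stdio.h>

enum aggr_mode {
	AGGR_NONE,	/* one line per CPU */
	AGGR_SOCKET,	/* fold counts per socket */
	AGGR_CORE,	/* fold counts per physical core */
	AGGR_GLOBAL,	/* single system-wide total */
};

/* Hypothetical topology table: cpu -> (socket, core). */
struct cpu_topo { int socket; int core; };
static const struct cpu_topo topo[] = {
	{0, 0}, {0, 1}, {0, 0}, {0, 1},	/* 4 logical CPUs, HT on */
};

static int aggr_id(enum aggr_mode mode, int cpu)
{
	switch (mode) {
	case AGGR_SOCKET: return topo[cpu].socket;
	case AGGR_CORE:   return topo[cpu].socket * 1000 + topo[cpu].core;
	case AGGR_GLOBAL: return 0;
	default:          return cpu;	/* AGGR_NONE */
	}
}

int main(void)
{
	unsigned long long counts[] = { 100, 200, 300, 400 };
	unsigned long long bucket[8] = { 0 };
	enum aggr_mode mode = AGGR_CORE;

	for (int cpu = 0; cpu < 4; cpu++)
		bucket[aggr_id(mode, cpu) % 8] += counts[cpu];

	/* With AGGR_CORE, S0-C0 collects 100+300 and S0-C1 collects 200+400. */
	printf("S0-C0: %llu\nS0-C1: %llu\n", bucket[0], bucket[1]);
	return 0;
}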

Second, the option --aggr-socket is renamed to --per-socket.

Third, the patch adds a new --per-core option to perf stat.
It aggregates counts per physical core and is useful on
systems with hyper-threading. The cores are presented per
socket: S0-C1 means socket 0, core 1. Note that the core number
is the physical core id, so numbers may not always be
contiguous. All of this is based on the topology information
available in sysfs.
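
For reference, the per-CPU socket and core ids come from sysfs files such
as /sys/devices/system/cpu/cpuN/topology/physical_package_id and core_id.
Below is a small standalone sketch (not the cpumap.c helpers from this
series) that derives the S<socket>-C<core> label that way:

/* Standalone sketch: derive the S<socket>-C<core> label for each CPU from
 * sysfs topology, the same information the perf cpumap helpers rely on.
 * Not the actual cpumap.c code. */
#include <stdio.h>

static int read_topo_int(int cpu, const char *file)
{
	char path[128];
	FILE *f;
	int val = -1;

	snprintf(path, sizeof(path),
		 "/sys/devices/system/cpu/cpu%d/topology/%s", cpu, file);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (fscanf(f, "%d", &val) != 1)
		val = -1;
	fclose(f);
	return val;
}

int main(void)
{
	/* Walk a handful of CPUs; a real tool would enumerate the online map. */
	for (int cpu = 0; cpu < 8; cpu++) {
		int socket = read_topo_int(cpu, "physical_package_id");
		int core   = read_topo_int(cpu, "core_id");

		if (socket < 0 || core < 0)
			break;	/* CPU does not exist or is offline */
		/* core_id is the physical id, so it need not be contiguous. */
		printf("cpu%d -> S%d-C%d\n", cpu, socket, core);
	}
	return 0;
}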

Per-core aggregation can be combined with interval printing:

 # perf stat -a --per-core -I 1000 -e cycles sleep 100
 #           time core         cpus             counts events
      1.000101160 S0-C0           2      6,051,254,899 cycles                   
      1.000101160 S0-C1           2      6,379,230,776 cycles                   
      1.000101160 S0-C2           2      6,480,268,471 cycles                   
      1.000101160 S0-C3           2      6,110,514,321 cycles                   
      2.000663750 S0-C0           2      6,572,533,016 cycles                   
      2.000663750 S0-C1           2      6,378,623,674 cycles                   
      2.000663750 S0-C2           2      6,264,127,589 cycles                   
      2.000663750 S0-C3           2      6,305,346,613 cycles                   

For instance, on this SNB machine we can see that the load
is evenly balanced across all 4 physical cores (HT is on).

In v2, we print events across all cores or sockets, and we renamed
--aggr-socket to --per-socket and --aggr-core to --per-core.

Signed-off-by: Stephane Eranian <eranian@...gle.com>

Stephane Eranian (3):
  perf stat: refactor aggregation code
  perf stat: rename --aggr-socket to --per-socket
  perf stat: add per-core aggregation

 tools/perf/Documentation/perf-stat.txt |   10 +-
 tools/perf/builtin-stat.c              |  237 ++++++++++++++++++++------------
 tools/perf/util/cpumap.c               |   86 ++++++++++--
 tools/perf/util/cpumap.h               |   12 ++
 4 files changed, 241 insertions(+), 104 deletions(-)

-- 
1.7.9.5

