Message-ID: <cover.1660211399.git.sandipan.das@amd.com>
Date:   Thu, 11 Aug 2022 17:59:48 +0530
From:   Sandipan Das <sandipan.das@....com>
To:     <linux-kernel@...r.kernel.org>, <linux-perf-users@...r.kernel.org>,
        <x86@...nel.org>
CC:     <peterz@...radead.org>, <bp@...en8.de>, <acme@...nel.org>,
        <namhyung@...nel.org>, <jolsa@...nel.org>, <tglx@...utronix.de>,
        <mingo@...hat.com>, <mark.rutland@....com>,
        <alexander.shishkin@...ux.intel.com>,
        <dave.hansen@...ux.intel.com>, <like.xu.linux@...il.com>,
        <eranian@...gle.com>, <ananth.narayan@....com>,
        <ravi.bangoria@....com>, <santosh.shukla@....com>,
        <sandipan.das@....com>
Subject: [PATCH 00/13] perf/x86/amd: Add AMD LbrExtV2 support

Last Branch Record (LBR) is a feature available on modern processors that
logs branch information to registers in real time. The records help
determine the flow of control and detect hot code paths.

Add support for using AMD Last Branch Record Extension Version 2 (LbrExtV2)
features on Zen 4 processors. New CPU features are introduced for LbrExtV2
detection. New MSR definitions are added for configuring hardware branch
filtering and for enabling the LBR Freeze on PMI feature.
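
As a rough illustration of the detection step, the sketch below decodes
raw CPUID register values. It assumes that LbrExtV2 is reported in CPUID
leaf 0x80000022, EAX bit 1, and that the LBR stack depth is in EBX bits
9:4; these positions are assumptions that should be checked against the
AMD documentation. The helper names are hypothetical and are not the
ones used in this series.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed layout of CPUID leaf 0x80000022 (verify against AMD docs):
 *   EAX bit 1    - LbrExtV2 supported
 *   EBX bits 9:4 - number of LBR stack entries
 * Both helpers are hypothetical and only decode already-read values.
 */
#define ASSUMED_LBR_V2_EAX_BIT		1
#define ASSUMED_LBR_STACK_SZ_SHIFT	4
#define ASSUMED_LBR_STACK_SZ_MASK	0x3fu

/* Return true if the assumed LbrExtV2 feature bit is set in EAX. */
bool lbr_v2_supported(uint32_t eax)
{
	return (eax >> ASSUMED_LBR_V2_EAX_BIT) & 1u;
}

/* Extract the assumed LBR stack depth field from EBX. */
unsigned int lbr_v2_stack_size(uint32_t ebx)
{
	return (ebx >> ASSUMED_LBR_STACK_SZ_SHIFT) & ASSUMED_LBR_STACK_SZ_MASK;
}
```

The helpers take register values as arguments so that the decoding can
be exercised without running on LbrExtV2-capable hardware.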

The LBR Freeze on PMI feature is essential for ensuring that branch records
remain consistent with the point of PMU overflow in order to provide a
precise correlation between the two.

Hardware branch filtering allows users to record only specific types of
branches and can be mapped to most of the existing filters supported by the
perf tool. Additional software filtering ensures that special branches for
which no direct hardware filter exists (syscall entry and exit) are also
recorded. This expands the scope of filters like "any_call".
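
For reference, a branch filter reaches the kernel through the
branch_sample_type field of struct perf_event_attr. The sketch below
fills in such a request for user-space call branches using existing perf
UAPI constants. The helper name and the sample period are arbitrary
choices for illustration, and the struct would still have to be passed
to perf_event_open(2) to take effect.

```c
#include <linux/perf_event.h>
#include <string.h>

/*
 * Hypothetical helper: describe a user-space cycles event that samples
 * the branch stack and keeps only call branches.
 */
void init_call_branch_attr(struct perf_event_attr *attr)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_HARDWARE;
	attr->config = PERF_COUNT_HW_CPU_CYCLES;
	attr->sample_period = 100000;	/* arbitrary value for illustration */

	/* Ask for branch records alongside the sampled instruction pointer. */
	attr->sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_BRANCH_STACK;

	/* Privilege filter (user only) plus branch type filter (any call). */
	attr->branch_sample_type = PERF_SAMPLE_BRANCH_USER |
				   PERF_SAMPLE_BRANCH_ANY_CALL;

	attr->exclude_kernel = 1;
	attr->exclude_hv = 1;
}
```

This should be roughly what the perf tool requests for a command such
as "perf record -j any_call,u".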

Additionally, the perf UAPI is now extended to provide branch speculation
information, if available. LbrExtV2 provides this information through the
"valid" and "spec" bits in the Branch To registers. The tools-side changes
for this will be submitted as a separate series.
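
One possible way to turn the two bits into a speculation outcome is
sketched below. The meaning given to each (valid, spec) combination is
an assumption made for illustration, and the enum and function names are
hypothetical; the actual UAPI values are defined by the patches in this
series.

```c
#include <stdbool.h>

/* Hypothetical local names; not the UAPI constants added by this series. */
enum spec_outcome {
	SPEC_NA,			/* no speculation info available */
	SPEC_WRONG_PATH,		/* speculative, on the wrong path */
	NON_SPEC_CORRECT_PATH,		/* non-speculative, on the correct path */
	SPEC_CORRECT_PATH,		/* speculative, on the correct path */
};

/*
 * Combine the "valid" and "spec" bits into a 2-bit index and look up the
 * outcome. The mapping is an assumed interpretation of the two bits.
 */
enum spec_outcome classify_branch(bool valid, bool spec)
{
	static const enum spec_outcome map[4] = {
		SPEC_NA,		/* valid = 0, spec = 0 */
		SPEC_WRONG_PATH,	/* valid = 0, spec = 1 */
		NON_SPEC_CORRECT_PATH,	/* valid = 1, spec = 0 */
		SPEC_CORRECT_PATH,	/* valid = 1, spec = 1 */
	};

	return map[((unsigned int)valid << 1) | (unsigned int)spec];
}
```

A table lookup keeps the decoding branch-free and makes the assumed
mapping easy to audit in one place.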

Users of the perf tool can now record branches as shown below. The 'div'
workload used here is from https://lwn.net/Articles/680985/.

E.g.

  $ perf record -b -e cycles:u ./div

Before:

  Error:
  cycles:u: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat'

After:

  [ perf record: Woken up 49 times to write data ]
  [ perf record: Captured and wrote 12.197 MB perf.data (29601 samples) ]

  $ perf report --stdio

  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 473K of event 'cycles:u'
  # Event count (approx.): 473521
  #
  # Overhead  Command  Source Shared Object  Source Symbol           Target Symbol           Basic Block Cycles
  # ........  .......  ....................  ......................  ......................  ..................
  #
      29.69%  div      div                   [.] main                [.] main                -
      23.84%  div      div                   [.] compute_flag        [.] main                -
      23.41%  div      div                   [.] compute_flag        [.] compute_flag        -
      23.04%  div      div                   [.] main                [.] compute_flag        -
  [...]

No additional failures are seen upon running the following:
  * perf built-in test suite
  * perf_event_tests suite

Sandipan Das (13):
  perf/x86/amd/brs: Move feature-specific functions
  perf/x86/amd/core: Refactor branch attributes
  perf/x86/amd/core: Add generic branch record interfaces
  x86/cpufeatures: Add LbrExtV2 feature bit
  perf/x86/amd/lbr: Detect LbrExtV2 support
  perf/x86/amd/lbr: Add LbrExtV2 branch record support
  perf/x86/amd/lbr: Add LbrExtV2 hardware branch filter support
  perf/x86: Move branch classifier
  perf/x86/amd/lbr: Add LbrExtV2 software branch filter support
  perf/x86: Make branch classifier fusion-aware
  perf/x86/amd/lbr: Use fusion-aware branch classifier
  perf/core: Add speculation info to branch entries
  perf/x86/amd/lbr: Add LbrExtV2 branch speculation info support

 arch/x86/events/Makefile           |   2 +-
 arch/x86/events/amd/Makefile       |   2 +-
 arch/x86/events/amd/brs.c          |  69 ++++-
 arch/x86/events/amd/core.c         | 200 +++++++------
 arch/x86/events/amd/lbr.c          | 435 +++++++++++++++++++++++++++++
 arch/x86/events/intel/lbr.c        | 273 ------------------
 arch/x86/events/perf_event.h       |  81 +++++-
 arch/x86/events/utils.c            | 247 ++++++++++++++++
 arch/x86/include/asm/cpufeatures.h |   2 +-
 arch/x86/include/asm/msr-index.h   |   5 +
 arch/x86/include/asm/perf_event.h  |   3 +-
 arch/x86/kernel/cpu/scattered.c    |   1 +
 include/linux/perf_event.h         |   1 +
 include/uapi/linux/perf_event.h    |  15 +-
 14 files changed, 952 insertions(+), 384 deletions(-)
 create mode 100644 arch/x86/events/amd/lbr.c
 create mode 100644 arch/x86/events/utils.c

-- 
2.34.1
