Message-ID: <ZurLc9qEjBH9MkvK@gmail.com>
Date: Wed, 18 Sep 2024 14:45:39 +0200
From: Ingo Molnar <mingo@...nel.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: linux-kernel@...r.kernel.org, Peter Zijlstra <peterz@...radead.org>,
	Arnaldo Carvalho de Melo <acme@...hat.com>,
	Jiri Olsa <jolsa@...hat.com>,
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
	Mark Rutland <mark.rutland@....com>,
	Namhyung Kim <namhyung@...nel.org>,
	linux-perf-users@...r.kernel.org
Subject: [GIT PULL] Performance events changes for v6.12

Linus,

Please pull the latest perf/core Git tree from:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git perf-core-2024-09-18

   # HEAD: 5e645f31139183ac9a282238da18ca6bbc1c6f4a Merge branch 'perf/urgent' into perf/core, to pick up fixes

   [ Merge note: this pull request depends on you having pulled perf-urgent-2024-09-18 already. ]

Performance events changes for v6.12:

 - Implement per-PMU context rescheduling to significantly improve single-PMU
   performance, and related cleanups/fixes. (by Peter Zijlstra and Namhyung Kim)

 - Fix ancient bug resulting in a lot of events being dropped erroneously
   at higher sampling frequencies. (by Luo Gengkun)

 - uprobes enhancements:

     - Implement RCU-protected hot path optimizations for better performance:

         "For baseline vs SRCU, peak throughput increased from 3.7 M/s (million uprobe
          triggerings per second) up to about 8 M/s. For uretprobes it's a bit more
          modest with bump from 2.4 M/s to 5 M/s.

          For SRCU vs RCU Tasks Trace, peak throughput for uprobes increases further from
          8 M/s to 10.3 M/s (+28%!), and for uretprobes from 5.3 M/s to 5.8 M/s (+11%),
          as we have more work to do on uretprobes side.

          Even single-thread (no contention) performance is slightly better: 3.276 M/s to
          3.396 M/s (+3.5%) for uprobes, and 2.055 M/s to 2.174 M/s (+5.8%)
          for uretprobes."

          (by Andrii Nakryiko et al)

     - Document mmap_lock, don't abuse get_user_pages_remote(). (by Oleg Nesterov)

     - Cleanups & fixes to prepare for future work:

        - Remove uprobe_register_refctr()
        - Simplify error handling for alloc_uprobe()
        - Make uprobe_register() return struct uprobe *
        - Fold __uprobe_unregister() into uprobe_unregister()
        - Shift put_uprobe() from delete_uprobe() to uprobe_unregister()
        - BPF: Fix use-after-free in bpf_uprobe_multi_link_attach()

          (by Oleg Nesterov)

 - New feature & ABI extension: allow events to use PERF_SAMPLE_READ with
   inheritance, enabling sample-based profiling of a group of counters over
   a hierarchy of processes or threads.  (by Ben Gainey)

 - Intel uncore & power events updates:

      - Add Arrow Lake and Lunar Lake support
      - Add PERF_EV_CAP_READ_SCOPE
      - Clean up and enhance cpumask and hotplug support

        (by Kan Liang)

      - Add LNL uncore iMC freerunning support
      - Use D0:F0 as a default device

        (by Zhenyu Wang)

 - Intel PT: fix AUX snapshot handling race. (by Adrian Hunter)

 - Misc fixes and cleanups. (by James Clark, Jiri Olsa, Oleg Nesterov and Peter Zijlstra)

Thanks,

	Ingo

------------------>

Adrian Hunter (1):
      perf/x86/intel/pt: Fix sampling synchronization

Andrii Nakryiko (7):
      perf,x86: avoid missing caller address in stack traces captured in uprobe
      uprobes: simplify error handling for alloc_uprobe()
      uprobes: revamp uprobe refcounting and lifetime management
      uprobes: protected uprobe lifetime with SRCU
      uprobes: get rid of enum uprobe_filter_ctx in uprobe filter callbacks
      uprobes: travers uprobe's consumer list locklessly under SRCU protection
      uprobes: perform lockless SRCU-protected uprobes_tree lookup

Ben Gainey (2):
      perf: Rename perf_event_context.nr_pending to nr_no_switch_fast.
      perf: Support PERF_SAMPLE_READ with inherit

Ingo Molnar (2):
      Merge branch 'perf/urgent' into perf/core, to pick up fixes
      Merge branch 'perf/urgent' into perf/core, to pick up fixes

James Clark (1):
      perf/x86/intel/bts: Fix comment about default perf_event_paranoid setting

Jiri Olsa (1):
      selftests/bpf: fix uprobe.path leak in bpf_testmod

Kan Liang (8):
      perf/x86/intel/uncore: Add Arrow Lake support
      perf/x86/intel/uncore: Factor out common MMIO init and ops functions
      perf/x86/intel/uncore: Add Lunar Lake support
      perf: Generic hotplug support for a PMU with a scope
      perf: Add PERF_EV_CAP_READ_SCOPE
      perf/x86/intel/cstate: Clean up cpumask and hotplug
      iommu/vt-d: Clean up cpumask and hotplug for perfmon
      dmaengine: idxd: Clean up cpumask and hotplug for perfmon

Luo Gengkun (1):
      perf/core: Fix small negative period being ignored

Namhyung Kim (1):
      perf: Really fix event_function_call() locking

Oleg Nesterov (8):
      uprobes: document the usage of mm->mmap_lock
      uprobes: is_trap_at_addr: don't use get_user_pages_remote()
      uprobes: kill uprobe_register_refctr()
      uprobes: make uprobe_register() return struct uprobe *
      uprobes: change uprobe_register() to use uprobe_unregister() instead of __uprobe_unregister()
      uprobes: fold __uprobe_unregister() into uprobe_unregister()
      uprobes: shift put_uprobe() from delete_uprobe() to uprobe_unregister()
      bpf: Fix use-after-free in bpf_uprobe_multi_link_attach()

Peter Zijlstra (8):
      perf/x86: Add hw_perf_event::aux_config
      perf: Optimize context reschedule for single PMU cases
      perf: Extract a few helpers
      perf: Fix event_function_call() locking
      perf: Add context time freeze
      perf: Optimize __pmu_ctx_sched_out()
      perf/uprobe: split uprobe_unregister()
      rbtree: provide rb_find_rcu() / rb_find_add_rcu()

Zhenyu Wang (2):
      perf/x86/intel/uncore: Add LNL uncore iMC freerunning support
      perf/x86/intel/uncore: Use D0:F0 as a default device

 arch/x86/events/core.c                                |  63 +++++++++++++++++++
 arch/x86/events/intel/bts.c                           |   3 -
 arch/x86/events/intel/cstate.c                        | 142 ++-----------------------------------------
 arch/x86/events/intel/pt.c                            |  29 +++++----
 arch/x86/events/intel/uncore.c                        |   9 +++
 arch/x86/events/intel/uncore.h                        |   2 +
 arch/x86/events/intel/uncore_snb.c                    | 185 ++++++++++++++++++++++++++++++++++++++++++++++++++------
 drivers/dma/idxd/idxd.h                               |   7 ---
 drivers/dma/idxd/init.c                               |   3 -
 drivers/dma/idxd/perfmon.c                            |  98 +-----------------------------
 drivers/iommu/intel/iommu.h                           |   2 -
 drivers/iommu/intel/perfmon.c                         | 111 +---------------------------------
 include/linux/cpuhotplug.h                            |   2 -
 include/linux/perf_event.h                            |  32 +++++++++-
 include/linux/rbtree.h                                |  67 +++++++++++++++++++++
 include/linux/uprobes.h                               |  48 ++++++++-------
 kernel/events/core.c                                  | 586 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-------------------------------------------------
 kernel/events/uprobes.c                               | 505 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--------------------------------------------------------------------
 kernel/trace/bpf_trace.c                              |  38 ++++++------
 kernel/trace/trace_uprobe.c                           |  44 +++++++-------
 tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c |  27 +++++----
 21 files changed, 1146 insertions(+), 857 deletions(-)
