Date:	Fri, 29 Apr 2016 14:10:25 -0700
From:	David Carrillo-Cisneros <davidcc@...gle.com>
To:	Vikas Shivappa <vikas.shivappa@...ux.intel.com>
Cc:	Peter Zijlstra <peterz@...radead.org>,
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
	Arnaldo Carvalho de Melo <acme@...nel.org>,
	Ingo Molnar <mingo@...hat.com>,
	Tony Luck <tony.luck@...el.com>,
	Stephane Eranian <eranian@...gle.com>,
	Paul Turner <pjt@...gle.com>, x86@...nel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 00/32] 2nd Iteration of Cache QoS Monitoring support.

peterz/queue perf/core

On Fri, Apr 29, 2016 at 2:06 PM Vikas Shivappa
<vikas.shivappa@...ux.intel.com> wrote:
>
>
>
> On Thu, 28 Apr 2016, David Carrillo-Cisneros wrote:
>
> > This series introduces the next iteration of kernel support for the
> > Cache QoS Monitoring (CQM) technology available in Intel Xeon processors.
>
> Which kernel version does this compile on?
>
> Thanks,
> Vikas
>
> >
> > One of the main limitations of the previous version is the inability
> > to simultaneously monitor:
> >  1) a cpu event and any other event in that cpu.
> >  2) cgroup events for cgroups in the same descendancy line.
> >  3) cgroup events and any thread event of a cgroup in the same
> >     descendancy line.
> >
> > Another limitation is that monitoring for a cgroup was enabled/disabled by
> > the existence of a perf event for that cgroup. Since the event
> > llc_occupancy measures changes in occupancy rather than total occupancy,
> > in order to read meaningful llc_occupancy values, an event should be
> > enabled for a long enough period of time. The overhead in context switches
> > caused by the perf events is undesired in some sensitive scenarios.
> >
> > This series of patches addresses the shortcomings mentioned above and
> > adds some other improvements. The main changes are:
> >       - No more potential conflicts between different events. The new
> >       version builds a hierarchy of RMIDs that captures the dependency
> >       between monitored cgroups. llc_occupancy for a cgroup is the sum of
> >       llc_occupancies for that cgroup's RMID and all other RMIDs in the
> >       cgroup's subtree (both monitored cgroups and threads).
> >
> >       - A cgroup integration that allows monitoring a cgroup without
> >       creating a perf event, decreasing the context switch overhead.
> >       Monitoring is controlled by a boolean cgroup subsystem attribute
> >       in each perf cgroup. That is:
> >
> >               echo 1 > cgroup_path/perf_event.cqm_cont_monitoring
> >
> >       starts CQM monitoring whether or not there is a perf_event
> >       attached to the cgroup. Setting the attribute to 0 makes
> >       monitoring dependent on the existence of a perf_event.
> >       A perf_event is always required in order to read llc_occupancy.
> >       This cgroup integration uses Intel's PQR code and is intended to
> >       be used by upcoming versions of Intel's CAT.
> >
> >       - A more stable rotation algorithm: The new algorithm uses SLOs that
> >       guarantee:
> >               - A minimum of enabled time for monitored cgroups and
> >               threads.
> >               - A maximum time disabled before error is introduced by
> >               reusing dirty RMIDs.
> >               - A minimum rate at which RMID recycling must progress.
> >
> >       - Reduced impact of stealing/rotation of RMIDs: The new algorithm
> >       accounts the residual occupancy held by limbo RMIDs towards the
> >       former owner of the limbo RMID, decreasing the error introduced
> >       by RMID rotation.
> >       It also allows a limbo RMID to be reused by its former owner when
> >       appropriate, decreasing the potential error of reusing dirty RMIDs
> >       and allowing progress to be made even if most limbo RMIDs do not
> >       drop occupancy fast enough.
> >
> >       - Elimination of pmu::count: perf generic's perf_event_count()
> >       performs a quick add of atomic types. The introduction of
> >       pmu::count in the previous CQM series to read occupancy for thread
> >       events changed the behavior of perf_event_count() by performing a
> >       potentially slow IPI and write/read to MSR. It also made pmu::read
> >       behave differently depending on whether the event was a
> >       cpu/cgroup event or a thread. This patch series removes the custom
> >       pmu::count from CQM and provides a consistent behavior for all
> >       calls of perf_event_read.
> >
> >       - Added error return for pmu::read: Reads to CQM events may fail
> >       due to stealing of RMIDs, even after successfully adding an event
> >       to a PMU. This patch series expands pmu::read with an int return
> >       value and propagates the error to callers that can fail
> >       (i.e. perf_read).
> >       The ability of pmu::read to fail is consistent with the recent
> >       changes that allow perf_event_read to fail for transactional
> >       reading of event groups.
> >
> >       - Introduces the field pmu_event_flags that contains flags set by
> >       the PMU to signal variations on the default behavior to perf's
> >       generic code. In this series, three flags are introduced:
> >               - PERF_CGROUP_NO_RECURSION: Signals generic code to add
> >               events of the cgroup ancestors of a cgroup.
> >               - PERF_INACTIVE_CPU_READ_PKG: Signals generic code that
> >               this CPU event can be read in any CPU in its event::cpu's
> >               package, even if the event is not active.
> >               - PERF_INACTIVE_EV_READ_ANY_CPU: Signals generic code that
> >               this event can be read in any CPU in any package in the
> >               system even if the event is not active.
> >       Using the above flags takes advantage of the CQM's hw ability to
> >       read llc_occupancy even when the associated perf event is not
> >       running in a CPU.
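
To make the two PERF_INACTIVE_* flags described above concrete, here is a
minimal sketch of the decision they imply. The bit values, the SKETCH_
names and can_read_inactive() are hypothetical and chosen only for this
illustration; the real definitions are in the patches themselves.

```c
#include <stdbool.h>

/* Hypothetical bit values, not the ones used by the series. */
#define SKETCH_INACTIVE_CPU_READ_PKG    (1u << 0)
#define SKETCH_INACTIVE_EV_READ_ANY_CPU (1u << 1)

/*
 * Decide whether an inactive event may be read from the current CPU.
 * event_pkg:  package of the CPU the event is bound to (event::cpu).
 * reader_pkg: package of the CPU performing the read.
 */
static bool can_read_inactive(unsigned int flags, int event_pkg,
			      int reader_pkg)
{
	/* Readable from any CPU in any package. */
	if (flags & SKETCH_INACTIVE_EV_READ_ANY_CPU)
		return true;

	/* Readable from any CPU in the event's own package. */
	if (flags & SKETCH_INACTIVE_CPU_READ_PKG)
		return event_pkg == reader_pkg;

	/* Without either flag this sketch allows no inactive read. */
	return false;
}
```

In this sketch, a caller would check can_read_inactive() before reading an
inactive event from the local CPU.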
> >
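
The first change in the list above defines llc_occupancy for a cgroup as
the sum over its own RMID and every RMID in its subtree. A rough sketch of
that sum follows. struct rmid_node, MAX_CHILDREN and subtree_occupancy()
are made up for this example and are not the data structures used in
cqm.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CHILDREN 4	/* arbitrary bound for this sketch */

/* Hypothetical node: one RMID owned by a monitored cgroup or thread. */
struct rmid_node {
	uint64_t own_occupancy;	/* occupancy reported for this RMID alone */
	size_t nr_children;
	struct rmid_node *children[MAX_CHILDREN]; /* monitored descendants */
};

/* Sum the occupancy of a node's own RMID and of every RMID below it. */
static uint64_t subtree_occupancy(const struct rmid_node *node)
{
	uint64_t total;
	size_t i;

	if (!node)
		return 0;

	assert(node->nr_children <= MAX_CHILDREN);

	total = node->own_occupancy;
	for (i = 0; i < node->nr_children; i++)
		total += subtree_occupancy(node->children[i]);

	return total;
}
```

For example, a cgroup whose own RMID reports 50, with a monitored child
cgroup reporting 200 that in turn has a monitored thread reporting 100,
would report 350 under this scheme.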
> > This patch series also updates the perf tool to fix error handling and to
> > better handle the idiosyncrasies of snapshot and per-pkg events.
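
On the error return for pmu::read mentioned earlier: a userspace sketch of
the idea is below. A read that fails because the RMID was stolen returns a
negative errno, and the caller propagates it instead of reporting a stale
count. struct sketch_event and both functions are hypothetical stand-ins,
and -EINVAL is picked only for illustration; the series may use a
different error code.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for an event; not struct perf_event. */
struct sketch_event {
	int has_rmid;		/* 0 once the RMID has been stolen */
	uint64_t hw_value;	/* stand-in for the hardware reading */
	uint64_t count;		/* last value successfully read */
};

/* Analogue of a pmu::read that returns 0 or a negative errno. */
static int sketch_pmu_read(struct sketch_event *ev)
{
	if (!ev->has_rmid)
		return -EINVAL;	/* illustrative error code only */

	ev->count = ev->hw_value;
	return 0;
}

/* A caller that can fail passes the error up and leaves *out untouched. */
static int sketch_perf_read(struct sketch_event *ev, uint64_t *out)
{
	int err = sketch_pmu_read(ev);

	if (err)
		return err;

	*out = ev->count;
	return 0;
}
```

The point of the int return is that a caller can tell "no valid reading"
apart from a legitimate count, which a void read cannot express.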
> >
> > David Carrillo-Cisneros (31):
> >  perf/x86/intel/cqm: temporarily remove MBM from CQM and cleanup
> >  perf/x86/intel/cqm: remove check for conflicting events
> >  perf/x86/intel/cqm: remove all code for rotation of RMIDs
> >  perf/x86/intel/cqm: make read of RMIDs per package (Temporal)
> >  perf/core: remove unused pmu->count
> >  x86/intel,cqm: add CONFIG_INTEL_RDT configuration flag and refactor
> >    PQR
> >  perf/x86/intel/cqm: separate CQM PMU's attributes from x86 PMU
> >  perf/x86/intel/cqm: prepare for next patches
> >  perf/x86/intel/cqm: add per-package RMIDs, data and locks
> >  perf/x86/intel/cqm: basic RMID hierarchy with per package rmids
> >  perf/x86/intel/cqm: (I)state and limbo prmids
> >  perf/x86/intel/cqm: add per-package RMID rotation
> >  perf/x86/intel/cqm: add polled update of RMID's llc_occupancy
> >  perf/x86/intel/cqm: add preallocation of anodes
> >  perf/core: add hooks to expose architecture specific features in
> >    perf_cgroup
> >  perf/x86/intel/cqm: add cgroup support
> >  perf/core: adding pmu::event_terminate
> >  perf/x86/intel/cqm: use pmu::event_terminate
> >  perf/core: introduce PMU event flag PERF_CGROUP_NO_RECURSION
> >  x86/intel/cqm: use PERF_CGROUP_NO_RECURSION in CQM
> >  perf/x86/intel/cqm: handle inherit event and inherit_stat flag
> >  perf/x86/intel/cqm: introduce read_subtree
> >  perf/core: introduce PERF_INACTIVE_*_READ_* flags
> >  perf/x86/intel/cqm: use PERF_INACTIVE_*_READ_* flags in CQM
> >  sched: introduce the finish_arch_pre_lock_switch() scheduler hook
> >  perf/x86/intel/cqm: integrate CQM cgroups with scheduler
> >  perf/core: add perf_event cgroup hooks for subsystem attributes
> >  perf/x86/intel/cqm: add CQM attributes to perf_event cgroup
> >  perf,perf/x86,perf/powerpc,perf/arm,perf/*: add int error return to
> >    pmu::read
> >  perf,perf/x86: add hook perf_event_arch_exec
> >  perf/stat: revamp error handling for snapshot and per_pkg events
> >
> > Stephane Eranian (1):
> >  perf/stat: fix bug in handling events in error state
> >
> > arch/alpha/kernel/perf_event.c           |    3 +-
> > arch/arc/kernel/perf_event.c             |    3 +-
> > arch/arm64/include/asm/hw_breakpoint.h   |    2 +-
> > arch/arm64/kernel/hw_breakpoint.c        |    3 +-
> > arch/metag/kernel/perf/perf_event.c      |    5 +-
> > arch/mips/kernel/perf_event_mipsxx.c     |    3 +-
> > arch/powerpc/include/asm/hw_breakpoint.h |    2 +-
> > arch/powerpc/kernel/hw_breakpoint.c      |    3 +-
> > arch/powerpc/perf/core-book3s.c          |   11 +-
> > arch/powerpc/perf/core-fsl-emb.c         |    5 +-
> > arch/powerpc/perf/hv-24x7.c              |    5 +-
> > arch/powerpc/perf/hv-gpci.c              |    3 +-
> > arch/s390/kernel/perf_cpum_cf.c          |    5 +-
> > arch/s390/kernel/perf_cpum_sf.c          |    3 +-
> > arch/sh/include/asm/hw_breakpoint.h      |    2 +-
> > arch/sh/kernel/hw_breakpoint.c           |    3 +-
> > arch/sparc/kernel/perf_event.c           |    2 +-
> > arch/tile/kernel/perf_event.c            |    3 +-
> > arch/x86/Kconfig                         |    6 +
> > arch/x86/events/amd/ibs.c                |    2 +-
> > arch/x86/events/amd/iommu.c              |    5 +-
> > arch/x86/events/amd/uncore.c             |    3 +-
> > arch/x86/events/core.c                   |    3 +-
> > arch/x86/events/intel/Makefile           |    3 +-
> > arch/x86/events/intel/bts.c              |    3 +-
> > arch/x86/events/intel/cqm.c              | 3847 +++++++++++++++++++++---------
> > arch/x86/events/intel/cqm.h              |  519 ++++
> > arch/x86/events/intel/cstate.c           |    3 +-
> > arch/x86/events/intel/pt.c               |    3 +-
> > arch/x86/events/intel/rapl.c             |    3 +-
> > arch/x86/events/intel/uncore.c           |    3 +-
> > arch/x86/events/intel/uncore.h           |    2 +-
> > arch/x86/events/msr.c                    |    3 +-
> > arch/x86/include/asm/hw_breakpoint.h     |    2 +-
> > arch/x86/include/asm/perf_event.h        |   41 +
> > arch/x86/include/asm/pqr_common.h        |   74 +
> > arch/x86/include/asm/processor.h         |    4 +
> > arch/x86/kernel/cpu/Makefile             |    4 +
> > arch/x86/kernel/cpu/pqr_common.c         |   43 +
> > arch/x86/kernel/hw_breakpoint.c          |    3 +-
> > arch/x86/kvm/pmu.h                       |   10 +-
> > drivers/bus/arm-cci.c                    |    3 +-
> > drivers/bus/arm-ccn.c                    |    3 +-
> > drivers/perf/arm_pmu.c                   |    3 +-
> > include/linux/perf_event.h               |   91 +-
> > kernel/events/core.c                     |  170 +-
> > kernel/sched/core.c                      |    1 +
> > kernel/sched/sched.h                     |    3 +
> > kernel/trace/bpf_trace.c                 |    5 +-
> > tools/perf/builtin-stat.c                |   43 +-
> > tools/perf/util/counts.h                 |   19 +
> > tools/perf/util/evsel.c                  |   44 +-
> > tools/perf/util/evsel.h                  |    8 +-
> > tools/perf/util/stat.c                   |   35 +-
> > 54 files changed, 3746 insertions(+), 1337 deletions(-)
> > create mode 100644 arch/x86/events/intel/cqm.h
> > create mode 100644 arch/x86/include/asm/pqr_common.h
> > create mode 100644 arch/x86/kernel/cpu/pqr_common.c
> >
> > --
> > 2.8.0.rc3.226.g39d4020
> >
> >
