Message-ID: <20160226073717.GA3884@gmail.com>
Date: Fri, 26 Feb 2016 08:37:18 +0100
From: Ingo Molnar <mingo@...nel.org>
To: Marty McFadden <mcfadden8@...l.gov>
Cc: ak@...ux.intel.com, andriy.shevchenko@...ux.intel.com,
bp@...en8.de, bp@...e.de, brgerst@...il.com,
dan.j.williams@...el.com, dyoung@...hat.com, hpa@...or.com,
linux@...izon.com, linux-kernel@...r.kernel.org, luto@...nel.org,
mingo@...hat.com, pavel@....cz, tglx@...utronix.de,
viro@...iv.linux.org.uk, x86@...nel.org, yu.c.chen@...el.com,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Arnaldo Carvalho de Melo <acme@...radead.org>,
Jiri Olsa <jolsa@...hat.com>
Subject: Re: [PATCH 0/4] MSR: MSR Whitelist and Batch Introduction
* Marty McFadden <mcfadden8@...l.gov> wrote:
>
> This patch addresses the following two problems:
> 1. The current msr module grants all-or-nothing access to MSRs,
> thus making user-level runtime performance adjustments
> problematic, particularly for power-constrained HPC systems.
>
> 2. The current msr module requires a separate system call and the
> acquisition of the preemption lock for each individual MSR access.
> This overhead degrades performance of runtime tools that would
> ideally sample multiple MSRs at high frequencies.

No, we really don't want to touch the old MSR code - it's a very opaque API with
various deep limitations.

What I'd like to see instead is to use a modern system monitoring interface - and
in fact that already happened in the last kernel release, where we added the perf
MSR access methods via:

  commit b7b7c7821d932ba18ef6c8eafc8536066b4c2ef4
  Author: Andy Lutomirski <luto@...nel.org>
  Date:   Mon Jul 20 11:49:06 2015 -0400

      perf/x86: Add an MSR PMU driver

      This patch adds an MSR PMU to support free running MSR counters. Such
      as time and freq related counters includes TSC, IA32_APERF, IA32_MPERF
      and IA32_PPERF, but also SMI_COUNT.

      The events are exposed in sysfs for use by perf stat and other tools.
      The files are under /sys/devices/msr/events/

see arch/x86/kernel/cpu/perf_event_msr.c, or arch/x86/events/msr.c in the latest perf tree:
git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git perf/core
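
To illustrate how a tool might consume those sysfs files, here is a minimal
sketch. It assumes the usual perf sysfs conventions: the PMU type id is in
/sys/devices/msr/type and each file under /sys/devices/msr/events/ holds a
string of the form "event=0x01". The function names (msr_parse_event,
read_sysfs_line, msr_lookup_event) are hypothetical helpers invented for this
example, not part of any kernel or perf tool API.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse an event description such as "event=0x01\n" into a config value.
 * Assumption: the MSR PMU event files use the "event=<number>" form.
 * Returns 0 on success, -1 if no number follows "event=".
 */
int msr_parse_event(const char *desc, unsigned long long *config)
{
    const char *p = strstr(desc, "event=");
    char *end;

    if (!p)
        return -1;
    p += strlen("event=");
    *config = strtoull(p, &end, 0);    /* base 0 accepts the 0x prefix */
    return end == p ? -1 : 0;
}

/* Read the first line of a sysfs file into buf. Returns 0 on success. */
int read_sysfs_line(const char *path, char *buf, size_t len)
{
    FILE *f = fopen(path, "r");

    if (!f)
        return -1;
    if (!fgets(buf, (int)len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return 0;
}

/*
 * Look up the PMU type id and the config value for one named MSR event,
 * e.g. "aperf". Returns 0 on success, -1 if the MSR PMU or the event is
 * not present on this system.
 */
int msr_lookup_event(const char *name, unsigned int *type,
                     unsigned long long *config)
{
    char path[256], line[64];

    if (read_sysfs_line("/sys/devices/msr/type", line, sizeof(line)))
        return -1;
    *type = (unsigned int)strtoul(line, NULL, 10);

    snprintf(path, sizeof(path), "/sys/devices/msr/events/%s", name);
    if (read_sysfs_line(path, line, sizeof(line)))
        return -1;
    return msr_parse_event(line, config);
}
```

The type and config values obtained this way are what a tool would place in
perf_event_attr.type and perf_event_attr.config before calling
perf_event_open().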

For example, with the perf ABIs, 'batch access' of a group of MSRs is easy: a group
of events can be read or sampled at once. It can be done in a system-wide,
per-task or per-task-hierarchy fashion, with cgroup management as well - it's a
modern API.

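
A rough sketch of such a grouped read, under stated assumptions: the PMU type
id is passed in by the caller (for instance read from /sys/devices/msr/type),
the config values 1 and 2 correspond to APERF and MPERF as in the enum quoted
in this mail, and the group is opened with read_format = PERF_FORMAT_GROUP
only. The names msr_group_sample, msr_open_aperf_mperf, msr_read_group and
msr_aperf_mperf_ratio are hypothetical; the ratio helper is just one
illustrative use of the two counters.

```c
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Layout returned by read() on a group leader opened with
 * read_format = PERF_FORMAT_GROUP and no other format bits set.
 */
struct msr_group_sample {
    uint64_t nr;         /* number of events in the group */
    uint64_t values[2];  /* one counter per event, in the order opened */
};

/* glibc has no wrapper for perf_event_open(2), so go through syscall(2). */
static int perf_open(struct perf_event_attr *attr, pid_t pid, int cpu,
                     int group_fd)
{
    return (int)syscall(SYS_perf_event_open, attr, pid, cpu, group_fd, 0UL);
}

/*
 * Open APERF as group leader and MPERF as a group member.
 * pid/cpu follow perf_event_open(2): pid = 0, cpu = -1 measures the calling
 * thread; pid = -1, cpu = N measures CPU N and needs extra privileges.
 * Returns the leader fd, or -1 on failure.
 */
int msr_open_aperf_mperf(uint32_t msr_type, pid_t pid, int cpu, int *member_fd)
{
    struct perf_event_attr attr;
    int leader;

    memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.type = msr_type;
    attr.config = 1;                       /* PERF_MSR_APERF (assumed) */
    attr.read_format = PERF_FORMAT_GROUP;

    leader = perf_open(&attr, pid, cpu, -1);
    if (leader < 0)
        return -1;

    attr.config = 2;                       /* PERF_MSR_MPERF (assumed) */
    *member_fd = perf_open(&attr, pid, cpu, leader);
    if (*member_fd < 0) {
        close(leader);
        return -1;
    }
    return leader;
}

/* A single read() on the leader returns both counters. 0 on success. */
int msr_read_group(int leader_fd, struct msr_group_sample *s)
{
    ssize_t n = read(leader_fd, s, sizeof(*s));

    return (n == (ssize_t)sizeof(*s) && s->nr == 2) ? 0 : -1;
}

/* Ratio of the APERF delta to the MPERF delta between two samples. */
double msr_aperf_mperf_ratio(const struct msr_group_sample *before,
                             const struct msr_group_sample *after)
{
    uint64_t da = after->values[0] - before->values[0];
    uint64_t dm = after->values[1] - before->values[1];

    return dm ? (double)da / (double)dm : 0.0;
}
```

A caller would take two samples with msr_read_group() some interval apart and
pass both to msr_aperf_mperf_ratio(). The point of the sketch is that both
counters come back from one system call, rather than one call per MSR.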
Right now the MSR PMU code is only at its first version, with only these few MSRs
exposed:

enum perf_msr_id {
        PERF_MSR_TSC                    = 0,
        PERF_MSR_APERF                  = 1,
        PERF_MSR_MPERF                  = 2,
        PERF_MSR_PPERF                  = 3,
        PERF_MSR_SMI                    = 4,

        PERF_MSR_EVENT_MAX,
};

but that can (and should) be expanded and more features can be added.
Thanks,
Ingo