lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 26 Mar 2018 08:40:42 +0200
From:   Ingo Molnar <mingo@...nel.org>
To:     Borislav Petkov <bp@...en8.de>
Cc:     Eric Dumazet <eric.dumazet@...il.com>,
        Eric Dumazet <edumazet@...gle.com>, x86 <x86@...nel.org>,
        lkml <linux-kernel@...r.kernel.org>,
        "H. Peter Anvin" <hpa@...or.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>,
        Hugh Dickins <hughd@...gle.com>,
        Peter Zijlstra <a.p.zijlstra@...llo.nl>
Subject: Re: [PATCH v3 1/2] x86, msr: allow rdmsr_safe_on_cpu() to schedule


* Borislav Petkov <bp@...en8.de> wrote:

> On Sat, Mar 24, 2018 at 07:29:48AM -0700, Eric Dumazet wrote:
> > It is named gsysd, "Google System Tool", a daemon+cli that is run 
> > on all machines in production to provide a generic interface
> > for interacting with the system hardware.
> 
> So I'm wondering if poking at the hardware like that is a really optimal
> design. Maybe it would be cleaner if the OS would provide properly
> abstracted sysfs interfaces instead of raw MSRs.

It's not ideal to read /dev/msr.

A SysFS interface to enumerate 'system statistics' MSRs already exists via perf, 
under:

  /sys/bus/event_source/devices/msr/events/

This is for simple platform specific hardware statistics counters.

This is implemented in arch/x86/events/msr.c, with a number of MSRs already 
abstracted out:

enum perf_msr_id {
        PERF_MSR_TSC                    = 0,
        PERF_MSR_APERF                  = 1,
        PERF_MSR_MPERF                  = 2,
        PERF_MSR_PPERF                  = 3,
        PERF_MSR_SMI                    = 4,
        PERF_MSR_PTSC                   = 5,
        PERF_MSR_IRPERF                 = 6,
        PERF_MSR_THERM                  = 7,
        PERF_MSR_THERM_SNAP             = 8,
        PERF_MSR_THERM_UNIT             = 9,
        PERF_MSR_EVENT_MAX,
};

It's very easy to add new MSRs:

 9ae21dd66b97: perf/x86/msr: Add support for MSR_IA32_THERM_STATUS
 aaf248848db5: perf/x86/msr: Add AMD IRPERF (Instructions Retired) performance counter
 8a2242618477: perf/x86/msr: Add AMD PTSC (Performance Time-Stamp Counter) support

This commit added an MSR to msr-index and added MSR event support for it as well 
with proper CPU-ID dependent enumeration:

 8a2242618477: perf/x86/msr: Add AMD PTSC (Performance Time-Stamp Counter) support

 arch/x86/events/msr.c              | 8 ++++++++
 arch/x86/include/asm/cpufeatures.h | 1 +
 arch/x86/include/asm/msr-index.h   | 1 +
 3 files changed, 10 insertions(+)

More complex MSR value encodings are supported as well - see for example 
MSR_IA32_THERM_STATUS, which is encoded as:

               /* if valid, extract digital readout, other set to -1 */
               now = now & (1ULL << 31) ? (now >> 16) & 0x3f :  -1;
               local64_set(&event->count, now);

If an MSR counter is a plain integer counter then no such code has to be added.

Only those MSRs that are valid on a system are 'live', and thus tooling can 
enumerate them programmatically and has easy symbolic access to them:

  galatea:~> perf list | grep msr/

  msr/aperf/                                         [Kernel PMU event]
  msr/cpu_thermal_margin/                            [Kernel PMU event]
  msr/mperf/                                         [Kernel PMU event]
  msr/smi/                                           [Kernel PMU event]
  msr/tsc/                                           [Kernel PMU event]

These can be used as simple counters that are read - and it's using perf not 
/dev/msr so they are fast and scalable - but the full set of perf features is also 
available, so we can do:

 galatea:~> perf stat -a -e msr/aperf/ sleep 1

 Performance counter stats for 'system wide':

       354,442,500      msr/aperf/                                                  

       1.001918252 seconds time elapsed

This interface is vastly superior to /dev/msr: it does not have the various 
usability, scalability and security limitations of /dev/msr.

/dev/msr is basically for debugging.

If performance matters then the relevant MSR events should be added and gsysd 
should be updated to support MSR events.

Thanks,

	Ingo

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ