Message-ID: <CAE+MWFsoX2wOEiYkXsZop-K7+ehNzf5eAG-rSy7+mbqf5k0vuQ@mail.gmail.com>
Date:   Fri, 9 Mar 2018 02:29:55 -0500
From:   Will Hawkins <whh8b@...ginia.edu>
To:     Steven Rostedt <rostedt@...dmis.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: x86 performance monitor counters save/restore on context switch

Mr. Rostedt and others interested in reading on the LKML,

I hope that this is the proper venue to ask this (long-winded)
question. If it is not, I apologize for the spam and for wasting
everyone's time and bits. I am emailing to ask for clarification about
the "policy" of saving and restoring x86 performance monitor counters
(and other PMU-related registers) on context switch in the kernel.

Having plumbed through the code for scheduling, I get the sense that
code in the perf subsystem is the only code that would, if conditions
are right, save/restore performance registers on a context switch.

In my investigation, I started from the top where
prepare_task_switch() calls perf_event_task_sched_out() and where
finish_task_switch() calls perf_event_task_sched_in(). Having traced
the implementation of each of those functions to (what I think is)
their lowest levels, the kernel will only save and restore performance
monitor counters if:

1. The task, process, or task's CPU is actively monitoring performance.
That monitoring would have been initiated by a user by calling
perf_event_open() (or using a high-level library that eventually calls
that function).
2. The performance aspects being monitored are hardware counters/events.

I am sure that there are other conditions, but those are the two that
stuck out to me the most.
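
For reference, the kind of perf_event_open() call that satisfies both
conditions above might look like the following. This is only a minimal
sketch: the helper names (init_instr_attr, open_task_instr_counter,
counter_start, counter_stop_and_read) are made up for illustration, and
the choice of PERF_COUNT_HW_INSTRUCTIONS as the event is an assumption.
Passing pid = 0 and cpu = -1 asks for a counter attached to the calling
task on whatever CPU it runs on, which is the case where I would expect
the perf subsystem to handle the counter across context switches.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Describe a hardware "instructions retired" counter that starts
 * disabled and counts user-space execution only. */
void init_instr_attr(struct perf_event_attr *attr)
{
    memset(attr, 0, sizeof(*attr));
    attr->type = PERF_TYPE_HARDWARE;
    attr->size = sizeof(*attr);
    attr->config = PERF_COUNT_HW_INSTRUCTIONS;
    attr->disabled = 1;
    attr->exclude_kernel = 1;
    attr->exclude_hv = 1;
}

/* glibc provides no wrapper for perf_event_open(), so use syscall().
 * pid = 0, cpu = -1: follow the calling task on any CPU.
 * Returns a file descriptor, or -1 on failure (errno is set). */
int open_task_instr_counter(void)
{
    struct perf_event_attr attr;

    init_instr_attr(&attr);
    return (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
}

/* Zero the counter and start counting. Returns 0 on success. */
int counter_start(int fd)
{
    if (ioctl(fd, PERF_EVENT_IOC_RESET, 0) != 0)
        return -1;
    return ioctl(fd, PERF_EVENT_IOC_ENABLE, 0) != 0 ? -1 : 0;
}

/* Stop counting and fetch the 64-bit count. Returns 0 on success. */
int counter_stop_and_read(int fd, uint64_t *count)
{
    if (ioctl(fd, PERF_EVENT_IOC_DISABLE, 0) != 0)
        return -1;
    return read(fd, count, sizeof(*count)) == (ssize_t)sizeof(*count)
               ? 0 : -1;
}
```

A caller would open the counter once, wrap the code of interest between
counter_start() and counter_stop_and_read(), and close() the descriptor
afterwards. The open can fail where no PMU is exposed (for example in
some virtual machines) or where perf_event_paranoid restricts access.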

All that is a long (perhaps incorrect) preface to a very simple question:

Is it only the performance counting registers that are actively in use
(again, as told to the perf subsystem by a call to perf_event_open())
that are saved/restored on context switch?

I ask because I have written code (mostly out of curiosity and not
necessarily for production) that accesses those registers directly by
writing/reading their values through the msr kernel module. If what I
said above is correct, then I have to be wary of the fact that the
values read from those counters reflect statistics from all the
processes/threads running on the same CPU at the same time. At first
blush, this was the way I expected the performance monitoring
registers and counters to work, but I wanted to confirm and you seemed
like the right person to ask.
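
To make the direct-access approach concrete, here is roughly what
reading a counter through the msr module looks like. It is a sketch
under a few assumptions: the msr module is loaded, the process has
permission to open /dev/cpu/N/msr (normally root), and counter 0 has
already been programmed through IA32_PERFEVTSEL0. The helper names
msr_path and read_msr are hypothetical; the two register addresses are
the architectural ones from the Intel SDM.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Architectural performance-monitoring MSR addresses (Intel SDM). */
#define IA32_PMC0        0xC1u   /* general-purpose counter 0 */
#define IA32_PERFEVTSEL0 0x186u  /* event select for counter 0 */

/* Build the msr device path for a CPU. Returns 0 on success and
 * -1 if the buffer is too small. */
int msr_path(int cpu, char *buf, size_t len)
{
    int n = snprintf(buf, len, "/dev/cpu/%d/msr", cpu);

    return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Read one 64-bit MSR on the given CPU. The msr driver uses the
 * file offset as the register address. Returns 0 on success. */
int read_msr(int cpu, uint32_t reg, uint64_t *value)
{
    char path[64];
    ssize_t n;
    int fd;

    if (msr_path(cpu, path, sizeof(path)) != 0)
        return -1;
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    n = pread(fd, value, sizeof(*value), (off_t)reg);
    close(fd);
    return n == (ssize_t)sizeof(*value) ? 0 : -1;
}
```

A call such as read_msr(0, IA32_PMC0, &v) returns whatever CPU 0's
counter holds at that instant. Nothing in this path tells the perf
subsystem about the counter, so if my reading of the scheduler code is
right, the value includes events from every task that ran on CPU 0
since the counter was programmed.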

If I was wrong about asking for your help, I apologize and hope that I
didn't waste your valuable time.

Thanks for all the work that you do on the performance monitoring
systems for Linux -- they are invaluable for debugging those
hard-to-find bottlenecks that inevitably pop up when you really need
something to "just work."

Will


