Message-ID: <5CA1D4FD.9000104@intel.com>
Date:   Mon, 01 Apr 2019 17:08:13 +0800
From:   Wei Wang <wei.w.wang@...el.com>
To:     Peter Zijlstra <peterz@...radead.org>,
        Like Xu <like.xu@...ux.intel.com>
CC:     linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
        like.xu@...el.com, Andi Kleen <ak@...ux.intel.com>,
        Kan Liang <kan.liang@...ux.intel.com>,
        Ingo Molnar <mingo@...hat.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [RFC] [PATCH v2 0/5] Intel Virtual PMU Optimization

On 03/25/2019 03:19 PM, Peter Zijlstra wrote:
> On Mon, Mar 25, 2019 at 02:47:32PM +0800, Like Xu wrote:
>> On 2019/3/24 1:28, Peter Zijlstra wrote:
>>> On Sat, Mar 23, 2019 at 10:18:03PM +0800, Like Xu wrote:
>>>> === Brief description ===
>>>>
>>>> This proposal for Intel vPMU is still committed to optimize the basic
>>>> functionality by reducing the PMU virtualization overhead and not a blind
>>>> pass-through of the PMU. The proposal applies to existing models, in short,
>>>> is "host perf would hand over control to kvm after counter allocation".
>>>>
>>>> The pmc_reprogram_counter is a heavyweight and high frequency operation
>>>> which goes through the host perf software stack to create a perf event for
>>>> counter assignment, this could take millions of nanoseconds. The current
>>>> vPMU always does reprogram_counter when the guest changes the eventsel,
>>>> fixctrl, and global_ctrl msrs. This brings too much overhead to the usage
>>>> of perf inside the guest, especially the guest PMI handling and context
>>>> switching of guest threads with perf in use.
>>> I think I asked for starting with making pmc_reprogram_counter() less
>>> retarded. I'm not seeing that here.
>> Do you mean pass perf_event_attr to refactor pmc_reprogram_counter
>> via paravirt ? Please share more details.
> I mean nothing; I'm trying to understand wth you're doing.

I also find the description confusing (sorry for joining late; I was on
leave). The code also needs a lot of improvement.


Please see the basic idea here:

reprogram_counter is a heavyweight operation: it goes through the host
perf software stack to create a perf event, which could take millions of
nanoseconds. The current KVM vPMU always calls reprogram_counter when
the guest changes the eventsel, fixctrl, or global_ctrl MSRs. This adds
too much overhead to using perf inside the guest, especially for guest
PMI handling and for context switches of guest threads that use perf.

In fact, during the life cycle of a guest perf event, the guest mostly
just toggles the enable bit of eventsel or fixctrl. From KVM's point of
view, if the guest only toggles the enable bits, reprogram_counter is not
necessary, because the vPMC is still serving the same guest perf event.
So the "enable bit" can be applied directly to the hardware MSR that the
corresponding host event occupies.
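
To make the check concrete, here is a rough user-space sketch (not the
patch code). It only assumes the architectural layout of IA32_PERFEVTSELx,
where EN is bit 22; the helper name eventsel_same_event is made up for
illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* EN flag of IA32_PERFEVTSELx (bit 22). */
#define EVENTSEL_ENABLE (1ULL << 22)

/*
 * Hypothetical helper: returns true when the old and new eventsel values
 * describe the same event, i.e. they differ in nothing but the EN bit.
 * In that case the existing host perf event can keep serving the vPMC.
 */
static bool eventsel_same_event(uint64_t old_val, uint64_t new_val)
{
	uint64_t changed = old_val ^ new_val;

	return (changed & ~EVENTSEL_ENABLE) == 0;
}
```

For example, going from 0x4300c0 to 0x0300c0 only clears EN, so no
reprogram_counter would be needed; going from 0x4300c0 to 0x4300c4 changes
the event select field, so it would.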

We optimize the current vPMU to work as follows:
1) Rely on the existing host perf (perf_event_create_kernel_counter) to
create a perf event for each vPMC. This creation is only needed when the
guest writes a completely new value to eventsel or fixctrl.

2) The vPMU captures guest accesses to the eventsel and fixctrl MSRs.
If the guest only toggles the enable bit, we don't need to call
reprogram_counter, as the vPMC is serving the same guest event. KVM
just applies the enable bit directly to the hardware MSR that the
corresponding host event is scheduled on.

3) When host perf reschedules the perf counters and happens to
schedule out the vPMC's perf event, KVM will do reprogram_counter.

4) We use a lazy approach to release the vPMC's perf event. That is,
if the vPMC wasn't used during a vCPU time slice, KVM releases the
corresponding perf event by calling perf_event_release_kernel.
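
The four steps above can be modeled roughly as below. This is a toy
user-space model for discussion only, not the patch code: struct vpmc,
enum vpmc_action and both functions are hypothetical names, and the real
create/reprogram/release work (perf_event_create_kernel_counter,
reprogram_counter, perf_event_release_kernel) is reduced to returned
decisions.

```c
#include <stdbool.h>
#include <stdint.h>

#define EVENTSEL_ENABLE (1ULL << 22)	/* EN flag of IA32_PERFEVTSELx */

/* What the model decides to do for one guest write to eventsel. */
enum vpmc_action {
	VPMC_CREATE_EVENT,	/* step 1: completely new value, new perf event */
	VPMC_TOGGLE_ENABLE,	/* step 2: write the EN bit to the hardware MSR */
	VPMC_REPROGRAM,		/* step 3: host perf scheduled our event out */
};

/* Hypothetical per-vPMC state; not the kernel's struct kvm_pmc. */
struct vpmc {
	uint64_t eventsel;	/* last value the guest wrote */
	bool has_event;		/* a host perf event backs this vPMC */
	bool event_scheduled;	/* that event currently owns a hw counter */
	bool used_in_slice;	/* touched during the current vCPU slice */
};

static enum vpmc_action vpmc_write_eventsel(struct vpmc *pmc, uint64_t val)
{
	bool same_event = pmc->has_event &&
		((pmc->eventsel ^ val) & ~EVENTSEL_ENABLE) == 0;
	enum vpmc_action action;

	if (!same_event)
		action = VPMC_CREATE_EVENT;
	else if (!pmc->event_scheduled)
		action = VPMC_REPROGRAM;
	else
		action = VPMC_TOGGLE_ENABLE;

	/* Assumption: after any of the three actions the event is live. */
	pmc->eventsel = val;
	pmc->has_event = true;
	pmc->event_scheduled = true;
	pmc->used_in_slice = true;
	return action;
}

/* Step 4: called at the end of a vCPU time slice; true means "release". */
static bool vpmc_end_of_slice(struct vpmc *pmc)
{
	bool release = pmc->has_event && !pmc->used_in_slice;

	if (release) {
		pmc->has_event = false;
		pmc->event_scheduled = false;
	}
	pmc->used_in_slice = false;
	return release;
}
```

The expensive path is taken only for VPMC_CREATE_EVENT and
VPMC_REPROGRAM; the common enable/disable toggling stays on the cheap
VPMC_TOGGLE_ENABLE path.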

Regarding who updates the underlying hardware counter:
The change here is that when a perf event is used by the guest
(i.e. exclude_host=true, or marked with a new flag if necessary), perf
doesn't update the hardware counter (e.g. a counter's event_base and
config_base); instead, the hypervisor updates them.
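
As a rough illustration of that ownership split (again a toy user-space
model, not the patch code): struct toy_hw_event, its guest_owned flag and
both functions are hypothetical, and a plain variable stands in for the
control MSR.

```c
#include <stdbool.h>
#include <stdint.h>

#define EVENTSEL_ENABLE (1ULL << 22)	/* EN flag of IA32_PERFEVTSELx */

/* Toy stand-in for the per-event hardware state; all names hypothetical. */
struct toy_hw_event {
	uint64_t config;	/* eventsel value host perf would program */
	uint64_t *config_msr;	/* stand-in for the counter's control MSR */
	bool guest_owned;	/* assumed flag, e.g. derived from exclude_host */
};

/* Host perf path: skips guest-owned counters. Returns true if it wrote. */
static bool host_perf_enable(struct toy_hw_event *ev)
{
	if (ev->guest_owned)
		return false;

	*ev->config_msr = ev->config | EVENTSEL_ENABLE;
	return true;
}

/* Hypervisor path: applies the value the guest wrote to its vPMC. */
static void hypervisor_write_eventsel(struct toy_hw_event *ev,
				      uint64_t guest_val)
{
	*ev->config_msr = guest_val;
}
```

With guest_owned set, host_perf_enable leaves the MSR untouched and only
hypervisor_write_eventsel changes it.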

I hope the above makes the idea clear. Thanks!

Best,
Wei
