Message-ID: <286AC319A985734F985F78AFA26841F73E21BB5C@shsmsx102.ccr.corp.intel.com>
Date:   Fri, 6 Sep 2019 08:50:30 +0000
From:   "Wang, Wei W" <wei.w.wang@...el.com>
To:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
        "ak@...ux.intel.com" <ak@...ux.intel.com>,
        "peterz@...radead.org" <peterz@...radead.org>,
        "pbonzini@...hat.com" <pbonzini@...hat.com>
CC:     "Liang, Kan" <kan.liang@...el.com>,
        "mingo@...hat.com" <mingo@...hat.com>,
        "rkrcmar@...hat.com" <rkrcmar@...hat.com>,
        "Xu, Like" <like.xu@...el.com>,
        "jannh@...gle.com" <jannh@...gle.com>,
        "arei.gonglei@...wei.com" <arei.gonglei@...wei.com>,
        "jmattson@...gle.com" <jmattson@...gle.com>
Subject: RE: [PATCH v8 00/14] Guest LBR Enabling

A polite ping for comments on this version, thanks!

On Tuesday, August 6, 2019 3:16 PM, Wei Wang wrote:
> Last Branch Recording (LBR) is a performance monitor unit (PMU) feature on
> Intel CPUs that captures branch related info. This patch series enables this
> feature for KVM guests.
> 
> Userspace can expose this LBR feature to a guest by setting the enabling
> param in KVM_CAP_X86_GUEST_LBR (patch 3).
> 
> About the lbr emulation method:
> When the vcpu gets scheduled in, the lbr related msrs are made interceptible.
> This makes the guest's first access to an lbr related msr always vm-exit to
> kvm, so that kvm can know whether the lbr feature is used during the vcpu
> time slice. The kvm lbr msr handler does the following things:
>   - create an lbr perf event (task pinned) for the vcpu thread.
>     The perf event mainly serves 2 purposes:
>       -- follow the host perf scheduling rules to manage the vcpu's usage
>          of lbr (e.g. a cpu pinned lbr event could reclaim lbr and thus
>          stop the vcpu's use);
>       -- have the host perf context switch the lbr state when the vcpu
>          thread is switched.
>   - pass the lbr related msrs through to the guest.
>     This lets the guest's subsequent accesses to the lbr related msrs
>     proceed without vm-exit, as long as the vcpu's lbr event owns the lbr
>     feature. A cpu pinned lbr event on the host could come and take over
>     the lbr feature via IPI calls. In this case, the pass-through will be
>     cancelled (patch 13), and the guest's subsequent accesses to the lbr
>     msrs will vm-exit to kvm and be forbidden in the handler.
> 
> If the guest doesn't touch any of the lbr related msrs (likely the guest doesn't
> need to run lbr in the near future), the vcpu's lbr perf event will be freed
> (please see patch 12 commit for more details).
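
For readers following the emulation flow above, here is a minimal user-space
sketch of it as a state machine. It is not code from the series: struct
vcpu_lbr, its fields and all function names are hypothetical, and the point
at which an unused lbr event is freed (here, the next sched-in) is an
assumption; patch 12 has the real logic.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one vcpu's lbr state (not the kernel's layout). */
struct vcpu_lbr {
	bool has_event;          /* lbr perf event exists for the vcpu thread */
	bool passthrough;        /* lbr msrs are passed through to the guest */
	bool used_in_slice;      /* guest touched an lbr msr in this time slice */
	bool host_pinned_active; /* a cpu pinned host lbr event holds the lbr */
};

enum lbr_access { LBR_DIRECT, LBR_VIA_KVM, LBR_FORBIDDEN };

/*
 * Sched-in: the msrs are intercepted again. Assumption: an event that was
 * not used during the previous time slice is freed here.
 */
void vcpu_sched_in(struct vcpu_lbr *v)
{
	if (v->has_event && !v->used_in_slice)
		v->has_event = false;
	v->passthrough = false;
	v->used_in_slice = false;
}

/* A cpu pinned host lbr event takes over: the pass-through is cancelled. */
void host_pinned_reclaim(struct vcpu_lbr *v)
{
	v->host_pinned_active = true;
	v->passthrough = false;
}

/* One guest access to an lbr related msr. */
enum lbr_access guest_lbr_msr_access(struct vcpu_lbr *v)
{
	if (v->passthrough)
		return LBR_DIRECT;      /* no vm-exit */

	/* vm-exit: kvm now knows the lbr feature is used in this slice. */
	v->used_in_slice = true;
	v->has_event = true;            /* create the event if it is missing */
	if (v->host_pinned_active)
		return LBR_FORBIDDEN;   /* the vcpu event does not own the lbr */
	v->passthrough = true;
	return LBR_VIA_KVM;
}

/* Walk through the cases described in the cover letter. */
void lbr_model_demo(void)
{
	struct vcpu_lbr v = { 0 };

	vcpu_sched_in(&v);
	assert(guest_lbr_msr_access(&v) == LBR_VIA_KVM);   /* first access traps */
	assert(guest_lbr_msr_access(&v) == LBR_DIRECT);    /* then passed through */
	host_pinned_reclaim(&v);
	assert(guest_lbr_msr_access(&v) == LBR_FORBIDDEN); /* after host takeover */
}
```

The model keeps ownership as a single flag for brevity; in the series it is
the host perf scheduler that decides which event owns the lbr.
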
> 
> * Tests
> Conclusion: the profiling results on the guest are similar to those on the host.
> 
> Run: ./perf -b ./test_program
> 
> - Test on the host:
> Overhead  Command  Source Shared Object  Source Symbol    Target Symbol
>   22.35%  ftest    libc-2.23.so          [.] __random     [.] __random
>    8.20%  ftest    ftest                 [.] qux          [.] qux
>    5.88%  ftest    ftest                 [.] random@plt   [.] __random
>    5.88%  ftest    libc-2.23.so          [.] __random     [.] __random_r
>    5.79%  ftest    ftest                 [.] main         [.] random@plt
>    5.60%  ftest    ftest                 [.] main         [.] foo
>    5.24%  ftest    libc-2.23.so          [.] __random     [.] main
>    5.20%  ftest    libc-2.23.so          [.] __random_r   [.] __random
>    5.00%  ftest    ftest                 [.] foo          [.] qux
>    4.91%  ftest    ftest                 [.] main         [.] bar
>    4.83%  ftest    ftest                 [.] bar          [.] qux
>    4.57%  ftest    ftest                 [.] main         [.] main
>    4.38%  ftest    ftest                 [.] foo          [.] main
>    4.13%  ftest    ftest                 [.] qux          [.] foo
>    3.89%  ftest    ftest                 [.] qux          [.] bar
>    3.86%  ftest    ftest                 [.] bar          [.] main
> 
> - Test on the guest:
> Overhead  Command  Source Shared Object  Source Symbol    Target Symbol
>   22.36%  ftest    libc-2.23.so          [.] random       [.] random
>    8.55%  ftest    ftest                 [.] qux          [.] qux
>    5.79%  ftest    libc-2.23.so          [.] random       [.] random_r
>    5.64%  ftest    ftest                 [.] random@plt   [.] random
>    5.58%  ftest    ftest                 [.] main         [.] random@plt
>    5.55%  ftest    ftest                 [.] main         [.] foo
>    5.41%  ftest    libc-2.23.so          [.] random       [.] main
>    5.31%  ftest    libc-2.23.so          [.] random_r     [.] random
>    5.11%  ftest    ftest                 [.] foo          [.] qux
>    4.93%  ftest    ftest                 [.] main         [.] main
>    4.59%  ftest    ftest                 [.] qux          [.] bar
>    4.49%  ftest    ftest                 [.] bar          [.] main
>    4.42%  ftest    ftest                 [.] bar          [.] qux
>    4.16%  ftest    ftest                 [.] main         [.] bar
>    3.95%  ftest    ftest                 [.] qux          [.] foo
>    3.79%  ftest    ftest                 [.] foo          [.] main
> (due to the lib version difference, "random" is equivalent to "__random" above)
> 
> v7->v8 Changelog:
>   - Patch 3:
>     -- document KVM_CAP_X86_GUEST_LBR in api.txt
>     -- make the check of KVM_CAP_X86_GUEST_LBR return the size of
>        struct x86_perf_lbr_stack, to let userspace do a compatibility
>        check.
>   - Patch 7:
>     -- make the perf scheduler not assign a counter to a perf event
>        that has PERF_EV_CAP_NO_COUNTER set (rather than skipping the perf
>        scheduler). This allows the scheduler to detect lbr usage conflicts
>        via get_event_constraints, and lower priority events will finally
>        fail to use lbr.
>     -- define X86_PMC_IDX_NA as "-1", which represents a never assigned
>        counter id. There are other places that use "-1", but could be
>        updated to use the new macro in another patch series.
>   - Patch 8:
>     -- move the event->owner assignment into perf_event_alloc to have it
>        set before event_init is called. Please see this patch's commit for
>        reasons.
>   - Patch 9:
>     -- use "exclude_host" and "is_kernel_event" to decide if the lbr event
>        is used for the vcpu lbr emulation, which doesn't need a counter,
>        and remove the usage of the previous new perf_event_create API.
>     -- remove the unused attr fields.
>   - Patch 10:
>     -- set a hardware reserved bit (bit 62 of LBR_SELECT) to reg->config
>        for the vcpu lbr emulation event. This makes the config different
>        from other host lbr events, so that they don't share the lbr.
>        Please see the comments in the patch for the reasons why they
>        shouldn't share.
>   - Patch 12:
>     -- disable interrupts and check if the vcpu lbr event owns the lbr
>        feature before kvm writes to the lbr related msrs. This avoids kvm
>        updating the lbr msrs after lbr has been reclaimed by other events
>        via ipi.
>     -- remove arch v4 related support.
>   - Patch 13:
>     -- double check if the vcpu lbr event owns the lbr feature before
>        vm-entry into the guest. The lbr pass-through will be cancelled if
>        the lbr feature has been reclaimed by a cpu pinned lbr event.
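
On the Patch 3 item in the changelog above: a minimal sketch of the userspace
compatibility check, assuming the capability check returns the kernel's
sizeof(struct x86_perf_lbr_stack) as described there. The helper name and the
sample sizes are hypothetical. cap_ret stands for the return value of
ioctl(fd, KVM_CHECK_EXTENSION, KVM_CAP_X86_GUEST_LBR); the ioctl is not called
here so that the snippet builds without the patched headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * cap_ret: result of the KVM_CAP_X86_GUEST_LBR extension check
 *          (0 means unsupported, negative means the ioctl failed).
 * user_struct_size: sizeof() of the lbr stack struct userspace was built with.
 */
bool guest_lbr_compatible(int cap_ret, size_t user_struct_size)
{
	if (cap_ret <= 0)
		return false;
	/* Layouts are treated as compatible only when the sizes match. */
	return (size_t)cap_ret == user_struct_size;
}

/* Sample values only; 40 and 48 are made-up sizes. */
void guest_lbr_compat_demo(void)
{
	assert(guest_lbr_compatible(40, 40));
	assert(!guest_lbr_compatible(48, 40));
	assert(!guest_lbr_compatible(0, 40));
}
```

A size match is a coarse check: it catches added or removed fields, not
fields that were reordered without changing the size.
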
> 
> Previous:
> https://lkml.kernel.org/r/1562548999-37095-1-git-send-email-wei.w.wang@intel.com
> 
> Wei Wang (14):
>   perf/x86: fix the variable type of the lbr msrs
>   perf/x86: add a function to get the addresses of the lbr stack msrs
>   KVM/x86: KVM_CAP_X86_GUEST_LBR
>   KVM/x86: intel_pmu_lbr_enable
>   KVM/x86/vPMU: tweak kvm_pmu_get_msr
>   KVM/x86: expose MSR_IA32_PERF_CAPABILITIES to the guest
>   perf/x86: support to create a perf event without counter allocation
>   perf/core: set the event->owner before event_init
>   KVM/x86/vPMU: APIs to create/free lbr perf event for a vcpu thread
>   perf/x86/lbr: don't share lbr for the vcpu usage case
>   perf/x86: save/restore LBR_SELECT on vcpu switching
>   KVM/x86/lbr: lbr emulation
>   KVM/x86/vPMU: check the lbr feature before entering guest
>   KVM/x86: remove the common handling of the debugctl msr
> 
>  Documentation/virt/kvm/api.txt    |  26 +++
>  arch/x86/events/core.c            |  36 ++-
>  arch/x86/events/intel/core.c      |   3 +
>  arch/x86/events/intel/lbr.c       |  95 +++++++-
>  arch/x86/events/perf_event.h      |   6 +-
>  arch/x86/include/asm/kvm_host.h   |   5 +
>  arch/x86/include/asm/perf_event.h |  17 ++
>  arch/x86/kvm/cpuid.c              |   2 +-
>  arch/x86/kvm/pmu.c                |  24 +-
>  arch/x86/kvm/pmu.h                |  11 +-
>  arch/x86/kvm/pmu_amd.c            |   7 +-
>  arch/x86/kvm/vmx/pmu_intel.c      | 476 +++++++++++++++++++++++++++++++++++++-
>  arch/x86/kvm/vmx/vmx.c            |   4 +-
>  arch/x86/kvm/vmx/vmx.h            |   2 +
>  arch/x86/kvm/x86.c                |  47 ++--
>  include/linux/perf_event.h        |  18 ++
>  include/uapi/linux/kvm.h          |   1 +
>  kernel/events/core.c              |  19 +-
>  18 files changed, 738 insertions(+), 61 deletions(-)
> 
> --
> 2.7.4
