Message-ID: <CALMp9eT48THXwEG23Kb0-QExyA8qZAtkXxrxc+6+pdvtvVVN0A@mail.gmail.com>
Date:   Fri, 16 Jul 2021 14:07:08 -0700
From:   Jim Mattson <jmattson@...gle.com>
To:     "Liang, Kan" <kan.liang@...ux.intel.com>
Cc:     Zhu Lingshan <lingshan.zhu@...el.com>, peterz@...radead.org,
        pbonzini@...hat.com, bp@...en8.de, seanjc@...gle.com,
        vkuznets@...hat.com, wanpengli@...cent.com, joro@...tes.org,
        ak@...ux.intel.com, wei.w.wang@...el.com, eranian@...gle.com,
        liuxiangdong5@...wei.com, linux-kernel@...r.kernel.org,
        x86@...nel.org, kvm@...r.kernel.org, like.xu.linux@...il.com,
        boris.ostrvsky@...cle.com
Subject: Re: [PATCH V8 00/18] KVM: x86/pmu: Add *basic* support to enable
 guest PEBS via DS

On Fri, Jul 16, 2021 at 12:00 PM Liang, Kan <kan.liang@...ux.intel.com> wrote:
>
>
>
> On 7/16/2021 1:02 PM, Jim Mattson wrote:
> > On Fri, Jul 16, 2021 at 1:54 AM Zhu Lingshan <lingshan.zhu@...el.com> wrote:
> >>
> >> The guest Precise Event Based Sampling (PEBS) feature can provide an
> >> architectural state of the instruction executed after the guest instruction
> >> that exactly caused the event. It needs a new hardware facility available only
> >> on Intel Ice Lake Server platforms. This patch set enables the basic PEBS
> >> feature for KVM guests on ICX.
> >>
> >> We can use the PEBS feature in a Linux guest just as on native:
> >>
> >>     # echo 0 > /proc/sys/kernel/watchdog (on the host)
> >>     # perf record -e instructions:ppp ./br_instr a
> >>     # perf record -c 100000 -e instructions:pp ./br_instr a
> >>
> >> To emulate guest PEBS facility for the above perf usages,
> >> we need to implement 2 code paths:
> >>
> >> 1) Fast path
> >>
> >> This is when the host assigned physical PMC has an identical index as the
> >> virtual PMC (e.g. using physical PMC0 to emulate virtual PMC0).
> >> This path is used in most common use cases.
> >>
> >> 2) Slow path
> >>
> >> This is when the host assigned physical PMC has a different index from the
> >> virtual PMC (e.g. using physical PMC1 to emulate virtual PMC0). In this case,
> >> KVM needs to rewrite the PEBS records to change the applicable counter indexes
> >> to the virtual PMC indexes, which would otherwise contain the physical counter
> >> index written by PEBS facility, and switch the counter reset values to the
> >> offset corresponding to the physical counter indexes in the DS data structure.
> >>
> >> The previous version [0] enables both the fast path and the slow path, which
> >> seems a bit too complex as a first step. In this patchset, we want to start with
> >> the fast path to get the basic guest PEBS enabled while keeping the slow path
> >> disabled. More focused discussion on the slow path [1] is planned for a
> >> separate patchset as the next step.
> >>
> >> Compared to later versions in subsequent steps, this patch set is missing
> >> two pieces of functionality: support for host and guest PEBS enabled at the
> >> same time, and emulation of guest PEBS when the counter is cross-mapped
> >> (neither of these is a typical scenario).
> >
> > I'm not sure exactly what scenarios you're ruling out here. In our
> > environment, we always have to be able to support host-level
> > profiling, whether or not the guest is using the PMU (for PEBS or
> > anything else). Hence, for our *basic* vPMU offering, we only expose
> > two general purpose counters to the guest, so that we can keep two
> > general purpose counters for the host. In this scenario, I would
> > expect cross-mapped counters to be common. Are we going to be able to
> > use this implementation?
> >
>
> Let's say we have 4 GP counters in HW.
> Do you mean that the host owns 2 GP counters (counter 0 & 1) and the
> guest owns the other 2 GP counters (counter 2 & 3) in your environment?
> We did a similar implementation in V1, but the proposal was rejected.
> https://lore.kernel.org/kvm/20200306135317.GD12561@hirez.programming.kicks-ass.net/

It's the other way around. AFAIK, there is no architectural way to
specify that only counters 2 and 3 are available, so we have to give
the guest counters 0 and 1.

> For the current proposal, both guest and host can see all 4 GP counters.
> The counters are shared.

I don't understand how that can work. If the host programs two
counters, how can you give the guest four counters?

> The guest cannot know the availability of the counters. It may require
> a counter (e.g., counter 0) which may already be in use by the host. The host
> may provide another counter (e.g., counter 1) to the guest. This is the
> case described in the slow path. For this case, we have to modify the
> guest PEBS record, because the counter index in the PEBS record is 1,
> while the guest perf driver expects 0.

If we reserve counters 0 and 1 for the guest, this is not a problem
(assuming we tell the guest it only has two counters). If we don't
statically partition the counters, I don't see how you can ensure that
the guest behaves as architected. For example, what do you do when the
guest programs four counters and the host programs two?
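
To make the arithmetic behind that question concrete, here is a minimal
sketch of the counting argument. It is purely illustrative: the names
NUM_HW_GP_COUNTERS, fits and shortfall are hypothetical and do not
correspond to anything in KVM or perf, and the model ignores event
constraints and multiplexing entirely.

```python
# Illustrative counting argument only; not KVM or perf code.
# Assumption: 4 general purpose counters, as in the example in this thread.
NUM_HW_GP_COUNTERS = 4


def fits(guest_in_use, host_in_use, hw=NUM_HW_GP_COUNTERS):
    """Return True if every programmed counter can get its own physical counter."""
    return guest_in_use + host_in_use <= hw


def shortfall(guest_in_use, host_in_use, hw=NUM_HW_GP_COUNTERS):
    """Return how many programmed counters cannot be backed by hardware at once."""
    return max(0, guest_in_use + host_in_use - hw)
```

With a static 2/2 split, fits(2, 2) holds. With all four counters exposed
to the guest and two programmed by the host, fits(4, 2) does not hold and
shortfall(4, 2) is 2, which is the case I'm asking about.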

> If counter 0 is available, guests can use counter 0. That's the fast
> path. I think the fast path should be more common even when both host and
> guest are profiling, because, except for some specific events, we may
> move the host event to counters that are not required by the guest if
> we have enough resources.

And if you don't have enough resources?
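
For reference, when the requested counter is not free we are in the
cross-mapped case from the cover letter, where the counter index in the
PEBS record has to be rewritten. A rough sketch of that index rewrite
follows. It assumes, as a simplification, that the applicable counters are
carried as a plain bitmask with one bit per GP counter;
remap_applicable_counters, is_cross_mapped and the phys_to_virt mapping are
hypothetical names for illustration, not the patch set's code.

```python
# Illustrative sketch of the slow-path index rewrite; not the KVM implementation.
# Assumption: "applicable counters" is a bitmask, bit N set meaning counter N.


def is_cross_mapped(phys_to_virt):
    """True if any physical counter backs a virtual counter with a different index."""
    return any(phys != virt for phys, virt in phys_to_virt.items())


def remap_applicable_counters(phys_mask, phys_to_virt):
    """Translate a bitmask of physical counter indexes into virtual indexes."""
    virt_mask = 0
    idx = 0
    while phys_mask >> idx:
        if (phys_mask >> idx) & 1:
            # A set bit with no mapping is not a guest counter; refuse to guess.
            if idx not in phys_to_virt:
                raise KeyError("no virtual counter for physical counter %d" % idx)
            virt_mask |= 1 << phys_to_virt[idx]
        idx += 1
    return virt_mask
```

With the mapping {1: 0} (physical PMC1 emulating virtual PMC0), a record
carrying mask 0b10 is rewritten to 0b01, which is what the guest perf
driver expects. With the identity mapping {0: 0} nothing changes, which is
the fast path.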
