lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 9 Apr 2021 09:59:26 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     "Xu, Like" <like.xu@...el.com>
Cc:     Sean Christopherson <seanjc@...gle.com>,
        Paolo Bonzini <pbonzini@...hat.com>, eranian@...gle.com,
        andi@...stfloor.org, kan.liang@...ux.intel.com,
        wei.w.wang@...el.com, Wanpeng Li <wanpengli@...cent.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Jim Mattson <jmattson@...gle.com>,
        Joerg Roedel <joro@...tes.org>, kvm@...r.kernel.org,
        x86@...nel.org, linux-kernel@...r.kernel.org,
        Andi Kleen <ak@...ux.intel.com>,
        Like Xu <like.xu@...ux.intel.com>
Subject: Re: [PATCH v4 08/16] KVM: x86/pmu: Add IA32_DS_AREA MSR emulation to
 manage guest DS buffer

On Fri, Apr 09, 2021 at 03:07:38PM +0800, Xu, Like wrote:
> Hi Peter,
> 
> On 2021/4/8 15:52, Peter Zijlstra wrote:
> > > This is because in the early part of this function, we have operations:
> > > 
> > >      if (x86_pmu.flags & PMU_FL_PEBS_ALL)
> > >          arr[0].guest &= ~cpuc->pebs_enabled;
> > >      else
> > >          arr[0].guest &= ~(cpuc->pebs_enabled & PEBS_COUNTER_MASK);
> > > 
> > > and if guest has PEBS_ENABLED, we need these bits back for PEBS counters:
> > > 
> > >      arr[0].guest |= arr[1].guest;
> 
> > I don't think that's right, who's to say they were set in the first
> > place? The guest's GLOBAL_CTRL could have had the bits cleared at VMEXIT
> > time. You can't unconditionally add PEBS_ENABLED into GLOBAL_CTRL,
> > that's wrong.

> I can't keep up with you on this comment and would you explain more ?

Well, it could be I'm terminally confused on how virt works (I usually
am, it just doesn't make any sense ever).

On top of that this code doesn't have any comments to help.

So perf_guest_switch_msr has two msr values: guest and host.

In my naive understanding guest is the msr value the guest sees and host
is the value the host has. If it is not that, then the naming is just
misleading at best.

But thinking more about it, if these are fully emulated MSRs (which I
think they are), then there might actually be 3 different values, not 2.

We have the value the guest sees when it uses {RD,WR}MSR.
We have the value the hardware has when it runs a guest.
We have the value the hardware has when it doesn't run a guest.

And somehow this code does something, but I can't for the life of me
figure out what and how.

> To address your previous comments, does the code below look good to you?
> 
> static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr, void *data)
> {
>     struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
>     struct perf_guest_switch_msr *arr = cpuc->guest_switch_msrs;
>     struct debug_store *ds = __this_cpu_read(cpu_hw_events.ds);
>     struct kvm_pmu *pmu = (struct kvm_pmu *)data;
>     u64 pebs_mask = (x86_pmu.flags & PMU_FL_PEBS_ALL) ?
>             cpuc->pebs_enabled : (cpuc->pebs_enabled & PEBS_COUNTER_MASK);
>     int i = 0;
> 
>     arr[i].msr = MSR_CORE_PERF_GLOBAL_CTRL;
>     arr[i].host = x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_guest_mask;
>     arr[i].guest = x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_host_mask;
>     arr[i].guest &= ~pebs_mask;
> 
>     if (!x86_pmu.pebs)
>         goto out;
> 
>     /*
>      * If PMU counter has PEBS enabled it is not enough to
>      * disable counter on a guest entry since PEBS memory
>      * write can overshoot guest entry and corrupt guest
>      * memory. Disabling PEBS solves the problem.
>      *
>      * Don't do this if the CPU already enforces it.
>      */
>     if (x86_pmu.pebs_no_isolation) {
>         i++;
>         arr[i].msr = MSR_IA32_PEBS_ENABLE;
>         arr[i].host = cpuc->pebs_enabled;
>         arr[i].guest = 0;
>         goto out;
>     }
> 
>     if (!pmu || !x86_pmu.pebs_vmx)
>         goto out;
> 
>     i++;
>     arr[i].msr = MSR_IA32_DS_AREA;
>     arr[i].host = (unsigned long)ds;
>     arr[i].guest = pmu->ds_area;
> 
>     if (x86_pmu.intel_cap.pebs_baseline) {
>         i++;
>         arr[i].msr = MSR_PEBS_DATA_CFG;
>         arr[i].host = cpuc->pebs_data_cfg;
>         arr[i].guest = pmu->pebs_data_cfg;
>     }
> 
>     i++;
>     arr[i].msr = MSR_IA32_PEBS_ENABLE;
>     arr[i].host = cpuc->pebs_enabled & ~cpuc->intel_ctrl_guest_mask;
>     arr[i].guest = pebs_mask & ~cpuc->intel_ctrl_host_mask;
> 
>     if (arr[i].host) {
>         /* Disable guest PEBS if host PEBS is enabled. */
>         arr[i].guest = 0;
>     } else {
>         /* Disable guest PEBS for cross-mapped PEBS counters. */
>         arr[i].guest &= ~pmu->host_cross_mapped_mask;
>         arr[0].guest |= arr[i].guest;
>     }
> 
> out:
>     *nr = ++i;
>     return arr;
> }

The ++ is in a weird location, if you place it after filling out an
entry it makes more sense I think. Something like:

	arr[i].msr = MSR_CORE_PERF_GLOBAL_CTRL;
	arr[i].host = x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_guest_mask;
	arr[i].guest = x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_host_mask;
	arr[i].guest &= ~pebs_mask;
	i++;

or, perhaps even like:

	arr[i++] = (struct perf_guest_switch_msr){
		.msr = MSR_CORE_PERF_GLOBAL_CTRL,
		.host = x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_guest_mask,
		.guest = x86_pmu.intel_ctrl & (~cpuc->intel_ctrl_host_mask | ~pebs_mask),
	};

But it doesn't address the fundamental confusion I seem to be having,
what actual msr value is what.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ