Message-ID: <gsnt4iww3406.fsf@coltonlewis-kvm.c.googlers.com>
Date: Tue, 03 Jun 2025 21:32:41 +0000
From: Colton Lewis <coltonlewis@...gle.com>
To: Oliver Upton <oliver.upton@...ux.dev>
Cc: kvm@...r.kernel.org, pbonzini@...hat.com, corbet@....net,
linux@...linux.org.uk, catalin.marinas@....com, will@...nel.org,
maz@...nel.org, joey.gouly@....com, suzuki.poulose@....com,
yuzenghui@...wei.com, mark.rutland@....com, shuah@...nel.org,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org, kvmarm@...ts.linux.dev,
linux-perf-users@...r.kernel.org, linux-kselftest@...r.kernel.org
Subject: Re: [PATCH 06/17] KVM: arm64: Introduce method to partition the PMU
Oliver Upton <oliver.upton@...ux.dev> writes:
> On Mon, Jun 02, 2025 at 07:26:51PM +0000, Colton Lewis wrote:
>> static void kvm_arm_setup_mdcr_el2(struct kvm_vcpu *vcpu)
>> {
>> + u8 hpmn = vcpu->kvm->arch.arm_pmu->hpmn;
>> +
>> preempt_disable();
>> /*
>> * This also clears MDCR_EL2_E2PB_MASK and MDCR_EL2_E2TB_MASK
>> * to disable guest access to the profiling and trace buffers
>> */
>> - vcpu->arch.mdcr_el2 = FIELD_PREP(MDCR_EL2_HPMN,
>> - *host_data_ptr(nr_event_counters));
>> - vcpu->arch.mdcr_el2 |= (MDCR_EL2_TPM |
>> + vcpu->arch.mdcr_el2 = FIELD_PREP(MDCR_EL2_HPMN, hpmn);
>> + vcpu->arch.mdcr_el2 |= (MDCR_EL2_HPMD |
>> + MDCR_EL2_TPM |
> This isn't safe, as there's no guarantee that kvm_arch::arm_pmu is
> pointing at the PMU for this CPU. KVM needs to derive HPMN from some
> per-CPU state, not anything tied to the VM/vCPU.
I'm confused. Isn't this function preparing to run the vCPU on this
CPU? Why would it be pointing at a different PMU?
And HPMN is something that we only want set when running a vCPU, so
there isn't any per-CPU state saying it should be anything but the
default value (number of counters) outside that context.
Unless you just mean I should check the number of counters again and
make sure HPMN is not an invalid value.
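One way to read that last suggestion, as a minimal userspace sketch
rather than kernel code: clamp the partition's HPMN against the counter
count of the CPU that is about to run the vCPU. clamp_hpmn() and
MAX_EVENT_COUNTERS are hypothetical names used only for illustration;
in the kernel the per-CPU count would presumably come from something
like *host_data_ptr(nr_event_counters), which the existing code already
reads.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: PMUv3 implements at most 31 event counters. */
#define MAX_EVENT_COUNTERS 31

/*
 * Hypothetical helper, not part of the patch: choose the HPMN value to
 * program for a vCPU, given the HPMN recorded at partition time and the
 * number of event counters on the CPU that will run it.
 */
static uint8_t clamp_hpmn(uint8_t requested_hpmn, uint8_t cpu_nr_counters)
{
	assert(cpu_nr_counters <= MAX_EVENT_COUNTERS);

	/*
	 * HPMN equal to the number of counters is the default
	 * (unpartitioned) value, so fall back to it whenever the
	 * requested split does not fit on this CPU.
	 */
	if (requested_hpmn > cpu_nr_counters)
		return cpu_nr_counters;

	return requested_hpmn;
}
```

With this, clamp_hpmn(4, 6) keeps the requested split of 4, while
clamp_hpmn(4, 2) falls back to 2, i.e. the unpartitioned default for a
CPU with only two counters.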
>> +/**
>> + * kvm_pmu_partition() - Partition the PMU
>> + * @pmu: Pointer to pmu being partitioned
>> + * @host_counters: Number of host counters to reserve
>> + *
>> + * Partition the given PMU by taking a number of host counters to
>> + * reserve and, if it is a valid reservation, recording the
>> + * corresponding HPMN value in the hpmn field of the PMU and clearing
>> + * the guest-reserved counters from the counter mask.
>> + *
>> + * Passing 0 for @host_counters has the effect of disabling partitioning.
>> + *
>> + * Return: 0 on success, -ERROR otherwise
>> + */
>> +int kvm_pmu_partition(struct arm_pmu *pmu, u8 host_counters)
>> +{
>> + u8 nr_counters;
>> + u8 hpmn;
>> +
>> + if (!kvm_pmu_reservation_is_valid(host_counters))
>> + return -EINVAL;
>> +
>> + nr_counters = *host_data_ptr(nr_event_counters);
>> + hpmn = kvm_pmu_hpmn(host_counters);
>> +
>> + if (hpmn < nr_counters) {
>> + pmu->hpmn = hpmn;
>> + /* Inform host driver of available counters */
>> + bitmap_clear(pmu->cntr_mask, 0, hpmn);
>> + bitmap_set(pmu->cntr_mask, hpmn, nr_counters);
>> + clear_bit(ARMV8_PMU_CYCLE_IDX, pmu->cntr_mask);
>> + if (pmuv3_has_icntr())
>> + clear_bit(ARMV8_PMU_INSTR_IDX, pmu->cntr_mask);
>> +
>> + kvm_debug("Partitioned PMU with HPMN %u", hpmn);
>> + } else {
>> + pmu->hpmn = nr_counters;
>> + bitmap_set(pmu->cntr_mask, 0, nr_counters);
>> + set_bit(ARMV8_PMU_CYCLE_IDX, pmu->cntr_mask);
>> + if (pmuv3_has_icntr())
>> + set_bit(ARMV8_PMU_INSTR_IDX, pmu->cntr_mask);
>> +
>> + kvm_debug("Unpartitioned PMU");
>> + }
>> +
>> + return 0;
>> +}
> Hmm... Just in terms of code organization I'm not sure I like having KVM
> twiddling with *host* support for PMUv3. Feels like the ARM PMU driver
> should own partitioning and KVM just takes what it can get.
Okay. I can move the code.
>> @@ -239,6 +245,13 @@ void kvm_host_pmu_init(struct arm_pmu *pmu)
>> if (!pmuv3_implemented(kvm_arm_pmu_get_pmuver_limit()))
>> return;
>> + if (reserved_host_counters) {
>> + if (kvm_pmu_partition_supported())
>> + WARN_ON(kvm_pmu_partition(pmu, reserved_host_counters));
>> + else
>> + kvm_err("PMU Partition is not supported");
>> + }
>> +
> Hasn't the ARM PMU been registered with perf at this point? Surely the
> driver wouldn't be very pleased with us ripping counters out from under
> its feet.
AFAICT nothing in perf registration cares about the number of counters
the PMU has. The PMUv3 driver tracks its own available counters through
cntr_mask and I modify that during partition.
Since this is still initialization of the PMU, I don't believe anything
has had a chance yet to use a counter that would be ripped away.
Aesthetically, it makes sense to change this if I move the partitioning
code to the PMUv3 driver, but I think it makes no functional
difference.
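For reference, a minimal userspace model of the host's view of cntr_mask
after partitioning, under the assumption that the guest owns event
counters [0, hpmn) and the host keeps [hpmn, nr_counters). This is a
sketch, not the driver code: range_mask() and host_cntr_mask() are
hypothetical names, a plain 64-bit word stands in for the kernel bitmap,
and the fixed cycle and instruction counters are left out.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for bitmap_set() on a single 64-bit word:
 * returns a mask with bits [start, start + nbits) set. Note that the
 * second argument is a count, not an end index.
 */
static uint64_t range_mask(unsigned int start, unsigned int nbits)
{
	uint64_t m;

	if (nbits == 0)
		return 0;
	assert(start + nbits <= 64);

	m = (nbits == 64) ? ~0ULL : ((1ULL << nbits) - 1);
	return m << start;
}

/*
 * Hypothetical model of the event counters left to the host after a
 * partition at hpmn on a PMU with nr_counters event counters.
 */
static uint64_t host_cntr_mask(unsigned int hpmn, unsigned int nr_counters)
{
	assert(hpmn <= nr_counters && nr_counters <= 31);

	/* The host range has nr_counters - hpmn entries, starting at hpmn. */
	return range_mask(hpmn, nr_counters - hpmn);
}
```

For example, host_cntr_mask(4, 6) yields 0x30 (counters 4 and 5), and
host_cntr_mask(6, 6) yields 0, leaving the host no event counters.
Because the length argument is a count, range_mask(4, 6) would instead
cover bits 4 through 9, so any call of this shape should pass
nr_counters - hpmn rather than nr_counters.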