Message-ID: <d980dd10-e4c4-4774-b107-77b320cec9f9@linux.intel.com>
Date: Tue, 23 Apr 2024 10:44:27 +0800
From: "Mi, Dapeng" <dapeng1.mi@...ux.intel.com>
To: maobibo <maobibo@...ngson.cn>, Sean Christopherson <seanjc@...gle.com>
Cc: Mingwei Zhang <mizhang@...gle.com>,
Xiong Zhang <xiong.y.zhang@...ux.intel.com>, pbonzini@...hat.com,
peterz@...radead.org, kan.liang@...el.com, zhenyuw@...ux.intel.com,
jmattson@...gle.com, kvm@...r.kernel.org, linux-perf-users@...r.kernel.org,
linux-kernel@...r.kernel.org, zhiyuan.lv@...el.com, eranian@...gle.com,
irogers@...gle.com, samantha.alt@...el.com, like.xu.linux@...il.com,
chao.gao@...el.com
Subject: Re: [RFC PATCH 23/41] KVM: x86/pmu: Implement the save/restore of PMU
state for Intel CPU
On 4/23/2024 9:01 AM, maobibo wrote:
>
>
>> On 2024/4/23 1:01 AM, Sean Christopherson wrote:
>> On Mon, Apr 22, 2024, maobibo wrote:
>>> On 2024/4/16 6:45 AM, Sean Christopherson wrote:
>>>> On Mon, Apr 15, 2024, Mingwei Zhang wrote:
>>>>> On Mon, Apr 15, 2024 at 10:38 AM Sean Christopherson
>>>>> <seanjc@...gle.com> wrote:
>>>>>> One of my biggest complaints with the current vPMU code is that the
>>>>>> roles and responsibilities between KVM and perf are poorly defined,
>>>>>> which leads to suboptimal and hard-to-maintain code.
>>>>>>
>>>>>> Case in point, I'm pretty sure leaving guest values in PMCs _would_
>>>>>> leak guest state to userspace processes that have RDPMC permissions,
>>>>>> as the PMCs might not be dirty from perf's perspective (see
>>>>>> perf_clear_dirty_counters()).
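For reference, a simplified paraphrase of the dirty-clearing logic in
question (trimmed from arch/x86/events/core.c; the fixed-counter and
metrics handling is elided here).  The key property is that only counters
recorded in cpu_hw_events.dirty get zeroed, and a PMC written directly by
a guest is never recorded there:

/*
 * Simplified paraphrase of perf_clear_dirty_counters(), details
 * trimmed.  Counters a guest wrote directly never land in
 * cpuc->dirty, so they survive into a host RDPMC reader.
 */
void perf_clear_dirty_counters(void)
{
        struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
        int i;

        /* Assigned counters are in use; don't clear them. */
        for (i = 0; i < cpuc->n_events; i++)
                __clear_bit(cpuc->assign[i], cpuc->dirty);

        for_each_set_bit(i, cpuc->dirty, X86_PMC_IDX_MAX)
                wrmsrl(x86_pmu_event_addr(i), 0);

        bitmap_zero(cpuc->dirty, X86_PMC_IDX_MAX);
}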
>>>>>>
>>>>>> Blindly clearing PMCs in KVM "solves" that problem, but in doing so
>>>>>> makes the overall code brittle because it's not clear whether KVM
>>>>>> _needs_ to clear PMCs, or if KVM is just being paranoid.
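To make "blindly clearing" concrete, a minimal sketch; the function name
and its placement are illustrative assumptions, not taken from this patch
set:

/*
 * Illustrative only: zero every GP counter when PMU ownership
 * returns to the host, regardless of whether perf considers them
 * dirty.  Safe, but possibly redundant -- which is Sean's point.
 */
static void kvm_pmu_blind_clear_gp_counters(struct kvm_pmu *pmu)
{
        int i;

        for (i = 0; i < pmu->nr_arch_gp_counters; i++)
                wrmsrl(MSR_IA32_PERFCTR0 + i, 0);
}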
>>>>>
>>>>> So once this rolls out, perf and the vPMU are both direct clients of the PMU HW.
>>>>
>>>> I don't think this is a statement we want to make, as it opens a
>>>> discussion that we won't win.  Nor do I think it's one we *need* to
>>>> make.  KVM doesn't need to be on equal footing with perf in terms of
>>>> owning/managing PMU hardware, KVM just needs a few APIs to allow
>>>> faithfully and accurately virtualizing a guest PMU.
>>>>
>>>>> Faithful cleaning (blind cleaning) has to be the baseline
>>>>> implementation, until both clients agree to a "deal" between them.
>>>>> Currently, there is no such deal, but I believe we could have one via
>>>>> future discussion.
>>>>
>>>> What I am saying is that there needs to be a "deal" in place before
>>>> this code is merged.  It doesn't need to be anything fancy, e.g. perf
>>>> can still pave over PMCs it doesn't immediately load, as opposed to
>>>> using cpu_hw_events.dirty to lazily do the clearing.  But perf and
>>>> KVM need to work together from the get-go, i.e. I don't want KVM
>>>> doing something without regard to what perf does, and vice versa.
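One shape such a deal could take, sketched under the assumption that the
paving happens eagerly at event-schedule time.  This is hypothetical,
not upstream code; the function name is made up for illustration:

/*
 * Hypothetical "eager" variant: whenever perf programs a counter,
 * it unconditionally writes the count first, paving over whatever a
 * KVM guest left behind, so no lazy cpu_hw_events.dirty bookkeeping
 * is needed for mediated-PMU guests.
 */
static void x86_pmu_pave_and_program(struct perf_event *event, int idx)
{
        /* Overwrite the counter before enabling its event select. */
        wrmsrl(x86_pmu_event_addr(idx),
               local64_read(&event->hw.prev_count));
        wrmsrl(x86_pmu_config_addr(idx),
               event->hw.config | ARCH_PERFMON_EVENTSEL_ENABLE);
}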
>>>>
>>> There is a similar issue with the LoongArch vPMU, where the VM can
>>> directly access PMU hardware and the PMU hw is shared between guest
>>> and host.  Besides the context switch, there are other places where
>>> the perf core will access PMU hw, such as the tick timer/hrtimer/IPI
>>> function call, and KVM can only intercept the context switch.
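For context, two of the paths Bibo is referring to, simplified from the
upstream perf core; both run in interrupt context, outside any switch
KVM can intercept:

/*
 * Simplified call chains (upstream kernel, details elided):
 *
 *   scheduler_tick()
 *     perf_event_task_tick()
 *       ...frequency/unthrottle adjustment touches live counters
 *
 *   perf_install_in_context()
 *     cpu_function_call(cpu, __perf_install_in_context, event)  <- IPI
 *       ...pmu->add() programs a hardware counter on the target CPU
 */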
>>
>> Two questions:
>>
>> 1) Can KVM prevent the guest from accessing the PMU?
>>
>> 2) If so, can KVM grant partial access to the PMU, or is it all or
>> nothing?
>>
>> If the answer to both questions is "yes", then it sounds like LoongArch
>> *requires* mediated/passthrough support in order to virtualize its PMU.
>
> Hi Sean,
>
> Thanks for your quick response.
>
> Yes, KVM can prevent the guest from accessing the PMU and can grant
> partial or full access to the PMU.  The only catch is that if a PMU
> event is granted to the VM, the host cannot access that PMU event
> again; there must be a PMU event switch if the host wants to use it.
A PMU event is a software entity which won't be shared.  Did you mean
that if a PMU HW counter is granted to the VM, then the host can't
access that PMU HW counter?
>
>>
>>> Can we add a callback handler in the structure kvm_guest_cbs?  Just
>>> like this:
>>>
>>> @@ -6403,6 +6403,7 @@ static struct perf_guest_info_callbacks kvm_guest_cbs = {
>>>          .state                  = kvm_guest_state,
>>>          .get_ip                 = kvm_guest_get_ip,
>>>          .handle_intel_pt_intr   = NULL,
>>> +        .lose_pmu               = kvm_guest_lose_pmu,
>>>  };
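A possible shape for the proposed callback, purely to make the suggestion
concrete.  kvm_guest_lose_pmu() does not exist upstream, and the body
below is an assumption about what it might do:

/*
 * Hypothetical: invoked by the perf core (or a PMU driver) when the
 * host needs the PMU back while it is granted to a guest.  KVM would
 * save guest PMU state and relinquish the hardware before returning
 * to the guest.
 */
static void kvm_guest_lose_pmu(void)
{
        struct kvm_vcpu *vcpu = kvm_get_running_vcpu();

        if (vcpu)
                kvm_make_request(KVM_REQ_PMU, vcpu);    /* illustrative */
}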
>>>
>>> By the way, I do not know whether the callback handler should be
>>> triggered in the perf core or in the specific PMU hw driver.  In the
>>> ARM PMU hw driver it is triggered in the driver itself, e.g. in the
>>> function kvm_vcpu_pmu_resync_el0, but I think it would be better if
>>> it were done in the perf core.
>>
>> I don't think we want to take the approach of perf and KVM guests
>> "fighting" over the PMU.  That's effectively what we have today, and
>> it's a mess for KVM because it's impossible to provide consistent,
>> deterministic behavior for the guest.  And it's just as messy for
>> perf, which ends up having weird, cumbersome flows that exist purely
>> to try to play nice with KVM.
> With the existing PMU core code, the PMU hw may be accessed by the host
> from a tick timer interrupt or an IPI function call interrupt while the
> VM is running and the PMU is already granted to the guest.  KVM cannot
> intercept the host IPI/timer interrupt, there is no PMU context switch,
> and there will be a problem.
>
> Regards
> Bibo Mao
>