Message-ID: <d980dd10-e4c4-4774-b107-77b320cec9f9@linux.intel.com>
Date: Tue, 23 Apr 2024 10:44:27 +0800
From: "Mi, Dapeng" <dapeng1.mi@...ux.intel.com>
To: maobibo <maobibo@...ngson.cn>, Sean Christopherson <seanjc@...gle.com>
Cc: Mingwei Zhang <mizhang@...gle.com>,
Xiong Zhang <xiong.y.zhang@...ux.intel.com>, pbonzini@...hat.com,
peterz@...radead.org, kan.liang@...el.com, zhenyuw@...ux.intel.com,
jmattson@...gle.com, kvm@...r.kernel.org, linux-perf-users@...r.kernel.org,
linux-kernel@...r.kernel.org, zhiyuan.lv@...el.com, eranian@...gle.com,
irogers@...gle.com, samantha.alt@...el.com, like.xu.linux@...il.com,
chao.gao@...el.com
Subject: Re: [RFC PATCH 23/41] KVM: x86/pmu: Implement the save/restore of PMU
state for Intel CPU
On 4/23/2024 9:01 AM, maobibo wrote:
>
>
>> On 2024/4/23 1:01 AM, Sean Christopherson wrote:
>> On Mon, Apr 22, 2024, maobibo wrote:
>>> On 2024/4/16 6:45 AM, Sean Christopherson wrote:
>>>> On Mon, Apr 15, 2024, Mingwei Zhang wrote:
>>>>> On Mon, Apr 15, 2024 at 10:38 AM Sean Christopherson
>>>>> <seanjc@...gle.com> wrote:
>>>>>> One of my biggest complaints with the current vPMU code is that the
>>>>>> roles and responsibilities between KVM and perf are poorly defined,
>>>>>> which leads to suboptimal and hard-to-maintain code.
>>>>>>
>>>>>> Case in point, I'm pretty sure leaving guest values in PMCs _would_
>>>>>> leak guest state to userspace processes that have RDPMC permissions,
>>>>>> as the PMCs might not be dirty from perf's perspective (see
>>>>>> perf_clear_dirty_counters()).
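For reference, a simplified paraphrase of the dirty-clearing logic in
question (trimmed from arch/x86/events/core.c; the fixed-counter and
metrics handling is elided here).  The key property is that only counters
recorded in cpu_hw_events.dirty get zeroed, and a PMC written directly by
a guest is never recorded there:

/*
 * Simplified paraphrase of perf_clear_dirty_counters(), details
 * trimmed.  Counters a guest wrote directly never land in
 * cpuc->dirty, so they survive into a host RDPMC reader.
 */
void perf_clear_dirty_counters(void)
{
        struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
        int i;

        /* Assigned counters are in use; don't clear them. */
        for (i = 0; i < cpuc->n_events; i++)
                __clear_bit(cpuc->assign[i], cpuc->dirty);

        for_each_set_bit(i, cpuc->dirty, X86_PMC_IDX_MAX)
                wrmsrl(x86_pmu_event_addr(i), 0);

        bitmap_zero(cpuc->dirty, X86_PMC_IDX_MAX);
}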
>>>>>>
>>>>>> Blindly clearing PMCs in KVM "solves" that problem, but in doing so
>>>>>> makes the overall code brittle because it's not clear whether KVM
>>>>>> _needs_ to clear PMCs, or if KVM is just being paranoid.
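To make "blindly clearing" concrete, a minimal sketch; the function name
and its placement are illustrative assumptions, not taken from this patch
set:

/*
 * Illustrative only: zero every GP counter when PMU ownership
 * returns to the host, regardless of whether perf considers them
 * dirty.  Safe, but possibly redundant -- which is Sean's point.
 */
static void kvm_pmu_blind_clear_gp_counters(struct kvm_pmu *pmu)
{
        int i;

        for (i = 0; i < pmu->nr_arch_gp_counters; i++)
                wrmsrl(MSR_IA32_PERFCTR0 + i, 0);
}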
>>>>>
>>>>> So once this rolls out, perf and the vPMU are both direct clients of the PMU HW.
>>>>
>>>> I don't think this is a statement we want to make, as it opens a
>>>> discussion that we won't win.  Nor do I think it's one we *need* to
>>>> make.  KVM doesn't need to be on equal footing with perf in terms of
>>>> owning/managing PMU hardware, KVM just needs a few APIs to allow
>>>> faithfully and accurately virtualizing a guest PMU.
>>>>
>>>>> Faithful cleaning (blind cleaning) has to be the baseline
>>>>> implementation, until both clients agree to a "deal" between them.
>>>>> Currently, there is no such deal, but I believe we could have one via
>>>>> future discussion.
>>>>
>>>> What I am saying is that there needs to be a "deal" in place before
>>>> this code is merged.  It doesn't need to be anything fancy, e.g. perf
>>>> can still pave over PMCs it doesn't immediately load, as opposed to
>>>> using cpu_hw_events.dirty to lazily do the clearing.  But perf and
>>>> KVM need to work together from the get-go, i.e. I don't want KVM
>>>> doing something without regard to what perf does, and vice versa.
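One shape such a deal could take, sketched under the assumption that the
paving happens eagerly at event-schedule time.  This is hypothetical,
not upstream code; the function name is made up for illustration:

/*
 * Hypothetical "eager" variant: whenever perf programs a counter,
 * it unconditionally writes the count first, paving over whatever a
 * KVM guest left behind, so no lazy cpu_hw_events.dirty bookkeeping
 * is needed for mediated-PMU guests.
 */
static void x86_pmu_pave_and_program(struct perf_event *event, int idx)
{
        /* Overwrite the counter before enabling its event select. */
        wrmsrl(x86_pmu_event_addr(idx),
               local64_read(&event->hw.prev_count));
        wrmsrl(x86_pmu_config_addr(idx),
               event->hw.config | ARCH_PERFMON_EVENTSEL_ENABLE);
}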
>>>>
>>> There is a similar issue with the LoongArch vPMU, where the VM can
>>> directly access PMU hardware and the PMU hw is shared between guest
>>> and host.  Besides the context switch, there are other places where
>>> the perf core will access PMU hw, such as the tick timer/hrtimer/IPI
>>> function call, and KVM can only intercept the context switch.
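For context, two of the paths Bibo is referring to, simplified from the
upstream perf core; both run in interrupt context, outside any switch
KVM can intercept:

/*
 * Simplified call chains (upstream kernel, details elided):
 *
 *   scheduler_tick()
 *     perf_event_task_tick()
 *       ...frequency/unthrottle adjustment touches live counters
 *
 *   perf_install_in_context()
 *     cpu_function_call(cpu, __perf_install_in_context, event)  <- IPI
 *       ...pmu->add() programs a hardware counter on the target CPU
 */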
>>
>> Two questions:
>>
>> 1) Can KVM prevent the guest from accessing the PMU?
>>
>> 2) If so, can KVM grant partial access to the PMU, or is it all or
>> nothing?
>>
>> If the answer to both questions is "yes", then it sounds like LoongArch
>> *requires* mediated/passthrough support in order to virtualize its PMU.
>
> Hi Sean,
>
> Thanks for your quick response.
>
> Yes, KVM can prevent the guest from accessing the PMU and can grant
> partial or full access to the PMU.  The only catch is that if a PMU
> event is granted to the VM, the host cannot access that PMU event
> again; there must be a PMU event switch if the host wants to use it.
A PMU event is a software entity which won't be shared.  Did you mean
that if a PMU HW counter is granted to the VM, then the host can't
access that PMU HW counter?
>
>>
>>> Can we add a callback handler in the structure kvm_guest_cbs?  Just
>>> like this:
>>>
>>> @@ -6403,6 +6403,7 @@ static struct perf_guest_info_callbacks kvm_guest_cbs = {
>>>          .state                  = kvm_guest_state,
>>>          .get_ip                 = kvm_guest_get_ip,
>>>          .handle_intel_pt_intr   = NULL,
>>> +        .lose_pmu               = kvm_guest_lose_pmu,
>>>  };
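A possible shape for the proposed callback, purely to make the suggestion
concrete.  kvm_guest_lose_pmu() does not exist upstream, and the body
below is an assumption about what it might do:

/*
 * Hypothetical: invoked by the perf core (or a PMU driver) when the
 * host needs the PMU back while it is granted to a guest.  KVM would
 * save guest PMU state and relinquish the hardware before returning
 * to the guest.
 */
static void kvm_guest_lose_pmu(void)
{
        struct kvm_vcpu *vcpu = kvm_get_running_vcpu();

        if (vcpu)
                kvm_make_request(KVM_REQ_PMU, vcpu);    /* illustrative */
}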
>>>
>>> By the way, I do not know whether the callback handler should be
>>> triggered in the perf core or in the specific PMU hw driver.  In the
>>> ARM PMU hw driver it is triggered in the driver itself, e.g. in the
>>> function kvm_vcpu_pmu_resync_el0, but I think it would be better if
>>> it were done in the perf core.
>>
>> I don't think we want to take the approach of perf and KVM guests
>> "fighting" over the PMU.  That's effectively what we have today, and
>> it's a mess for KVM because it's impossible to provide consistent,
>> deterministic behavior for the guest.  And it's just as messy for
>> perf, which ends up having weird, cumbersome flows that exist purely
>> to try to play nice with KVM.
> With the existing PMU core code, the PMU hw may be accessed by the host
> from a tick timer interrupt or an IPI function call interrupt while the
> VM is running and the PMU is already granted to the guest.  KVM cannot
> intercept the host IPI/timer interrupt, there is no PMU context switch,
> and there will be a problem.
>
> Regards
> Bibo Mao
>