linux-kernel - Re: [PATCH v4 09/38] perf: Add switch_guest

Message-ID: <aCUnq4M33yTj_t1F@google.com>
Date: Wed, 14 May 2025 16:30:51 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Mingwei Zhang <mizhang@...gle.com>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>, 
	Arnaldo Carvalho de Melo <acme@...nel.org>, Namhyung Kim <namhyung@...nel.org>, 
	Paolo Bonzini <pbonzini@...hat.com>, Mark Rutland <mark.rutland@....com>, 
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>, 
	Ian Rogers <irogers@...gle.com>, Adrian Hunter <adrian.hunter@...el.com>, 
	Kan Liang <kan.liang@...ux.intel.com>, "H. Peter Anvin" <hpa@...or.com>, 
	linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org, 
	kvm@...r.kernel.org, linux-kselftest@...r.kernel.org, 
	Yongwei Ma <yongwei.ma@...el.com>, Xiong Zhang <xiong.y.zhang@...ux.intel.com>, 
	Dapeng Mi <dapeng1.mi@...ux.intel.com>, Jim Mattson <jmattson@...gle.com>, 
	Sandipan Das <sandipan.das@....com>, Zide Chen <zide.chen@...el.com>, 
	Eranian Stephane <eranian@...gle.com>, Shukla Manali <Manali.Shukla@....com>, 
	Nikunj Dadhania <nikunj.dadhania@....com>
Subject: Re: [PATCH v4 09/38] perf: Add switch_guest_ctx() interface

On Mon, Mar 24, 2025, Mingwei Zhang wrote:
> From: Kan Liang <kan.liang@...ux.intel.com>
> 
> When entering/exiting a guest, some contexts for a guest have to be
> switched. For examples, there is a dedicated interrupt vector for
> guests on Intel platforms.
> 
> When PMI switch into a new guest vector, guest_lvtpc value need to be
> reflected onto HW, e,g., guest clear PMI mask bit, the HW PMI mask
> bit should be cleared also, then PMI can be generated continuously
> for guest. So guest_lvtpc parameter is added into perf_guest_enter()
> and switch_guest_ctx().
> 
> Add a dedicated list to track all the pmus with the PASSTHROUGH cap, which
> may require switching the guest context. It can avoid going through the
> huge pmus list.
> 
> Suggested-by: Peter Zijlstra (Intel) <peterz@...radead.org>
> Signed-off-by: Kan Liang <kan.liang@...ux.intel.com>
> Signed-off-by: Mingwei Zhang <mizhang@...gle.com>
> ---
>  include/linux/perf_event.h | 17 +++++++++++--
>  kernel/events/core.c       | 51 +++++++++++++++++++++++++++++++++++++-
>  2 files changed, 65 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index 37187ee8e226..58c1cf6939bf 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -584,6 +584,11 @@ struct pmu {
>  	 * Check period value for PERF_EVENT_IOC_PERIOD ioctl.
>  	 */
>  	int (*check_period)		(struct perf_event *event, u64 value); /* optional */
> +
> +	/*
> +	 * Switch guest context when a guest enter/exit, e.g., interrupt vectors.
> +	 */
> +	void (*switch_guest_ctx)	(bool enter, void *data); /* optional */

IMO, putting this in "struct pmu" is unnecessarily convoluted and complex, and a
poor fit for what needs to be done.  The only usage of the hook is for the CPU to
swap the LVTPC, and the @data payload communicates exactly that.  I.e. this has
one user, and can't reasonably be extended to other users without some ugliness.

And if by some miracle there's no CPU PMU in perf, KVM's mediated PMU still needs
to swap to its PMI IRQ.  So rather than a per-PMU hook along with a list and a
spinlock, just make this an arch hook.  And if all of the mediated PMU code is
guarded by a Kconfig, then perf doesn't even need __weak stubs.
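
To illustrate the shape of that suggestion, here is a minimal user-space sketch,
not kernel code.  Every name in it is hypothetical (the Kconfig symbol, the
arch_perf_switch_guest_lvtpc() hook, the *_sketch() wrappers, and the placeholder
host LVTPC value), and hw_lvtpc is just a variable standing in for the real
register.  It only shows that a single arch hook taking the LVTPC value needs no
callback in "struct pmu", no list and no lock, and that compiling both the hook
and its callers under one config symbol leaves nothing to stub out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a Kconfig symbol guarding all mediated PMU code. */
#define CONFIG_MEDIATED_PMU_SKETCH 1

/* User-space model of the hardware LVTPC register; not a real register access. */
static uint32_t hw_lvtpc;

/* Placeholder value for the host's PMI configuration (assumption). */
#define HOST_PMI_LVTPC 0x400u

#ifdef CONFIG_MEDIATED_PMU_SKETCH
/*
 * Hypothetical arch hook: one function that swaps the LVTPC.
 * No struct pmu callback, no PMU list to walk, no spinlock.
 */
static void arch_perf_switch_guest_lvtpc(bool enter, uint32_t guest_lvtpc)
{
	hw_lvtpc = enter ? guest_lvtpc : HOST_PMI_LVTPC;
}

/* Callers live under the same guard, so no __weak stub is required. */
static void perf_guest_enter_sketch(uint32_t guest_lvtpc)
{
	arch_perf_switch_guest_lvtpc(true, guest_lvtpc);
}

static void perf_guest_exit_sketch(void)
{
	arch_perf_switch_guest_lvtpc(false, 0);
}
#endif /* CONFIG_MEDIATED_PMU_SKETCH */
```

Calling perf_guest_enter_sketch() with a guest value and then
perf_guest_exit_sketch() loads the guest value and restores the placeholder host
value in turn.  With the config symbol undefined, neither the hook nor its
callers exist, which is the point about not needing stubs.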