linux-kernel - Re: [PATCH 0/4] perf: Fix the ctx->pmu for a hybrid system


Message-ID: <3d4b9377-30b0-a945-7b11-b412dcc4c51a@linux.intel.com>
Date:   Thu, 17 Jun 2021 10:10:37 -0400
From:   "Liang, Kan" <kan.liang@...ux.intel.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     mingo@...hat.com, linux-kernel@...r.kernel.org, acme@...nel.org,
        mark.rutland@....com, ak@...ux.intel.com,
        alexander.shishkin@...ux.intel.com, namhyung@...nel.org,
        jolsa@...hat.com
Subject: Re: [PATCH 0/4] perf: Fix the ctx->pmu for a hybrid system



On 6/17/2021 7:33 AM, Peter Zijlstra wrote:
> On Thu, Jun 17, 2021 at 12:23:06PM +0200, Peter Zijlstra wrote:
>> On Wed, Jun 16, 2021 at 11:55:30AM -0700, kan.liang@...ux.intel.com wrote:
>>
>>> To fix the issue, the generic perf codes have to understand the
>>> supported CPU mask of a specific hybrid PMU. So it can update the
>>> ctx->pmu accordingly, when a task is scheduled on a CPU which has
>>> a different type of PMU from the previous CPU. The supported_cpus
>>> has to be moved to the struct pmu.
>>
>> Urghh.. I so hate this :-/
>>
>> I *did* point you to:
>>
>>    https://lore.kernel.org/lkml/20181010104559.GO5728@hirez.programming.kicks-ass.net/
>>
>> when you started this whole hybrid crud

Yes, to work around the hybrid case, I updated the PMU for the CPU context 
accordingly, but not for the task context. :( This issue was found by a 
stress test that was not ready at that time. Sorry for that.

>>, and I think that's still the
>> correct thing to do.
>>
>> Still, let me consider if there's a workable short-term cludge I hate
>> less.
> 
> How's this? We already have x86_pmu_update_cpu_context() setting the
> 'correct' pmu in the cpuctx, so we can simply fold that back into the
> task context.
> 
> For normal use this is a no-op.
> 
> Now I need to go audit all ctx->pmu usage :-(
> 
> ---
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index db4604c4c502..6a496c29ef00 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -3822,9 +3822,16 @@ static void perf_event_context_sched_in(struct perf_event_context *ctx,
>   					struct task_struct *task)
>   {
>   	struct perf_cpu_context *cpuctx;
> -	struct pmu *pmu = ctx->pmu;
> +	struct pmu *pmu;
>   
>   	cpuctx = __get_cpu_context(ctx);
> +
> +	/*
> +	 * HACK; for HETEROGENOUS the task context might have switched to a
> +	 * different PMU, don't bother gating this.
> +	 */
> +	pmu = ctx->pmu = cpuctx->ctx.pmu;
> +

I think all the perf_sw_context PMUs share the same pmu_cpu_context, so 
cpuctx->ctx.pmu should always be the first registered perf_sw_context 
PMU, which is perf_swevent. The ctx->pmu could be another software PMU.

In theory, the perf_sw_context PMUs should have a similar issue. If the 
events are from different perf_sw_context PMUs, we should 
perf_pmu_disable() all of the PMUs before scheduling them, but 
ctx->pmu only tracks the first one.

I don't have a good way to fix the perf_sw_context PMUs. I think we have 
to go through the event list and find all PMUs. But I don't think it's 
worth doing.
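
To make the "go through the event list and find all PMUs" idea concrete, 
here is a rough userspace sketch. It is not kernel code: struct model_pmu, 
struct model_event, model_collect_pmus() and model_disable_all() are 
hypothetical stand-ins I made up for illustration, and disable_count only 
mimics the effect of calling perf_pmu_disable() once per PMU.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's struct pmu and struct perf_event. */
struct model_pmu {
	const char *name;
	int disable_count;	/* how many times this PMU was "disabled" */
};

struct model_event {
	struct model_pmu *pmu;
	struct model_event *next;
};

#define MODEL_MAX_PMUS 8

/*
 * Walk the event list and record each distinct PMU once, in first-seen
 * order. Returns the number of distinct PMUs written to out[]. PMUs
 * beyond max are silently dropped in this sketch.
 */
static size_t model_collect_pmus(const struct model_event *head,
				 struct model_pmu **out, size_t max)
{
	size_t n = 0;

	assert(out != NULL);

	for (const struct model_event *e = head; e; e = e->next) {
		size_t i;

		/* Linear scan: has this PMU been recorded already? */
		for (i = 0; i < n; i++)
			if (out[i] == e->pmu)
				break;
		if (i == n && n < max)
			out[n++] = e->pmu;
	}
	return n;
}

/* "Disable" every PMU that owns at least one event on the list. */
static size_t model_disable_all(const struct model_event *head)
{
	struct model_pmu *pmus[MODEL_MAX_PMUS];
	size_t n = model_collect_pmus(head, pmus, MODEL_MAX_PMUS);

	for (size_t i = 0; i < n; i++)
		pmus[i]->disable_count++;
	return n;
}
```

With a list holding two events on one software PMU and one event on 
another, model_disable_all() touches each PMU exactly once. The extra 
walk on every schedule-in is the cost that makes me doubt it is worth it.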

Maybe we should only apply the change for the hybrid PMUs, and leave 
other PMUs as is.

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 6fee4a7..df9cce6 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3821,9 +3821,19 @@ static void perf_event_context_sched_in(struct perf_event_context *ctx,
  					struct task_struct *task)
  {
  	struct perf_cpu_context *cpuctx;
-	struct pmu *pmu = ctx->pmu;
+	struct pmu *pmu;

  	cpuctx = __get_cpu_context(ctx);
+
+	if (ctx->pmu->capabilities & PERF_PMU_CAP_HETEROGENEOUS_CPUS) {
+		/*
+		 * HACK; for HETEROGENEOUS the task context might have switched
+		 * to a different PMU, so take the PMU from the CPU context.
+		 */
+		pmu = ctx->pmu = cpuctx->ctx.pmu;
+	} else
+		pmu = ctx->pmu;
+
  	if (cpuctx->task_ctx == ctx) {
  		if (cpuctx->sched_cb_usage)
  			__perf_pmu_sched_task(cpuctx, true);
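
The intended behavior of the gated assignment can be checked in isolation 
with a small userspace model. This is only a sketch under assumptions: the 
sim_* types and sim_pick_pmu() are hypothetical, and 
SIM_CAP_HETEROGENEOUS_CPUS merely stands in for 
PERF_PMU_CAP_HETEROGENEOUS_CPUS with an arbitrary value.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for PERF_PMU_CAP_HETEROGENEOUS_CPUS; the value is arbitrary. */
#define SIM_CAP_HETEROGENEOUS_CPUS 0x1u

/* Hypothetical, heavily reduced versions of the kernel structures. */
struct sim_pmu {
	const char *name;
	unsigned int capabilities;
};

struct sim_ctx {
	struct sim_pmu *pmu;
};

struct sim_cpu_ctx {
	struct sim_ctx ctx;
};

/*
 * Mirror of the gated update: for a heterogeneous PMU the task context
 * adopts whatever PMU the CPU context currently points at; for any other
 * PMU the task context is left alone.
 */
static struct sim_pmu *sim_pick_pmu(struct sim_ctx *task_ctx,
				    const struct sim_cpu_ctx *cpuctx)
{
	assert(task_ctx && task_ctx->pmu && cpuctx);

	if (task_ctx->pmu->capabilities & SIM_CAP_HETEROGENEOUS_CPUS)
		task_ctx->pmu = cpuctx->ctx.pmu;

	return task_ctx->pmu;
}
```

In this model, a task context that last ran on one hybrid PMU and is 
scheduled in on a CPU of the other type ends up with the CPU's PMU, while 
a software task context keeps its own PMU even if the shared CPU context 
points at perf_swevent.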



Thanks,
Kan