lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 6 Apr 2023 15:10:53 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     kan.liang@...ux.intel.com
Cc:     mingo@...hat.com, linux-kernel@...r.kernel.org, ak@...ux.intel.com,
        eranian@...gle.com
Subject: Re: [PATCH 1/2] perf: Add sched_task callback during ctx reschedule

On Tue, Mar 28, 2023 at 03:27:34PM -0700, kan.liang@...ux.intel.com wrote:
> From: Kan Liang <kan.liang@...ux.intel.com>
> 
> Several similar kernel warnings can be triggered,
> 
>   [56605.607840] CPU0 PEBS record size 0, expected 32, config 0
>   cpuc->record_size=208
> 
> when the below commands are running in parallel for a while on SPR.
> 
>   while true; do perf record --no-buildid -a --intr-regs=AX -e
>   cpu/event=0xd0,umask=0x81/pp -c 10003 -o /dev/null ./triad; done &
> 
>   while true; do perf record -o /tmp/out -W -d -e
>   '{ld_blocks.store_forward:period=1000000,
>   MEM_TRANS_RETIRED.LOAD_LATENCY:u:precise=2:ldlat=4}'
>   -c 1037 ./triad; done
>   *The triad program is just the generation of loads/stores.
> 
> The warnings are triggered when an unexpected PEBS record (with a
> different config and size) is found.
> 
> In a context switch, different events may be applied to the old task and
> the new task. The sched_task callback is used to switch the PMU-specific
> data. For the PEBS, the callback flushes the DS area and parses the PEBS
> records from the old task when schedule out. The new task never sees the
> stale PEBS records.
> 
> However, the exec() doesn't have similar support. The new command may
> see the PEBS records from the old command. In the perf_event_exec(), the
> ctx_resched() is invoked to reschedule the context. It may updates the
> active events, which may change the global PEBS configuration. The PEBS
> record from the old command may have a different config and size from
> the new command. The warning is triggered.
> 
> The sched_task callback should be invoked in all the places where the
> context can be changed. Add the sched_task callback in the ctx_resched()
> as well.
> 
> Reported-by: Stephane Eranian <eranian@...gle.com>
> Signed-off-by: Kan Liang <kan.liang@...ux.intel.com>
> ---
>  kernel/events/core.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index f79fd8b87f75..0c49183656fd 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -2642,6 +2642,8 @@ static void perf_event_sched_in(struct perf_cpu_context *cpuctx,
>  		 ctx_sched_in(ctx, EVENT_FLEXIBLE);
>  }
>  
> +static void perf_ctx_sched_task_cb(struct perf_event_context *ctx, bool sched_in);
> +
>  /*
>   * We want to maintain the following priority of scheduling:
>   *  - CPU pinned (EVENT_CPU | EVENT_PINNED)
> @@ -2680,6 +2682,7 @@ static void ctx_resched(struct perf_cpu_context *cpuctx,
>  	perf_ctx_disable(&cpuctx->ctx);
>  	if (task_ctx) {
>  		perf_ctx_disable(task_ctx);
> +		perf_ctx_sched_task_cb(task_ctx, false);
>  		task_ctx_sched_out(task_ctx, event_type);
>  	}
>  
> @@ -2698,8 +2701,10 @@ static void ctx_resched(struct perf_cpu_context *cpuctx,
>  	perf_event_sched_in(cpuctx, task_ctx);
>  
>  	perf_ctx_enable(&cpuctx->ctx);
> -	if (task_ctx)
> +	if (task_ctx) {
> +		perf_ctx_sched_task_cb(task_ctx, true);
>  		perf_ctx_enable(task_ctx);
> +	}
>  }
>  

Urgh... yuck.. Also I think you missed perf_rotate_context() which has
an open coded resched.

But I think this is a very fragile approach. Why can't the x86 pmu/pebs
driver not flush when it's programming changes -- it, as no other, knows
when this happens.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ