Message-ID: <20211213203016.GB16608@worktop.programming.kicks-ass.net>
Date: Mon, 13 Dec 2021 21:30:16 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Namhyung Kim <namhyung@...nel.org>
Cc: Ingo Molnar <mingo@...nel.org>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Jiri Olsa <jolsa@...hat.com>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
LKML <linux-kernel@...r.kernel.org>,
Stephane Eranian <eranian@...gle.com>,
Andi Kleen <ak@...ux.intel.com>,
Ian Rogers <irogers@...gle.com>,
kernel test robot <lkp@...el.com>,
Marco Elver <elver@...gle.com>
Subject: Re: [PATCH v2] perf/core: Fix cgroup event list management
On Sun, Dec 12, 2021 at 10:59:36PM -0800, Namhyung Kim wrote:
> The active cgroup events are managed in the per-cpu cgrp_cpuctx_list.
> This list is accessed from current cpu and not protected by any locks.
> But from the commit ef54c1a476ae ("perf: Rework
> perf_event_exit_event()"), this assumption does not hold true anymore.
>
> In the perf_remove_from_context(), it can remove an event from the
> context without an IPI when the context is not active. I think it
"I think" just doesn't cut it. That means I have to completely reverse
engineer your patch and its assumptions. Which is more work for me :-(
> assumes task event context, but it's possible for cpu event context
> only with cgroup events can be inactive at the moment - and it might
> become active soon.
>
> If the event is enabled when it's about to be closed, it might call
> perf_cgroup_event_disable() and list_del() with the cgrp_cpuctx_list
> on a different cpu.
>
> This resulted in a crash due to an invalid list pointer access during
> the cgroup list traversal on the cpu which the event belongs to.
>
> The following program can crash my box easily..
Unless that's already public, you've just given the script kiddos ammo,
surely we don't need that.
> Let's use IPI to prevent such crashes.
Let's just not do random things and hope stuff 'works'. Either it is
correct or it is not.
> Similarly, I think perf_install_in_context() should use IPI for the
> cgroup events too.
Let's be sure, ok?
> Reported-by: kernel test robot <lkp@...el.com> # for build error
That's complete garbage, please don't do that.
> Cc: Marco Elver <elver@...gle.com>
> Signed-off-by: Namhyung Kim <namhyung@...nel.org>
> ---
> v2) simply use IPI for cgroup events
>
> kernel/events/core.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 30d94f68c5bd..9460c083acd9 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -2388,7 +2388,7 @@ static void perf_remove_from_context(struct perf_event *event, unsigned long fla
> * event_function_call() user.
> */
> raw_spin_lock_irq(&ctx->lock);
> - if (!ctx->is_active) {
> + if (!ctx->is_active && !is_cgroup_event(event)) {
> __perf_remove_from_context(event, __get_cpu_context(ctx),
> ctx, (void *)flags);
> raw_spin_unlock_irq(&ctx->lock);
> @@ -2857,11 +2857,14 @@ perf_install_in_context(struct perf_event_context *ctx,
> * perf_event_attr::disabled events will not run and can be initialized
> * without IPI. Except when this is the first event for the context, in
> * that case we need the magic of the IPI to set ctx->is_active.
> + * Similarly, cgroup events for the context also needs the IPI to
> + * manipulate the cgrp_cpuctx_list.
> *
> * The IOC_ENABLE that is sure to follow the creation of a disabled
> * event will issue the IPI and reprogram the hardware.
> */
> - if (__perf_effective_state(event) == PERF_EVENT_STATE_OFF && ctx->nr_events) {
> + if (__perf_effective_state(event) == PERF_EVENT_STATE_OFF &&
> + ctx->nr_events && !is_cgroup_event(event)) {
> raw_spin_lock_irq(&ctx->lock);
> if (ctx->task == TASK_TOMBSTONE) {
> raw_spin_unlock_irq(&ctx->lock);
>
> base-commit: 73743c3b092277febbf69b250ce8ebbca0525aa2
What's junk like that doing?