lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1f4e4bb1-ba5e-4e5f-b6e3-e7603e3d6b0e@amd.com>
Date: Mon, 10 Feb 2025 12:12:40 +0530
From: Ravi Bangoria <ravi.bangoria@....com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: mingo@...nel.org, lucas.demarchi@...el.com, linux-kernel@...r.kernel.org,
 willy@...radead.org, acme@...nel.org, namhyung@...nel.org,
 mark.rutland@....com, alexander.shishkin@...ux.intel.com, jolsa@...nel.org,
 irogers@...gle.com, adrian.hunter@...el.com, kan.liang@...ux.intel.com,
 Ravi Bangoria <ravi.bangoria@....com>
Subject: Re: [PATCH v2 24/24] perf: Make perf_pmu_unregister() useable

> @@ -13467,7 +13665,13 @@ perf_event_exit_event(struct perf_event
>  		 * Kick perf_poll() for is_event_hup();
>  		 */
>  		perf_event_wakeup(parent_event);
> -		free_event(event);
> +		/*
> +		 * pmu_detach_event() will have an extra refcount.
> +		 */
> +		if (revoke)
> +			put_event(event);
> +		else
> +			free_event(event);
>  		put_event(parent_event);
>  		return;
>  	}

There is a race between do_exit() and perf_pmu_unregister():

Assume: a Parent process with 'event1' and a Child process with an
inherited 'event2'.

                  CPU 1                                                       CPU 2

      Child process exiting ...
   1  do_exit()
   2    perf_event_exit_task()
   3      perf_event_exit_task_context()
   4        mutex_lock(&child_ctx->mutex);
   5        list_for_each_entry_safe(&child_ctx->event_list)
   6        /* picks event2. event2->refcount is 1 */
   7                                                             perf_pmu_unregister()
   8                                                               pmu_detach_events()
   9                                                                 pmu_get_event(pmu) /* Picks event2 */
  10                                                                   atomic_long_inc_not_zero(&event->refcount) /* event2 */
  11                                                                   /* event2->refcount becomes 2 */
  12                                                                 pmu_detach_event() /* event2 */
  13                                                                   perf_event_ctx_lock(); /* event2 */
  14        /* event2->refcount became 2 (by CPU 2) */                 .
  15      perf_event_exit_event() /* event2 */                         . /* CPU 1 is holding ctx->mutex of
  16        if (parent_event) { /* true */                             .    child process. Waiting ... */
  17          if (revoke) /* false */                                  .
  18          else                                                     .
  19              free_event();  /* event2 */                          .
  20                  WARN()                                           .

                      ^^^^^^

This race manifests as:

  unexpected event refcount: 2; ptr=00000000c6ca2049
  WARNING: CPU: 57 PID: 9729 at kernel/events/core.c:5439 free_event+0x48/0x60
  RIP: 0010:free_event+0x48/0x60
  Call Trace:
   ? free_event+0x48/0x60
   perf_event_exit_event.isra.0+0x60/0xf0
   perf_event_exit_task+0x1e1/0x280
   do_exit+0x327/0xb00

The race continues... now, with the stale child2->parent pointer:

                  CPU 1                                                       CPU 2

  15      perf_event_exit_event() /* event2 */                         . /* CPU 1 is holding ctx->mutex of
  16        if (parent_event) { /* true */                             .    child process. Waiting ... */
  17          if (revoke) /* false */                                  .
  18          else                                                     .
  19              free_event();  /* event2 */                          .
  20                  WARN()                                           .
  21          put_event(parent_event); /* event1 */                    .
  22          /* event1->refcount becomes 1 */                         .
  23        }                                                          .
  24        +- mutex_unlock(&child_ctx->mutex);                        . /* Acquired child's ctx->mutex */
  25                                                                   __pmu_detach_event() /* event2 */
  26                                                                     perf_event_exit_event() /* event2 */
  27                                                                       if (parent_event) { /* true, BUT FALSE POSITIVE. */
  28                                                                         if (revoke) /* true */
  29                                                                             put_event(); /* event2 */
  30                                                                             /* event2->refcount becomes 1 */
  31                                                                         put_event(parent_event); /* event1 */
  32                                                                         /* event1->refcount becomes 0 */

                                                                                ^^^^^^^^^^^^^^^^^^^^^^^^^^

This can manifest as some random bug. I guess, perf_event_exit_event()
should also check for (event->attach_state & PERF_ATTACH_CHILD) when
event->parent != NULL.

Thanks,
Ravi

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ