lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAP-5=fXAhsDkQXVCvb5xrjLVpYG-typR23_NXwgwPDjaJS0pqg@mail.gmail.com>
Date: Tue, 18 Mar 2025 09:37:02 -0700
From: Ian Rogers <irogers@...gle.com>
To: Thomas Richter <tmricht@...ux.ibm.com>
Cc: linux-kernel@...r.kernel.org, linux-s390@...r.kernel.org, 
	linux-perf-users@...r.kernel.org, acme@...nel.org, namhyung@...nel.org, 
	acme@...hat.com, agordeev@...ux.ibm.com, gor@...ux.ibm.com, 
	sumanthk@...ux.ibm.com, hca@...ux.ibm.com
Subject: Re: [PATCH 2/2] perf/bench: Double free of dynamic allocated memory

On Tue, Mar 18, 2025 at 2:52 AM Thomas Richter <tmricht@...ux.ibm.com> wrote:
>
> On s390 z/VM the command 'perf bench internals pmu-scan'
> dumps core, as can be seen:
>
> Output before:
>  # ./perf bench internals pmu-scan
>  # Running 'internals/pmu-scan' benchmark:
>  Computing performance of sysfs PMU event scan for 100 times
>  perf: /root/linux/tools/include/linux/refcount.h:131:
>         refcount_sub_and_test: Assertion `!(new > val)' failed.
>  Aborted (core dumped)
>  #
>
> The root cause is in
>
> perf_pmus__scan()
> +--> perf_pmu__create_placeholder_core_pmu()
>      +--> cpu_map__online()
>
> cpu_map__online() has a static variable
>
>     static struct perf_cpu_map *online;
>
>     if (!online)
>         online = perf_cpu_map__new_online_cpus();
>
>     return online;
>
> which is allocated once when entered for the first time.
>
> However perf_pmu__create_placeholder_core_pmu() is actually called
> two times.
> First time:
>    run_pmu_scan()
>    +--> save_result()
>         +---> perf_pmus__scan_core()
>               +--> pmu_read_sysfs()
>                    +--> perf_pmu__create_placeholder_core_pmu()
>         ...
>         +--> perf_pmus__destroy()
>
> Second time:
>     run_pmu_scan()
>     +--> perf_pmus__scan()
>          +--> pmu_read_sysfs()
>               +--> perf_pmu__create_placeholder_core_pmu()
>         ...
>         +--> perf_pmus__destroy()
>
> The second time the already allocated memory pointed to by variable
> 'online' is returned. However in between the first and second call
> of perf_pmu__create_placeholder_core_pmu()
> function save_result() also frees all PMUs:
>
> save_result()
> +--> perf_pmus__destroy()
>      +--> perf_pmu__delete()
>           +--> perf_cpu_map__put()
>                +--> cpu_map__delete()
>
> cpu_map__delete() deletes the perf_cpu_map pointed to by variable
> online, but this static variable is not set to NULL. In the second
> invocation of perf_pmu__create_placeholder_core_pmu() the same
> memory locattion stored in variable online is returned.
>
> Later on run_pmu_scan() calls perf_pmus__destroy() again and then
> cpu_map__delete() frees the PMU "cpu->cpus" a second time causing
> the core dump.
>
> Avoid core dump and always allocate the online CPUs.
>
> Output after:
>  # ./perf bench internals pmu-scan
>  # Running 'internals/pmu-scan' benchmark:
>  Computing performance of sysfs PMU event scan for 100 times
>   Average core PMU scanning took: 7.970 usec (+- 0.147 usec)
>   Average PMU scanning took: 60.415 usec (+- 3.986 usec)
>  #
>
> Background: s390 z/VM system do not support PMUs for sampling and
> counting. In this case dummy events are created by the perf tool
> and the PMUs "tool" and "fake" are created and freed.
>
> Fixes: a0c41caebab2f ("perf pmu: Add CPU map for "cpu" PMUs")
> Signed-off-by: Thomas Richter <tmricht@...ux.ibm.com>
> Cc: Ian Rogers <irogers@...gle.com>

Thanks Thomas, the proposed fix breaks address sanitizer as there are
missing puts on the freshly allocated cpu maps. I think a better fix
would be to "get" the cpu map to increment the reference count rather
than to keep re-reading the online CPUs. In either case we need the
puts on the user side.

Ian

> ---
>  tools/perf/util/cpumap.c | 7 +------
>  1 file changed, 1 insertion(+), 6 deletions(-)
>
> diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
> index 5c329ad614e9..ab9e7a266af9 100644
> --- a/tools/perf/util/cpumap.c
> +++ b/tools/perf/util/cpumap.c
> @@ -691,12 +691,7 @@ size_t cpu_map__snprint_mask(struct perf_cpu_map *map, char *buf, size_t size)
>
>  struct perf_cpu_map *cpu_map__online(void) /* thread unsafe */
>  {
> -       static struct perf_cpu_map *online;
> -
> -       if (!online)
> -               online = perf_cpu_map__new_online_cpus(); /* from /sys/devices/system/cpu/online */
> -
> -       return online;
> +       return perf_cpu_map__new_online_cpus(); /* from /sys/devices/system/cpu/online */
>  }
>
>  bool aggr_cpu_id__equal(const struct aggr_cpu_id *a, const struct aggr_cpu_id *b)
> --
> 2.48.1
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ