Message-ID: <YjCLcpcX2peeQVCH@kernel.org>
Date:   Tue, 15 Mar 2022 09:49:54 -0300
From:   Arnaldo Carvalho de Melo <acme@...nel.org>
To:     Ian Rogers <irogers@...gle.com>
Cc:     Namhyung Kim <namhyung@...nel.org>, Jiri Olsa <jolsa@...nel.org>,
        Adrian Hunter <adrian.hunter@...el.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: 'perf stat --null sleep 1' segfault related to
 evlist__for_each_entry with affinity == NULL

This bounced yesterday, resending :-\

-----

Hi Ian,

	I just noticed this:

(gdb) run stat --null sleep 1
`/var/home/acme/bin/perf' has changed; re-reading symbols.
Starting program: /var/home/acme/bin/perf stat --null sleep 1
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[Detaching after fork from child process 900002]

Program received signal SIGSEGV, Segmentation fault.
0x0000000000596790 in create_perf_stat_counter (evsel=0xda9580, config=0xcbeac0 <stat_config>, target=0xcbe8a0 <target>, cpu_map_idx=0) at util/stat.c:548
548		if (leader->core.nr_members > 1)
Missing separate debuginfos, use: dnf debuginfo-install bzip2-libs-1.0.8-6.fc34.x86_64 cyrus-sasl-lib-2.1.27-8.fc34.x86_64 elfutils-debuginfod-client-0.186-1.fc34.x86_64 elfutils-libelf-0.186-1.fc34.x86_64 elfutils-libs-0.186-1.fc34.x86_64 glib2-2.68.4-1.fc34.x86_64 glibc-2.33-21.fc34.x86_64 keyutils-libs-1.6.1-2.fc34.x86_64 libbabeltrace-1.5.8-6.fc34.x86_64 libbrotli-1.0.9-4.fc34.x86_64 libcap-2.48-2.fc34.x86_64 libcom_err-1.45.6-5.fc34.x86_64 libcurl-7.76.1-12.fc34.x86_64 libgcc-11.2.1-9.fc34.x86_64 libidn2-2.3.2-1.fc34.x86_64 libnghttp2-1.43.0-2.fc34.x86_64 libpsl-0.21.1-3.fc34.x86_64 libssh-0.9.6-1.fc34.x86_64 libstdc++-11.2.1-9.fc34.x86_64 libunistring-0.9.10-10.fc34.x86_64 libunwind-1.4.0-5.fc34.x86_64 libuuid-2.36.2-1.fc34.x86_64 libxcrypt-4.4.28-1.fc34.x86_64 libzstd-1.5.2-1.fc34.x86_64 numactl-libs-2.0.14-3.fc34.x86_64 opencsd-1.1.1-1.fc34.x86_64 openldap-2.4.57-6.fc34.x86_64 openssl-libs-1.1.1l-2.fc34.x86_64 pcre-8.44-3.fc34.1.x86_64 perl-libs-5.32.1-477.fc34.x86_64 popt-1.18-4.fc34.x86_64 python3-libs-3.9.10-1.fc34.x86_64 slang-2.3.2-9.fc34.x86_64 xz-libs-5.2.5-5.fc34.x86_64
(gdb) bt
#0  0x0000000000596790 in create_perf_stat_counter (evsel=0xda9580, config=0xcbeac0 <stat_config>, target=0xcbe8a0 <target>, cpu_map_idx=0) at util/stat.c:548
#1  0x0000000000438c2c in __run_perf_stat (argc=2, argv=0x7fffffffe1f0, run_idx=0) at builtin-stat.c:835
#2  0x00000000004395d5 in run_perf_stat (argc=2, argv=0x7fffffffe1f0, run_idx=0) at builtin-stat.c:1047
#3  0x000000000043ce17 in cmd_stat (argc=2, argv=0x7fffffffe1f0) at builtin-stat.c:2561
#4  0x00000000004dc2d1 in run_builtin (p=0xcd5760 <commands+288>, argc=4, argv=0x7fffffffe1f0) at perf.c:313
#5  0x00000000004dc538 in handle_internal_command (argc=4, argv=0x7fffffffe1f0) at perf.c:365
#6  0x00000000004dc684 in run_argv (argcp=0x7fffffffe03c, argv=0x7fffffffe030) at perf.c:409
#7  0x00000000004dca50 in main (argc=4, argv=0x7fffffffe1f0) at perf.c:539
(gdb) p leader
$1 = (struct evsel *) 0x0
(gdb) fr 1
#1  0x0000000000438c2c in __run_perf_stat (argc=2, argv=0x7fffffffe1f0, run_idx=0) at builtin-stat.c:835
835			if (create_perf_stat_counter(counter, &stat_config, &target,
(gdb) p evsel_list
$2 = (struct evlist *) 0xda9580
(gdb) p evsel_list->core.nr_entries
$3 = 0
(gdb)

I've bisected it to a patch I added that avoids setting up affinities
when there is no need to do so, e.g. when running 'perf stat sleep 1';
namely this one:

commit 49de179577e7b05b57f625bf05cdc60a72de38d0
Author: Arnaldo Carvalho de Melo <acme@...hat.com>
Date:   Mon Jan 17 13:09:29 2022 -0300

    perf stat: No need to setup affinities when starting a workload

    I.e. the simple:

      $ perf stat sleep 1

    Uses a dummy CPU map and thus there is no need to setup/cleanup
    affinities to avoid IPIs, etc.

    With this we're down to a sched_getaffinity() call, in the libnuma
    initialization, that probably can be removed in a followup patch.

    Acked-by: Ian Rogers <irogers@...gle.com>

The problem is that when using --null we end up with an empty evlist
and:

  evlist__for_each_cpu(evlist_cpu_itr, evsel_list, affinity)

in __run_perf_stat()

gets called with affinity == NULL.

Looking at its code, it checks whether affinity is NULL, etc., so I
thought it would be OK, but it ends up passing along, IIRC, the empty
list head:

(gdb) p evsel->core.node
$4 = {next = 0xda9580, prev = 0xda9580}
(gdb) p evsel
$5 = (struct evsel *) 0xda9580
(gdb) fr 1
#1  0x0000000000438c2c in __run_perf_stat (argc=2, argv=0x7fffffffe1f0, run_idx=0) at builtin-stat.c:835
835			if (create_perf_stat_counter(counter, &stat_config, &target,
(gdb) p evsel_list->core.entries
$6 = {next = 0xda9580, prev = 0xda9580}
(gdb)

I.e. its end condition gets confused when evlist->core.nr_entries is
zero.

Can you take a look at improving this? I'm going AFK now, so just wanted
to dump what I found so far.

I guess the problem got introduced in:

commit 472832d2c000b9611feaea66fe521055c3dbf17a
Author: Ian Rogers <irogers@...gle.com>
Date:   Tue Jan 4 22:13:37 2022 -0800

    perf evlist: Refactor evlist__for_each_cpu()

    Previously evlist__for_each_cpu() needed to iterate over the evlist in
    an inner loop and call "skip" routines. Refactor this so that the
    iterator is smarter and the next function can update both the current CPU
    and evsel.

    By using a cpu map index, fix apparent off-by-1 in __run_perf_stat's
    call to perf_evsel__close_cpu().

-----------------

And it was dormant because affinity was always being set up, which we
don't want when it isn't needed.

I'm trying to use:

	perf trace --summary perf stat --null sleep 1

to see what it is that perf is unconditionally doing that should be
deferred until it is really needed; for now I'll continue with:

[root@...e ~]# perf trace -s perf stat -e dummy sleep 1

 Performance counter stats for 'sleep 1':

                 0      dummy                                                       

       1.000945692 seconds time elapsed

       0.000000000 seconds user
       0.000861000 seconds sys



 Summary of events:

 perf (900261), 3127 events, 94.7%

   syscall            calls  errors  total       min       avg       max       stddev
                                     (msec)    (msec)    (msec)    (msec)        (%)
   --------------- --------  ------ -------- --------- --------- ---------     ------
   wait4                  1      0  1000.903  1000.903  1000.903  1000.903      0.00%
   openat               364    187     1.144     0.002     0.003     0.020      1.90%
   mmap                 263      0     0.796     0.001     0.003     0.015      2.60%
   clone                  1      0     0.620     0.620     0.620     0.620      0.00%
   read                 263      0     0.441     0.001     0.002     0.094     21.06%
   access                80     49     0.277     0.002     0.003     0.006      2.75%
   mprotect              80      0     0.269     0.002     0.003     0.006      3.15%
   newfstatat           174      1     0.210     0.001     0.001     0.007      3.57%
   close                183      0     0.171     0.001     0.001     0.003      1.64%
   getdents64            35      0     0.065     0.001     0.002     0.012     17.24%
   write                 18      0     0.052     0.001     0.003     0.009     15.24%
   brk                   12      0     0.034     0.001     0.003     0.005     11.27%
   ioctl                 29     29     0.029     0.001     0.001     0.003      8.38%
   futex                 20      0     0.026     0.001     0.001     0.002      8.30%
   perf_event_open        1      0     0.016     0.016     0.016     0.016      0.00%
   statfs                 3      0     0.011     0.002     0.004     0.006     38.92%
   rt_sigaction           8      0     0.009     0.001     0.001     0.002     12.76%
   pread64                6      0     0.006     0.001     0.001     0.001      1.44%
   pipe                   2      0     0.006     0.002     0.003     0.004     32.45%
   rt_sigprocmask         4      0     0.006     0.001     0.001     0.002     22.35%
   prctl                  6      4     0.005     0.001     0.001     0.001      6.86%
   munmap                 1      0     0.004     0.004     0.004     0.004      0.00%
   kill                   1      0     0.003     0.003     0.003     0.003      0.00%
   arch_prctl             2      1     0.002     0.001     0.001     0.001     11.96%
   sysinfo                1      0     0.002     0.002     0.002     0.002      0.00%
   sched_getaffinity      1      0     0.001     0.001     0.001     0.001      0.00%
   getpid                 1      0     0.001     0.001     0.001     0.001      0.00%
   fcntl                  1      0     0.001     0.001     0.001     0.001      0.00%
   set_tid_address        1      0     0.001     0.001     0.001     0.001      0.00%
   prlimit64              1      0     0.001     0.001     0.001     0.001      0.00%
   set_robust_list        1      0     0.001     0.001     0.001     0.001      0.00%
   execve                 1      0     0.000     0.000     0.000     0.000      0.00%


 sleep (900262), 107 events, 3.2%

   syscall            calls  errors  total       min       avg       max       stddev
                                     (msec)    (msec)    (msec)    (msec)        (%)
   --------------- --------  ------ -------- --------- --------- ---------     ------
   clock_nanosleep        1      0  1000.050  1000.050  1000.050  1000.050      0.00%
   execve                 8      7     0.282     0.008     0.035     0.198     66.26%
   read                   2      0     0.058     0.002     0.029     0.056     94.56%
   mmap                   9      0     0.027     0.002     0.003     0.005      9.92%
   mprotect               4      0     0.015     0.003     0.004     0.005     10.85%
   close                  8      0     0.013     0.001     0.002     0.003     13.67%
   prctl                  1      0     0.012     0.012     0.012     0.012      0.00%
   openat                 3      0     0.010     0.003     0.003     0.004      4.99%
   arch_prctl             2      1     0.006     0.001     0.003     0.005     68.42%
   brk                    3      0     0.006     0.001     0.002     0.002     21.26%
   access                 1      1     0.005     0.005     0.005     0.005      0.00%
   munmap                 1      0     0.005     0.005     0.005     0.005      0.00%
   newfstatat             3      0     0.005     0.001     0.002     0.002      8.55%
   pread64                4      0     0.005     0.001     0.001     0.001      5.39%
   rt_sigaction           1      0     0.002     0.002     0.002     0.002      0.00%
   fcntl                  1      0     0.002     0.002     0.002     0.002      0.00%
   set_robust_list        1      0     0.002     0.002     0.002     0.002      0.00%
   clone                  1      0     0.000     0.000     0.000     0.000      0.00%


[root@...e ~]#

- Arnaldo
