linux-kernel - Re: [PATCH v2 0/7] Perf stat --null/offline CPU segv related fixes/tests

Message-ID: <aTQRgAOpKyI53TEq@gmail.com>
Date: Sat, 6 Dec 2025 12:20:32 +0100
From: Ingo Molnar <mingo@...nel.org>
To: Ian Rogers <irogers@...gle.com>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
	Arnaldo Carvalho de Melo <acme@...nel.org>,
	Namhyung Kim <namhyung@...nel.org>,
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
	Jiri Olsa <jolsa@...nel.org>,
	Adrian Hunter <adrian.hunter@...el.com>,
	James Clark <james.clark@...aro.org>,
	Thomas Richter <tmricht@...ux.ibm.com>,
	linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 0/7] Perf stat --null/offline CPU segv related
 fixes/tests

* Ingo Molnar <mingo@...nel.org> wrote:

> * Ian Rogers <irogers@...gle.com> wrote:
>
> > Ingo reported [1] that `perf stat --null` was segfaulting. Fix the
> > underlying issue and add a test to the "perf stat tests". Do some
> > related fixing/cleanup in the perf util cpumap code.
> >
> > Thomas reported an issue fixed by the same patches [2] but caused by
> > giving perf stat an offline CPU. Add test coverage for that and
> > improve the "error" message that reports "success".
> >
> > Ingo further pointed at broken signal handling in repeat mode [3]. I
> > observed we weren't giving the best exit code, 0 rather than the
> > expected 128+<signal number>. Add a patch fixing this.
> >
> > [1] https://lore.kernel.org/linux-perf-users/aSwt7yzFjVJCEmVp@gmail.com/
> > [2] https://lore.kernel.org/linux-perf-users/94313b82-888b-4f42-9fb0-4585f9e90080@linux.ibm.com/
> > [3] https://lore.kernel.org/lkml/aS5wjmbAM9ka3M2g@gmail.com/
> >
> > Ian Rogers (7):
> >   perf stat: Allow no events to open if this is a "--null" run
> >   libperf cpumap: Fix perf_cpu_map__max for an empty/NULL map
> >   perf cpumap: Add "any" CPU handling to cpu_map__snprint_mask
> >   perf tests stat: Add "--null" coverage
> >   perf stat: When no events, don't report an error if there is none
> >   perf tests stat: Add test for error for an offline CPU
> >   perf stat: Improve handling of termination by signal
> >
> >  tools/lib/perf/cpumap.c        | 10 +++++----
> >  tools/perf/builtin-stat.c      | 29 ++++++++++++++++++-------
> >  tools/perf/tests/shell/stat.sh | 39 ++++++++++++++++++++++++++++++++++
> >  tools/perf/util/cpumap.c       |  9 ++++++--
> >  4 files changed, 73 insertions(+), 14 deletions(-)
>
> A belated:
>
>   Tested-by: Ingo Molnar <mingo@...nel.org>
>
> And thank you a lot for doing these QoL fixes!

There's one more perf stat QoL bug I'd like to report - I frequently
do repeated runs of perf stat --repeat and grep the output, to get
a feel for the run-to-run stability of a particular benchmark:

  starship:~/tip> while :; do perf stat --null --repeat 3 sleep 0.1 2>&1 | grep elapsed; done 
         0.1017997 +- 0.0000771 seconds time elapsed  ( +-  0.08% )
         0.1017627 +- 0.0000795 seconds time elapsed  ( +-  0.08% )
         0.1018106 +- 0.0000650 seconds time elapsed  ( +-  0.06% )
         0.1017844 +- 0.0000601 seconds time elapsed  ( +-  0.06% )
          0.101883 +- 0.000169 seconds time elapsed  ( +-  0.17% ) <====
         0.1017757 +- 0.0000532 seconds time elapsed  ( +-  0.05% )
         0.1017991 +- 0.0000720 seconds time elapsed  ( +-  0.07% )
         0.1018024 +- 0.0000704 seconds time elapsed  ( +-  0.07% )
         0.1018074 +- 0.0000946 seconds time elapsed  ( +-  0.09% )
         0.1019797 +- 0.0000524 seconds time elapsed  ( +-  0.05% )
         0.1018407 +- 0.0000658 seconds time elapsed  ( +-  0.06% )
         0.1017907 +- 0.0000605 seconds time elapsed  ( +-  0.06% )
         0.1018328 +- 0.0000868 seconds time elapsed  ( +-  0.09% )
         0.1017469 +- 0.0000285 seconds time elapsed  ( +-  0.03% )
         0.1019589 +- 0.0000549 seconds time elapsed  ( +-  0.05% )
         0.1018465 +- 0.0000891 seconds time elapsed  ( +-  0.09% )
          0.101868 +- 0.000117 seconds time elapsed  ( +-  0.12% ) <====
         0.1017705 +- 0.0000590 seconds time elapsed  ( +-  0.06% )
         0.1017728 +- 0.0000718 seconds time elapsed  ( +-  0.07% )
         0.1017821 +- 0.0000419 seconds time elapsed  ( +-  0.04% )
         0.1018328 +- 0.0000581 seconds time elapsed  ( +-  0.06% )
         0.1017836 +- 0.0000853 seconds time elapsed  ( +-  0.08% )
         0.1018124 +- 0.0000765 seconds time elapsed  ( +-  0.08% )
         0.1018706 +- 0.0000639 seconds time elapsed  ( +-  0.06% )

Note the two outliers, which are caused by a misguided output
optimization in perf that needlessly shortens numbers ending
in zero and breaks the vertical alignment of the grepped
output.

Those two lines should be:

         0.1017844 +- 0.0000601 seconds time elapsed  ( +-  0.06% )
         0.1018830 +- 0.0001690 seconds time elapsed  ( +-  0.17% ) <====
         0.1017757 +- 0.0000532 seconds time elapsed  ( +-  0.05% )

         0.1018465 +- 0.0000891 seconds time elapsed  ( +-  0.09% )
         0.1018680 +- 0.0001170 seconds time elapsed  ( +-  0.12% ) <====
         0.1017705 +- 0.0000590 seconds time elapsed  ( +-  0.06% )

(The zeroes are printed fully, to full precision.)

Basically, a trailing zero that shows up by random chance is not a
lack of significant digits, so the tool shouldn't strip it from the
output.
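
To illustrate the expected behaviour, here is a minimal sketch of
fixed-precision formatting. It is not perf's actual code:
format_elapsed(), ELAPSED_DECIMALS and ELAPSED_WIDTH are hypothetical
names, and the precision of 7 decimals is simply assumed from the
output above. The point is only that the number of decimals stays
constant no matter what the last digit happens to be.

```c
#include <stdio.h>

/* Assumed values, chosen to match the column layout shown above. */
#define ELAPSED_DECIMALS 7
#define ELAPSED_WIDTH    18

/*
 * Hypothetical helper, not part of perf: format an average and its
 * deviation with a fixed number of decimals, so that a trailing zero
 * is printed like any other digit and repeated runs stay aligned.
 * Returns the number of characters snprintf() would have written.
 */
static int format_elapsed(char *buf, size_t len, double avg, double sd)
{
	/* Relative deviation in percent; guard against a zero average. */
	double pct = avg != 0.0 ? 100.0 * sd / avg : 0.0;

	return snprintf(buf, len,
			"%*.*f +- %.*f seconds time elapsed  ( +- %5.2f%% )",
			ELAPSED_WIDTH, ELAPSED_DECIMALS, avg,
			ELAPSED_DECIMALS, sd, pct);
}
```

With this sketch, format_elapsed(buf, sizeof(buf), 0.101883, 0.000169)
produces "0.1018830 +- 0.0001690" in the same columns as every other
run, so the grepped output keeps its vertical alignment.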

Thanks,

	Ingo