Message-ID: <aDCsmjb7Fex1ccOW@x1>
Date: Fri, 23 May 2025 14:12:58 -0300
From: Arnaldo Carvalho de Melo <acme@...nel.org>
To: Leo Yan <leo.yan@....com>
Cc: Namhyung Kim <namhyung@...nel.org>, Ian Rogers <irogers@...gle.com>,
	Adrian Hunter <adrian.hunter@...el.com>,
	"Liang, Kan" <kan.liang@...ux.intel.com>,
	James Clark <james.clark@...aro.org>,
	linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] perf tests switch-tracking: Fix timestamp comparison

On Fri, May 23, 2025 at 09:10:36AM +0100, Leo Yan wrote:
> On Thu, May 22, 2025 at 10:57:41PM -0300, Arnaldo Carvalho de Melo wrote:
> > On Thu, May 22, 2025 at 10:55:46PM -0300, Arnaldo Carvalho de Melo wrote:
> > > On Mon, Mar 31, 2025 at 06:27:59PM +0100, Leo Yan wrote:
> > > > The test might fail on the Arm64 platform with the error:

> > > >   perf test -vvv "Track with sched_switch"
> > > >   Missing sched_switch events

> > > > The issue is caused by incorrect handling of timestamp comparisons. The
> > > > comparison result, a signed 64-bit value, was being directly cast to an
> > > > int, leading to incorrect sorting for sched events.

> > > > Fix this by explicitly returning 0, 1, or -1 based on whether the result
> > > > is zero, positive, or negative.

> > > > Fixes: d44bc5582972 ("perf tests: Add a test for tracking with sched_switch")
> > > > Signed-off-by: Leo Yan <leo.yan@....com>

> > > How can I reproduce this?

> > > Testing on a rpi5, 64-bit debian, this test passes:
> 
> Sorry that I did not give precise info for reproducing the failure.
> The test does not fail every time; usually I can trigger the failure
> after running it 20 to 30 times:
> 
> # while true; do perf test "Track with sched_switch"; done
> 106: Track with sched_switch                                         : Ok
> 106: Track with sched_switch                                         : Ok
> 106: Track with sched_switch                                         : Ok
> 106: Track with sched_switch                                         : Ok
> 106: Track with sched_switch                                         : Ok
> 106: Track with sched_switch                                         : Ok
> 106: Track with sched_switch                                         : Ok
> 106: Track with sched_switch                                         : Ok
> 106: Track with sched_switch                                         : Ok
> 106: Track with sched_switch                                         : Ok
> 106: Track with sched_switch                                         : Ok
> 106: Track with sched_switch                                         : Ok
> 106: Track with sched_switch                                         : Ok
> 106: Track with sched_switch                                         : Ok
> 106: Track with sched_switch                                         : FAILED!
> 106: Track with sched_switch                                         : Ok
> 106: Track with sched_switch                                         : Ok
> 106: Track with sched_switch                                         : Ok
> 106: Track with sched_switch                                         : Ok
> 106: Track with sched_switch                                         : Ok
> 106: Track with sched_switch                                         : Ok
> 106: Track with sched_switch                                         : Ok
> 106: Track with sched_switch                                         : Ok
> 106: Track with sched_switch                                         : FAILED!
> 106: Track with sched_switch                                         : Ok
> 106: Track with sched_switch                                         : Ok
 
> I used a cross compiler to build the perf tool on my host machine and
> tested on a Debian / Juno board.  Generally, I think this issue is not
> very specific to GCC versions, as both our internal CI and my local env
> can reproduce the issue.
>
> Please let me know if you need any more info.  Thanks!
 
> ---8<---
 
> My Host Build compiler:
 
> # aarch64-linux-gnu-gcc --version
> aarch64-linux-gnu-gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0
 
> Juno Board:
 
> # lsb_release -a
> No LSB modules are available.
> Distributor ID: Debian
> Description:    Debian GNU/Linux 12 (bookworm)
> Release:        12
> Codename:       bookworm

Thanks for the extra info, I'll add it to the commit log message, and
perhaps we could make this test exclusive and use stress-ng to generate
some background noise in the form of a good number of processes, see:

root@x1:~# stress-ng --switch $(($(nproc) * 2)) --timeout 30s & for a in $(seq 50) ; do perf test switch ; done
[1] 1773322
stress-ng: info:  [1773322] setting to a 30 secs run per stressor
 77: Track with sched_switch                          : Running (1 active)
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : Running (1 active)
stress-ng: info:  [1773322] skipped: 0
stress-ng: info:  [1773322] passed: 24: switch (24)
stress-ng: info:  [1773322] failed: 0
stress-ng: info:  [1773322] metrics untrustworthy: 0
 77: Track with sched_switch                          : FAILED!
[1]+  Done                    stress-ng --switch $(($(nproc) * 2)) --timeout 30s
 77: Track with sched_switch                          : Ok
 77: Track with sched_switch                          : Ok
 77: Track with sched_switch                          : Ok
 77: Track with sched_switch                          : Ok
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : Ok
 77: Track with sched_switch                          : Ok
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : FAILED!
 77: Track with sched_switch                          : Ok
root@x1:~#

Now with your patch it also fails, so it's for another reason:

--- start ---
test child forked, pid 1777071
Using CPUID GenuineIntel-6-BA-3
mmap size 528384B
45221 events recorded
Missing comm events
---- end(-1) ----
113: Track with sched_switch                                         : FAILED!

Lots of short-lived processes make it fail as well :-\

Oh well...

I was just trying to improve this test case so that we would show it
failing before your patch and passing after it, but I ran out of time
:-\

Your patch is correct, so I'll probably just add your comments and go
with it.

- Arnaldo
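
For readers following the thread, the class of bug described in the quoted patch description (a signed 64-bit timestamp difference truncated to int in a sort comparator) can be sketched as below. This is only an illustration under stated assumptions, not the code from the perf test: `struct sample`, `cmp_truncating`, `cmp_three_way` and `sort_samples` are hypothetical names made up for this example. Note also that converting an out-of-range value to int is implementation-defined in C; GCC and Clang reduce it modulo 2^32, which is what the comments assume.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record type for illustration; not the struct used by perf. */
struct sample {
	uint64_t time;	/* timestamp, e.g. in nanoseconds */
};

/*
 * Problematic pattern: the signed 64-bit difference is narrowed to int.
 * Only the low 32 bits survive (on GCC/Clang), so the sign of the result
 * can be lost or flipped once two timestamps are far enough apart.
 */
static int cmp_truncating(const void *a, const void *b)
{
	const struct sample *sa = a, *sb = b;
	int64_t diff = (int64_t)(sa->time - sb->time);

	return (int)diff;
}

/* Safer pattern: compare directly and return only -1, 0 or 1. */
static int cmp_three_way(const void *a, const void *b)
{
	const struct sample *sa = a, *sb = b;

	if (sa->time < sb->time)
		return -1;
	return sa->time > sb->time;
}

/* Sort an array of samples by ascending timestamp. */
static void sort_samples(struct sample *s, size_t n)
{
	assert(s != NULL || n == 0);
	qsort(s, n, sizeof(*s), cmp_three_way);
}
```

With timestamps 0 and 3221225472 (3 * 2^30), the 64-bit difference is negative, but its low 32 bits form a positive int, so `cmp_truncating` would tell `qsort()` that the earlier sample sorts after the later one. `cmp_three_way` does not depend on the magnitude of the difference and so avoids that problem.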
