Message-ID: <CAP4=nvQpS20ubNspE0PPhiyWb3-ARV=gmQzFCA7WwAT8+rxMjg@mail.gmail.com>
Date: Fri, 17 Jan 2025 13:04:07 +0100
From: Tomas Glozar <tglozar@...hat.com>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: linux-trace-kernel@...r.kernel.org, linux-kernel@...r.kernel.org, 
	John Kacur <jkacur@...hat.com>, Luis Goncalves <lgoncalv@...hat.com>, 
	Gabriele Monaco <gmonaco@...hat.com>
Subject: Re: [PATCH 0/5] rtla/timerlat: Stop on signal properly when overloaded

On Fri, Jan 17, 2025 at 1:46 AM Steven Rostedt <rostedt@...dmis.org> wrote:
> Hmm, I wonder if timerlat can handle per cpu data, then you could kick off
> a thread per CPU (or a set of CPUs) where the thread is responsible for
> handling the data.
>
>
>                 CPU_ZERO_S(cpu_size, cpusetp);
>                 CPU_SET_S(cpu, cpu_size, cpusetp);
>                 retval = tracefs_iterate_raw_events(trace->tep,
>                                 trace->inst,
>                                 cpusetp,
>                                 cpu_size,
>                                 collect_registered_events,
>                                 trace);
>
> And then that iteration will only read over a subset of CPUs. Each thread
> can do a different subset and then it should be able to keep up.
>

That's a good idea, I didn't think of that. But it doesn't help much
in a scenario where rtla is pinned to a few housekeeping CPUs with -H,
which is used for testing isolated-CPU-based setups.

I was thinking of turning timerlat_hist_handler/timerlat_top_handler
into a BPF program and having it executed right after the sample is
created, e.g. by using the BPF perf interface to hook it to a
tracepoint event. The histogram/counter would be stored in BPF maps,
which would be merely copied over in the main loop. This is
essentially how cyclictest does it, except in userspace. I expect this
solution to have good performance, but the obvious downside is that it
requires BPF. This is not a problem for us, but might be for other
rtla users and we'd likely have to keep both implementations of sample
processing in the code.
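
To make the data flow a bit more concrete, here is a minimal sketch of
the layout I have in mind. It is plain userspace C, not actual BPF
code: the struct only stands in for a per-CPU BPF map, and all names
and sizes (sketch_hist, sketch_record, sketch_collect, SKETCH_*) are
hypothetical and not taken from rtla. The point is only that the
per-sample work shrinks to one counter increment, and the main loop
just sums the per-CPU rows.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* All constants below are assumptions chosen for illustration only. */
#define SKETCH_NR_CPUS   4   /* number of monitored CPUs */
#define SKETCH_ENTRIES   8   /* histogram buckets per CPU */
#define SKETCH_BUCKET_US 10  /* bucket width in microseconds */

/* Stand-in for a per-CPU map: one row of counters per CPU. */
struct sketch_hist {
	uint64_t bucket[SKETCH_NR_CPUS][SKETCH_ENTRIES];
	uint64_t overflow[SKETCH_NR_CPUS];
};

/*
 * What the per-sample handler would do when a sample is created:
 * pick the bucket for the latency and increment a single counter.
 * Samples from CPUs outside the monitored range are ignored.
 */
void sketch_record(struct sketch_hist *h, int cpu, uint64_t latency_us)
{
	uint64_t idx = latency_us / SKETCH_BUCKET_US;

	if (cpu < 0 || cpu >= SKETCH_NR_CPUS)
		return;

	if (idx < SKETCH_ENTRIES)
		h->bucket[cpu][idx]++;
	else
		h->overflow[cpu]++;
}

/*
 * What the main loop would do: read the counters and sum them over
 * all CPUs. No per-sample work happens here, so the cost does not
 * depend on how many samples arrived since the last call.
 */
void sketch_collect(const struct sketch_hist *h,
		    uint64_t total[SKETCH_ENTRIES], uint64_t *overflow)
{
	memset(total, 0, sizeof(uint64_t) * SKETCH_ENTRIES);
	*overflow = 0;

	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
		for (size_t i = 0; i < SKETCH_ENTRIES; i++)
			total[i] += h->bucket[cpu][i];
		*overflow += h->overflow[cpu];
	}
}
```

In a real implementation, sketch_record() would be the part running in
BPF at the tracepoint, and sketch_collect() would read the map from
the rtla main loop; how exactly that would be wired up is still open.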

Also, before even starting with that, it would likely be necessary to
remove the duplicated code throughout timerlat/osnoise and test it
properly, so that we don't have to make the same code changes two or
four times.

Tomas

