linux-kernel - Re: Posix process cpu timer inaccuracies

Message-ID: <4547873.LvFx2qVVIh@discovery>
Date: Mon, 26 Feb 2024 16:29:38 -0800
From: Delyan Kratunov <delyan@...yan.me>
To: linux-kernel@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>
Subject: Re: Posix process cpu timer inaccuracies

Thanks for your detailed response, Thomas. I appreciate you taking the time 
with my random side quest!

> [...]
>
> That's wishful thinking and there is no way to ensure that.
> Just for the record: setitimer() has been marked obsolescent in the
> POSIX standard issue 7 in 2018. The replacement is timer_settime() which
> has a few interesting properties vs. the overrun handling.

This is a great point and I think it overrides anything I have to say about 
setitimer. Overall, I have nothing to rehash on the process signal delivery 
point. I understand the situation now, thanks to your thorough explanation!

> [...]
> I don't know and those assumptions have been clearly wrong at the point
> where the tool was written.

That was my impression as well, thanks for confirming. (I've found at least 
three tools that share this incorrect belief.)

> [...]
> > they still have the same distribution issues.
> 
> CLOCK_THREAD_CPUTIME_ID exists for a reason and user space can correlate
> the thread data nicely.
> 
> Aside of that there are PMUs and perf which solve all the problems you
> are trying to solve in one go.

Absolutely, the ability to write a profiler with perf_event_open is not in 
question at all. However, not every situation allows for PMU or 
perf_event_open access. Timers could form a nice middle ground, in exactly the 
way people have tried to use them.

I'd like to push back a little on the "CLOCK_THREAD_CPUTIME_ID fixes things" 
point, though. From an application and library point of view, the per-thread 
clocks are harder to use - you need to either orchestrate every thread to 
participate voluntarily or poll the thread ids and create timers from another 
thread. In perf_event_open, this is solved via the .inherit/.inherit_thread 
bits.
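
To make the "poll the thread ids from another thread" option concrete, here is 
a minimal Python sketch. It only reads each worker's CPU clock from a 
monitoring thread; it does not arm timers. The helper names (burn_cpu, 
sample_worker_cpu) are hypothetical, and it assumes a Linux system where 
time.pthread_getcpuclockid() is available.

```python
import threading
import time


def burn_cpu(seconds):
    """Spin until the calling thread has consumed about `seconds` of CPU time."""
    start = time.clock_gettime(time.CLOCK_THREAD_CPUTIME_ID)
    while time.clock_gettime(time.CLOCK_THREAD_CPUTIME_ID) - start < seconds:
        pass


def sample_worker_cpu(n_threads=3, burn_seconds=0.01):
    """Start workers, then read each worker's CPU clock from the calling thread."""
    burned = threading.Barrier(n_threads + 1)  # workers + the sampling thread
    release = threading.Event()

    def worker():
        burn_cpu(burn_seconds)
        burned.wait()    # tell the sampler that the burn is finished
        release.wait()   # stay alive so the thread id remains valid

    workers = [threading.Thread(target=worker) for _ in range(n_threads)]
    for w in workers:
        w.start()
    burned.wait()

    samples = []
    for w in workers:
        # The sampler has to know every thread id and look up its clock.
        clock_id = time.pthread_getcpuclockid(w.ident)
        samples.append(time.clock_gettime(clock_id))

    release.set()
    for w in workers:
        w.join()
    return samples


if __name__ == "__main__":
    print(sample_worker_cpu())
```

Note that the workers have to be kept alive until they are sampled. A 
short-lived thread can exit before the monitor ever learns its id, which is 
part of why this approach is awkward for a library.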

More importantly, they don't work for all workloads. If I have 10 threads that 
each run for 5ms, a 10ms process timer would fire 5 times, while per-thread 
10ms timers would never fire. You can easily imagine an application that 
accrues all its cpu time in a way that doesn't generate a single signal (in 
the extreme, threads only living a single tick).
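
The arithmetic behind that example can be sketched as a small Python model. 
This is an idealized illustration only: it assumes periodic timers that fire 
exactly once per full interval of accrued CPU time, and it ignores tick 
granularity and signal delivery. The function name expected_signals is 
hypothetical.

```python
def expected_signals(thread_runtimes_ms, interval_ms):
    """Return (process_timer_fires, per_thread_timer_fires) in the ideal case."""
    # A process-wide CPU timer counts the CPU time of all threads combined.
    process_fires = sum(thread_runtimes_ms) // interval_ms
    # A per-thread CPU timer only counts the CPU time of its own thread.
    per_thread_fires = sum(r // interval_ms for r in thread_runtimes_ms)
    return process_fires, per_thread_fires


# 10 threads running 5ms each, with 10ms timers
print(expected_signals([5] * 10, 10))  # prints (5, 0)
```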

Overall, what I want to establish is whether there's a path to achieve the 
_assumed_ interface that these tools expect - process-wide cpu signals that 
correlate with where cpu time is spent - through any existing or extended 
timer API. This interface would be eminently useful, as people have clearly,
albeit misguidedly, demonstrated.

If the answer is definitely "no," I'd like to at least add some notes to the 
man pages.

-- Delyan