Message-ID: <26411.57288.238690.681680@gargle.gargle.HOWL>
Date: Wed, 6 Nov 2024 22:29:44 +0100
From: Anthony Mallet <anthony.mallet@...s.fr>
To: Anna-Maria Behnsen <anna-maria@...utronix.de>,
        Frederic Weisbecker <frederic@...nel.org>
Cc: linux-kernel@...r.kernel.org
Subject: posix timer freeze after some random time, under pthread create/destroy load

Hi,

I'm facing an issue with POSIX timers configured to deliver SIGALRM
upon expiry. The symptom is that the timer randomly freezes (the
signal handler is no longer invoked). After analysis, this happens
in combination with pthread creation and destruction.

I have attached a test case that can reliably reproduce my issue on
affected kernels. It involves creating a timer that increments a
global counter at each tick, while the main thread is spawning and
destroying other threads. At some point, the counter gets stalled. In
the context of this test case, I do heavy thread creation and
destruction, so that the issue triggers almost immediately. Regarding
the real-world issue, it happens in the context of aio(7) work, which
also involves thread creation and destruction but presumably at a much
lower rate, and the issue consequently triggers much less often.
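
For readers without the attachment, the scenario described above can
be sketched roughly as follows. This is not the attached timer.c: it
is a minimal illustration that assumes a 1 ms CLOCK_MONOTONIC timer,
and the names churn_with_timer, on_alarm and noop are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <string.h>
#include <time.h>

/* Tick counter, incremented from the SIGALRM handler. */
static volatile sig_atomic_t ticks;

static void on_alarm(int sig) { (void)sig; ticks++; }

/* Thread body that exits immediately: only the create/destroy matters. */
static void *noop(void *arg) { return arg; }

/*
 * Arm a periodic POSIX timer that delivers SIGALRM (1 ms period, an
 * assumed value), then create and join `nthreads` short-lived threads.
 * Returns the number of ticks observed, or -1 if setup failed.
 */
long churn_with_timer(int nthreads)
{
    struct sigaction sa;
    struct sigevent sev;
    struct itimerspec its;
    timer_t tid;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sa.sa_flags = SA_RESTART;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    memset(&sev, 0, sizeof sev);
    sev.sigev_notify = SIGEV_SIGNAL;
    sev.sigev_signo = SIGALRM;
    if (timer_create(CLOCK_MONOTONIC, &sev, &tid) != 0)
        return -1;

    its.it_value.tv_sec = 0;
    its.it_value.tv_nsec = 1000000;   /* first expiry after 1 ms */
    its.it_interval = its.it_value;   /* then every 1 ms */
    if (timer_settime(tid, 0, &its, NULL) != 0) {
        timer_delete(tid);
        return -1;
    }

    ticks = 0;
    for (int i = 0; i < nthreads; i++) {
        pthread_t t;
        if (pthread_create(&t, NULL, noop, NULL) == 0)
            pthread_join(t, NULL);
    }

    timer_delete(tid);
    return ticks;
}
```

Calling churn_with_timer() in a loop and watching whether the returned
count stops growing between calls approximates the stalled-counter
check described above.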

I could reproduce the issue reliably with mainline kernels from 6.4
to 6.11 (inclusive), on several distributions and with different
hardware and glibc versions. Kernels up to and including 6.3 do not
exhibit the problem at all.

Once the issue triggers, simply re-arming the timer (with
timer_settime(2)) makes it work again, until the next stall.
timer_gettime(2) does not return garbage and the values are still
as expected. Only the signal handler is not called. Manually sending
SIGALRM with raise(SIGALRM) also works and invokes the signal handler
as expected.
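
The re-arming observation could be turned into a stopgap along these
lines. This is only a sketch of a workaround, not a fix, and the
helper name rearm_if_stalled is hypothetical. It assumes the caller
samples its own tick counter periodically and passes it in.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <time.h>

/*
 * Compare the current tick count with the last one seen. If the count
 * has moved, remember it and return 0. If it has not moved, re-arm the
 * timer with the same settings and return 1 (or -1 if timer_settime
 * fails).
 */
int rearm_if_stalled(timer_t tid, const struct itimerspec *its,
                     long current, long *last_seen)
{
    if (current != *last_seen) {
        *last_seen = current;
        return 0;
    }
    return timer_settime(tid, 0, its, NULL) == 0 ? 1 : -1;
}
```

The sampling interval has to be comfortably longer than the timer
period, otherwise a healthy timer would be reported as stalled.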

Also note that using setitimer(2) instead of a posix timer does not
show any problem with the same test program.
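
For comparison, a setitimer(2) variant of the same periodic SIGALRM
could look like the sketch below. The helper names
install_alarm_handler and arm_itimer_real are hypothetical, and
ITIMER_REAL is an assumption, since the message does not say which
interval timer was used.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/time.h>

/* Tick counter, incremented from the SIGALRM handler. */
static volatile sig_atomic_t itimer_ticks;

static void on_itimer_alarm(int sig) { (void)sig; itimer_ticks++; }

/* Install the SIGALRM handler; returns 0 on success, -1 on error. */
int install_alarm_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_itimer_alarm;
    sa.sa_flags = SA_RESTART;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGALRM, &sa, NULL);
}

/*
 * Arm ITIMER_REAL with a period of `usec` microseconds.
 * Passing 0 disarms the timer. Returns 0 on success, -1 on error.
 */
int arm_itimer_real(long usec)
{
    struct itimerval itv;

    itv.it_value.tv_sec = usec / 1000000;
    itv.it_value.tv_usec = usec % 1000000;
    itv.it_interval = itv.it_value;
    return setitimer(ITIMER_REAL, &itv, NULL);
}
```

The handler must be installed before the timer is armed, because the
default action for SIGALRM terminates the process.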

Before filing a proper bug report, I wanted to have your opinion
about this. This e-mail is probably already too long for an
introduction, but I can of course provide any missing detail that
you deem necessary.

Thanks for your attention,
Anthony Mallet


View attachment "timer.c" of type "text/plain" (2907 bytes)
