Message-ID: <62eb142e-329b-4faa-8750-2d92d4a37d3c@oss.qualcomm.com>
Date: Tue, 2 Dec 2025 15:13:12 +0800
From: Yin Li <yin.li@....qualcomm.com>
To: yin.li@....qualcomm.com
Cc: linux-pm@...r.kernel.org, linux-kernel@...r.kernel.org,
        quic_okukatla@...cinc.com
Subject: test-by-test

Hi Georgi,

On 16 May 2025 18:50:15 +0300, Georgi Djakov wrote:
 >Hi Mike,
 >...
 >
 >> To prevent this priority inversion, switch to using rt_mutex for
 >> icc_bw_lock. This isn't needed for icc_lock since that's not used in the
 >> critical, latency-sensitive voting paths.
 >
 >If the issue does not occur anymore with this patch, then this is a good
 >sign, but we still need to get some numbers and put them in the commit
 >message.
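
For context, the change under discussion is roughly of this shape (an
illustrative sketch with a hypothetical caller, not the actual diff):

#include <linux/rtmutex.h>

/* Sketch: icc_bw_lock declared as an rt_mutex instead of a plain mutex,
 * so a low-priority holder gets priority-boosted while an RT thread
 * waits on it.
 */
static DEFINE_RT_MUTEX(icc_bw_lock);    /* was: static DEFINE_MUTEX(icc_bw_lock); */

static int hypothetical_icc_set_bw(void)
{
        rt_mutex_lock(&icc_bw_lock);    /* was: mutex_lock(&icc_bw_lock); */
        /* ... aggregate requests and apply the bandwidth votes ... */
        rt_mutex_unlock(&icc_bw_lock);  /* was: mutex_unlock(&icc_bw_lock); */
        return 0;
}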


We constructed a priority-inversion test scenario, with multiple
real-time threads of different priorities and CFS threads with different
nice values competing for a mutex, to measure the overhead an RT thread
incurs when acquiring the lock.

The maximum, minimum, and average overhead were determined over 100
iterations of the test.

We then replaced the mutex with an rt_mutex, ran the same 100-iteration
test to obtain the maximum, minimum, and average overhead, and calculated
the change in the average.

From this test we can draw the following conclusions:
1) In a scenario where the competing threads each hold the mutex for
5ms, using a plain mutex results in an average overhead of 4127687ns for
the tested RT threads to acquire the mutex.

2) After replacing the mutex with an rt_mutex, the average drops to
2010555ns, which greatly mitigates the overhead caused by priority
inversion and reduces latency by about 50%.

3) Furthermore, to align with the 40ms overhead reported by the user,
the test case was modified so that the competing threads hold the mutex
for 40ms; repeating the experiment yielded similar results.
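
For reference, the measurement was done with a small kernel module along
these lines (a simplified sketch with hypothetical names and no error
handling; the real test also adds several medium-priority RT threads and
extra CFS threads to provoke the inversion, omitted here for brevity):

#include <linux/module.h>
#include <linux/limits.h>
#include <linux/kthread.h>
#include <linux/mutex.h>
#include <linux/rtmutex.h>
#include <linux/ktime.h>
#include <linux/delay.h>
#include <linux/sched.h>

static bool use_rt;                     /* false: plain mutex, true: rt_mutex */
module_param(use_rt, bool, 0444);

static DEFINE_MUTEX(test_mutex);
static DEFINE_RT_MUTEX(test_rt_mutex);
static struct task_struct *holder, *waiter;

static void test_lock(void)
{
        if (use_rt)
                rt_mutex_lock(&test_rt_mutex);
        else
                mutex_lock(&test_mutex);
}

static void test_unlock(void)
{
        if (use_rt)
                rt_mutex_unlock(&test_rt_mutex);
        else
                mutex_unlock(&test_mutex);
}

/* CFS thread: repeatedly holds the lock busy for ~5ms. */
static int holder_fn(void *data)
{
        set_user_nice(current, 10);
        while (!kthread_should_stop()) {
                test_lock();
                mdelay(5);              /* simulated 5ms critical section */
                test_unlock();
                msleep(1);
        }
        return 0;
}

/* RT thread under test: times 100 lock acquisitions. */
static int waiter_fn(void *data)
{
        s64 t_min = S64_MAX, t_max = 0, t_sum = 0, ns;
        int i;

        sched_set_fifo(current);        /* SCHED_FIFO */
        for (i = 0; i < 100; i++) {
                ktime_t t0 = ktime_get();

                test_lock();
                ns = ktime_to_ns(ktime_sub(ktime_get(), t0));
                test_unlock();

                t_sum += ns;
                if (ns < t_min)
                        t_min = ns;
                if (ns > t_max)
                        t_max = ns;
                msleep(10);             /* let the holder re-acquire */
        }
        pr_info("lock overhead: min %lld max %lld avg %lld ns\n",
                t_min, t_max, t_sum / 100);

        while (!kthread_should_stop())  /* park until module unload */
                msleep(100);
        return 0;
}

static int __init lock_test_init(void)
{
        holder = kthread_run(holder_fn, NULL, "lock_test_holder");
        waiter = kthread_run(waiter_fn, NULL, "lock_test_waiter");
        return 0;
}

static void __exit lock_test_exit(void)
{
        kthread_stop(waiter);
        kthread_stop(holder);
}

module_init(lock_test_init);
module_exit(lock_test_exit);
MODULE_LICENSE("GPL");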



 >The RT mutexes add some overhead and complexity that could
 >increase latency for both uncontended and contended paths. I am curious
 >if there is any regression for the non-priority scenarios. Also if there
 >are many threads, the mutex cost itself could become a bottleneck.

In our tests, a single rt_mutex acquisition costs approximately 937ns
and a single mutex acquisition approximately 520ns, so the rt_mutex
itself does indeed add some latency.
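
(Roughly estimated by timing uncontended lock/unlock pairs in a loop,
e.g. reusing the locks from the sketch above; hypothetical helper:)

static s64 avg_lock_cost_ns(void)
{
        ktime_t t0 = ktime_get();
        int i;

        for (i = 0; i < 1000; i++) {
                test_lock();            /* mutex_lock() or rt_mutex_lock() */
                test_unlock();
        }
        return ktime_to_ns(ktime_sub(ktime_get(), t0)) / 1000;
}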

However, in scenarios where multiple clients frequently call the
interconnect API, the latency a plain mutex incurs under priority
inversion far outweighs the extra overhead added by the rt_mutex itself.

Compared to the improvement the rt_mutex brings in a thread-contention
environment, this added per-acquisition latency is perfectly acceptable.


 >This pulls in unconditionally all the RT-mutex stuff, which some people
 >might not want (although today it's also selected by the I2C subsystem
 >for example). I am wondering if we should make it configurable with the
 >normal mutex being the default or just follow the i2c example... but
 >maybe we can decide this when we have some numbers.

Making the lock type configurable is not common practice, so we do not
intend to add such an option in this patch.



On 7 Sep 2022 08:15:21 +0000, David Laight wrote:
 >From: Georgi Djakov
 > ...
 >I can't see why the RT kernel doesn't have exactly the same issues.
 >Given how long a process switch takes I really suspect that most
 >spinlocks should remain spinlocks.

It was proposed that serializing with a spinlock might be a simpler
solution, but we cannot do that: while holding the lock, we call into
the RPM/RPMh path, which calls wait_for_completion_timeout() and takes
a mutex, and may therefore sleep. Sleeping in the atomic context created
by a spinlock is not allowed.
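
Schematically, the problematic pattern a spinlock would create looks
like this (hypothetical function name; the constraint itself is real):

#include <linux/spinlock.h>

static DEFINE_SPINLOCK(bw_lock);

static void hypothetical_apply_vote(void)
{
        spin_lock(&bw_lock);            /* enters atomic context */
        /*
         * Deeper in the RPM/RPMh path the vote waits for the firmware
         * to ack, roughly:
         *
         *      wait_for_completion_timeout(&done, timeout);
         *
         * and it also takes a mutex. Both may sleep, which is not
         * allowed while holding a spinlock ("scheduling while atomic"),
         * so a spinlock cannot serialize this path.
         */
        spin_unlock(&bw_lock);
}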



-- 
Thx and BRs,
Yin

