linux-kernel - [RT] should pm_qos_resume_latency_us on one CPU affect latency on another?

Message-ID: <4b3bf6d8-7e1a-138b-048d-b3c1f5f65297@windriver.com>
Date:   Tue, 13 Aug 2019 15:04:39 -0600
From:   Chris Friesen <chris.friesen@...driver.com>
To:     LKML <linux-kernel@...r.kernel.org>,
        linux-rt-users <linux-rt-users@...r.kernel.org>
Subject: [RT] should pm_qos_resume_latency_us on one CPU affect latency on
 another?

Hi all,

Just wondering if what I'm seeing is expected.  I'm using the CentOS 7 
RT kernel with boot args of "skew_tick=1 irqaffinity=0 rcu_nocbs=1-27 
nohz_full=1-27" among others.

Normally, when I run cyclictest, it sets /dev/cpu_dma_latency to zero,
which gives a worst-case latency of around 6 usec.
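
As a rough illustration of that request outside cyclictest, here is a
minimal Python sketch. It assumes the usual interface, where a 32-bit
value in microseconds is written to the device and the request only
lasts while the file descriptor stays open. The helper name
hold_cpu_dma_latency is made up for this example.

```python
import os
import struct
from contextlib import contextmanager


@contextmanager
def hold_cpu_dma_latency(target_us=0, path="/dev/cpu_dma_latency"):
    """Hold a system-wide latency request for the duration of the block.

    Assumption: the kernel accepts a native-endian signed 32-bit value
    and drops the request when the descriptor is closed.
    """
    fd = os.open(path, os.O_WRONLY)
    try:
        os.write(fd, struct.pack("=i", target_us))
        yield fd
    finally:
        # Closing the descriptor releases the request.
        os.close(fd)


# Typical use (needs root); run_workload is a hypothetical placeholder:
#     with hold_cpu_dma_latency(0):
#         run_workload()
```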

If I instead set /dev/cpu_dma_latency to something large and then set
/sys/devices/system/cpu/cpu${num}/power/pm_qos_resume_latency_us to "2"
for the CPUs that cyclictest is running on, the worst-case latency
rises to roughly 16 usec.
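
The per-CPU setting described above could be scripted roughly as
follows. This is only a sketch: the function names, the "1-3,8" list
syntax handled by parse_cpu_list, and the root parameter (there so the
code can be tried against a scratch directory) are assumptions for
illustration, not part of the setup described here.

```python
from pathlib import Path

SYSFS_CPU_ROOT = "/sys/devices/system/cpu"


def parse_cpu_list(spec):
    """Expand a kernel-style CPU list such as "1-3,8" into [1, 2, 3, 8]."""
    cpus = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def set_resume_latency(cpus, latency_us, root=SYSFS_CPU_ROOT):
    """Write latency_us to pm_qos_resume_latency_us for each listed CPU."""
    for cpu in cpus:
        knob = Path(root) / f"cpu{cpu}" / "power" / "pm_qos_resume_latency_us"
        knob.write_text(f"{latency_us}\n")


# Example (needs root): limit the CPUs running cyclictest to 2 usec.
#     set_resume_latency(parse_cpu_list("1-27"), 2)
```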

If I set pm_qos_resume_latency_us to "2" for all CPUs on the system,
then the worst-case latency comes back down.  It's not sufficient to set
it only for the CPUs on the same socket as cyclictest.
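
For the all-CPUs case, a sketch along these lines would apply the same
value to every cpuN directory that exposes the file. The function names
and the root parameter are hypothetical, chosen only for this example.

```python
import re
from pathlib import Path


def all_cpu_knobs(root="/sys/devices/system/cpu"):
    """Return the pm_qos_resume_latency_us path of every cpuN directory."""
    knobs = []
    for entry in sorted(Path(root).iterdir()):
        # Skip siblings such as cpufreq/ and cpuidle/; keep only cpu<N>.
        if re.fullmatch(r"cpu\d+", entry.name):
            knob = entry / "power" / "pm_qos_resume_latency_us"
            if knob.exists():
                knobs.append(knob)
    return knobs


def set_resume_latency_everywhere(latency_us, root="/sys/devices/system/cpu"):
    """Write latency_us to every CPU's file; return how many were written."""
    knobs = all_cpu_knobs(root)
    for knob in knobs:
        knob.write_text(f"{latency_us}\n")
    return len(knobs)


# Example (needs root):
#     set_resume_latency_everywhere(2)
```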

It does not seem to make any difference in the worst-case latency to set 
cpuset.sched_load_balance to zero for the cpuset containing cyclictest. 
(All cpusets but one have cpuset.sched_load_balance set to zero, and 
that one doesn't include the CPUs that cyclictest runs on.)
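
A cpuset layout like this can be checked with a short script. The
sketch below assumes a cgroup v1 cpuset hierarchy mounted at
/sys/fs/cgroup/cpuset (an assumption about the mount point, not
something stated above), and load_balance_report is a made-up helper
name.

```python
import os


def load_balance_report(root="/sys/fs/cgroup/cpuset"):
    """Map each cpuset (path relative to root) to (balancing_on, cpus)."""
    report = {}
    for dirpath, _dirs, files in os.walk(root):
        if "cpuset.sched_load_balance" not in files:
            continue
        with open(os.path.join(dirpath, "cpuset.sched_load_balance")) as f:
            balancing_on = f.read().strip() == "1"
        cpus = ""
        if "cpuset.cpus" in files:
            with open(os.path.join(dirpath, "cpuset.cpus")) as f:
                cpus = f.read().strip()
        report[os.path.relpath(dirpath, root)] = (balancing_on, cpus)
    return report


# Example: print the cpusets that still have load balancing enabled.
#     for name, (on, cpus) in sorted(load_balance_report().items()):
#         if on:
#             print(name, cpus)
```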

Looking at the latency traces, there does not appear to be any single 
culprit.  I've seen cases where it appears to take extra time in 
migrate_task_rq_fair(), tick_do_update_jiffies64(), rcu_irq_enter(), and 
enqueue_entity().

I'm trying to dynamically isolate CPUs from the system for running RT 
tasks, but it seems like the rest of the system still affects the 
isolated CPUs.

Any comments/suggestions would be appreciated.

Thanks,
Chris