linux-kernel - RE: [REGRESSION] Re: [PATCH 00/24] Complete EEVDF

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <00e301db6e19$52e3cc20$f8ab6460$@telus.net>
Date: Thu, 23 Jan 2025 20:34:57 -0800
From: "Doug Smythies" <dsmythies@...us.net>
To: "'Peter Zijlstra'" <peterz@...radead.org>
Cc: <linux-kernel@...r.kernel.org>,
	<vincent.guittot@...aro.org>,
	"'Ingo Molnar'" <mingo@...nel.org>,
	<wuyun.abel@...edance.com>,
	"Doug Smythies" <dsmythies@...us.net>
Subject: RE: [REGRESSION] Re: [PATCH 00/24] Complete EEVDF

On 2025.01.21 Doug Smythies wrote:
> On 2025.01.21 03:22 Peter Zijlstra wrote:
>> On Tue, Jan 21, 2025 at 09:49:08AM +0100, Peter Zijlstra wrote:
>>
>>>> I modified your tracing trigger thing in turbostat to this:
>>> 
>>> Shiny!
>>> 
>>> What turbostat invocation do I use? I think the last I had was:
>>> 
>>>   tools/power/x86/turbostat/turbostat --quiet --show Busy%,IRQ,Time_Of_Day_Seconds,CPU,usec --interval 1
>>> 
>>> I've started a new run of yes-vs-turbostate with the modified trigger
>>> condition. Lets see what pops out.
>>
>> Ok, I have a trace.o
>>
>> So I see turbostat wake up on CPU 15, do its migration round 0-15 and
>> when its back at 15 it prints the WHOOPSIE.
>>
>> (trimmed trace):
>>
>>             yes-1169    [015] dNh4.  4238.261759: sched_wakeup: comm=turbostat pid=1185 prio=100 target_cpu=015
>>             yes-1169    [015] d..2.  4238.261761: sched_switch: prev_comm=yes prev_pid=1169 prev_prio=120 prev_state=R ==>
> next_comm=turbostat next_pid=1185 next_prio=100
>>    migration/15-158     [015] d..3.  4238.261977: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=15 dest_cpu=0
>>    migration/0-20      [000] d..3.  4238.261991: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=0 dest_cpu=1
>>     migration/1-116     [001] d..3.  4238.262003: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=1 dest_cpu=2
>>     migration/2-25      [002] d..3.  4238.262018: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=2 dest_cpu=3
>>     migration/3-122     [003] d..3.  4238.262031: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=3 dest_cpu=4
>>     migration/4-31      [004] d..3.  4238.262044: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=4 dest_cpu=5
>>     migration/5-128     [005] d..3.  4238.262057: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=5 dest_cpu=6
>>     migration/6-37      [006] d..3.  4238.262071: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=6 dest_cpu=7
>>     migration/7-134     [007] d..3.  4238.262084: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=7 dest_cpu=8
>>     migration/8-43      [008] d..3.  4238.262097: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=8 dest_cpu=9
>>     migration/9-140     [009] d..3.  4238.262109: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=9 dest_cpu=10
>>    migration/10-49      [010] d..3.  4238.262123: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=10 dest_cpu=11
>>    migration/11-146     [011] d..3.  4238.262136: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=11 dest_cpu=12
>>    migration/12-55      [012] d..3.  4238.262150: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=12 dest_cpu=13
>>    migration/13-152     [013] d..3.  4238.262164: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=13 dest_cpu=14
>>    migration/14-62      [014] d..3.  4238.262177: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=14 dest_cpu=15
>>             yes-1169    [015] d..2.  4238.262182: sched_switch: prev_comm=yes prev_pid=1169 prev_prio=120 prev_state=R+ ==>
> next_comm=turbostat next_pid=1185 next_prio=100
>>       turbostat-1185    [015] .....  4238.262189: __x64_sys_gettimeofday: WHOOPSIE!
>>
>> The time between wakeup and whoopsie 4238.262189-4238.261759 = .000430
>> or 430us, which doesn't seem excessive to me.
>>
>> Let me go read this turbostat code to figure out what exactly the
>> trigger condition signifies. Because I'm not seeing nothing weird here.
>
> I think the anomaly would have been about 1 second ago, on CPU 15,
> and before entering sleep.
> But after the previous call to the time of day stuff.
>
> Somewhere in this code:
>
>  delta_platform(&platform_counters_even, &platform_counters_odd);
>  compute_average(ODD_COUNTERS);
>  format_all_counters(ODD_COUNTERS);
>  flush_output_stdout();
>
> Please know that I ran a couple of tests yesterday for a total of about 8 hours
> and never got a measured interval time >= 10 mSec.
> I was using kernel 6.13, which includes your 2 patches, and I tried a slight
> modification to the turbostat command:
>
> sudo ./turbostat --quiet --Summary --show Busy%,Bzy_MHz,IRQ,PkgWatt,PkgTmp,TSC_MHz,Time_Of_Day_Seconds,usec --interval 1 --out
/dev/shm/turbo.log
>
> That allowed me to acquire more than my ssh session history limit of about 9000 lines (seconds) and also eliminated ssh
> communications.
> It was on purpose that I used RAM to write the log file to.

I have run more tests over the last couple of days, totalling over 30 hours.
I simply do not get a measured interval time >= 10mSec using kernel 6.13.
The previous work was kernel 6.13-rc6 + the 2 patches + the tracing stuff.
I never tried kernel 6.13-rc7.