[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <001201db6c1d$4a0c19c0$de244d40$@telus.net>
Date: Tue, 21 Jan 2025 07:58:18 -0800
From: "Doug Smythies" <dsmythies@...us.net>
To: "'Peter Zijlstra'" <peterz@...radead.org>
Cc: <linux-kernel@...r.kernel.org>,
<vincent.guittot@...aro.org>,
"'Ingo Molnar'" <mingo@...nel.org>,
<wuyun.abel@...edance.com>,
"Doug Smythies" <dsmythies@...us.net>
Subject: RE: [REGRESSION] Re: [PATCH 00/24] Complete EEVDF
On 2025.01.21 03:22 Peter Zijlstra wrote:
>On Tue, Jan 21, 2025 at 09:49:08AM +0100, Peter Zijlstra wrote:
>
>>> I modified your tracing trigger thing in turbostat to this:
>>
>> Shiny!
>>
>> What turbostat invocation do I use? I think the last I had was:
>>
>> tools/power/x86/turbostat/turbostat --quiet --show Busy%,IRQ,Time_Of_Day_Seconds,CPU,usec --interval 1
>>
>> I've started a new run of yes-vs-turbostate with the modified trigger
>> condition. Lets see what pops out.
>
> Ok, I have a trace.o
>
> So I see turbostat wake up on CPU 15, do its migration round 0-15 and
> when its back at 15 it prints the WHOOPSIE.
>
> (trimmed trace):
>
> yes-1169 [015] dNh4. 4238.261759: sched_wakeup: comm=turbostat pid=1185 prio=100 target_cpu=015
> yes-1169 [015] d..2. 4238.261761: sched_switch: prev_comm=yes prev_pid=1169 prev_prio=120 prev_state=R ==>
next_comm=turbostat next_pid=1185 next_prio=100
> migration/15-158 [015] d..3. 4238.261977: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=15 dest_cpu=0
> migration/0-20 [000] d..3. 4238.261991: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=0 dest_cpu=1
> migration/1-116 [001] d..3. 4238.262003: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=1 dest_cpu=2
> migration/2-25 [002] d..3. 4238.262018: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=2 dest_cpu=3
> migration/3-122 [003] d..3. 4238.262031: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=3 dest_cpu=4
> migration/4-31 [004] d..3. 4238.262044: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=4 dest_cpu=5
> migration/5-128 [005] d..3. 4238.262057: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=5 dest_cpu=6
> migration/6-37 [006] d..3. 4238.262071: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=6 dest_cpu=7
> migration/7-134 [007] d..3. 4238.262084: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=7 dest_cpu=8
> migration/8-43 [008] d..3. 4238.262097: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=8 dest_cpu=9
> migration/9-140 [009] d..3. 4238.262109: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=9 dest_cpu=10
> migration/10-49 [010] d..3. 4238.262123: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=10 dest_cpu=11
> migration/11-146 [011] d..3. 4238.262136: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=11 dest_cpu=12
> migration/12-55 [012] d..3. 4238.262150: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=12 dest_cpu=13
> migration/13-152 [013] d..3. 4238.262164: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=13 dest_cpu=14
> migration/14-62 [014] d..3. 4238.262177: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=14 dest_cpu=15
> yes-1169 [015] d..2. 4238.262182: sched_switch: prev_comm=yes prev_pid=1169 prev_prio=120 prev_state=R+ ==>
next_comm=turbostat next_pid=1185 next_prio=100
> turbostat-1185 [015] ..... 4238.262189: __x64_sys_gettimeofday: WHOOPSIE!
>
> The time between wakeup and whoopsie 4238.262189-4238.261759 = .000430
> or 430us, which doesn't seem excessive to me.
>
> Let me go read this turbostat code to figure out what exactly the
> trigger condition signifies. Because I'm not seeing nothing weird here.
I think the anomaly would have been about 1 second ago, on CPU 15,
and before entering sleep.
But after the previous call to the time of day stuff.
Somewhere in this code:
delta_platform(&platform_counters_even, &platform_counters_odd);
compute_average(ODD_COUNTERS);
format_all_counters(ODD_COUNTERS);
flush_output_stdout();
Please know that I ran a couple of tests yesterday for a total of about 8 hours
and never got a measured interval time >= 10 mSec.
I was using kernel 6.13, which includes your 2 patches, and I tried a slight
modification to the turbostat command:
sudo ./turbostat --quiet --Summary --show Busy%,Bzy_MHz,IRQ,PkgWatt,PkgTmp,TSC_MHz,Time_Of_Day_Seconds,usec --interval 1 --out
/dev/shm/turbo.log
That allowed me to acquire more than my ssh session history limit of about 9000 lines (seconds) and also eliminated ssh
communications.
It was on purpose that I used RAM to write the log file to.
Powered by blists - more mailing lists