[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <00e301db6e19$52e3cc20$f8ab6460$@telus.net>
Date: Thu, 23 Jan 2025 20:34:57 -0800
From: "Doug Smythies" <dsmythies@...us.net>
To: "'Peter Zijlstra'" <peterz@...radead.org>
Cc: <linux-kernel@...r.kernel.org>,
<vincent.guittot@...aro.org>,
"'Ingo Molnar'" <mingo@...nel.org>,
<wuyun.abel@...edance.com>,
"Doug Smythies" <dsmythies@...us.net>
Subject: RE: [REGRESSION] Re: [PATCH 00/24] Complete EEVDF
On 2025.01.21 Doug Smythies wrote:
> On 2025.01.21 03:22 Peter Zijlstra wrote:
>> On Tue, Jan 21, 2025 at 09:49:08AM +0100, Peter Zijlstra wrote:
>>
>>>> I modified your tracing trigger thing in turbostat to this:
>>>
>>> Shiny!
>>>
>>> What turbostat invocation do I use? I think the last I had was:
>>>
>>> tools/power/x86/turbostat/turbostat --quiet --show Busy%,IRQ,Time_Of_Day_Seconds,CPU,usec --interval 1
>>>
>>> I've started a new run of yes-vs-turbostate with the modified trigger
>>> condition. Lets see what pops out.
>>
>> Ok, I have a trace.o
>>
>> So I see turbostat wake up on CPU 15, do its migration round 0-15 and
>> when its back at 15 it prints the WHOOPSIE.
>>
>> (trimmed trace):
>>
>> yes-1169 [015] dNh4. 4238.261759: sched_wakeup: comm=turbostat pid=1185 prio=100 target_cpu=015
>> yes-1169 [015] d..2. 4238.261761: sched_switch: prev_comm=yes prev_pid=1169 prev_prio=120 prev_state=R ==>
> next_comm=turbostat next_pid=1185 next_prio=100
>> migration/15-158 [015] d..3. 4238.261977: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=15 dest_cpu=0
>> migration/0-20 [000] d..3. 4238.261991: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=0 dest_cpu=1
>> migration/1-116 [001] d..3. 4238.262003: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=1 dest_cpu=2
>> migration/2-25 [002] d..3. 4238.262018: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=2 dest_cpu=3
>> migration/3-122 [003] d..3. 4238.262031: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=3 dest_cpu=4
>> migration/4-31 [004] d..3. 4238.262044: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=4 dest_cpu=5
>> migration/5-128 [005] d..3. 4238.262057: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=5 dest_cpu=6
>> migration/6-37 [006] d..3. 4238.262071: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=6 dest_cpu=7
>> migration/7-134 [007] d..3. 4238.262084: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=7 dest_cpu=8
>> migration/8-43 [008] d..3. 4238.262097: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=8 dest_cpu=9
>> migration/9-140 [009] d..3. 4238.262109: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=9 dest_cpu=10
>> migration/10-49 [010] d..3. 4238.262123: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=10 dest_cpu=11
>> migration/11-146 [011] d..3. 4238.262136: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=11 dest_cpu=12
>> migration/12-55 [012] d..3. 4238.262150: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=12 dest_cpu=13
>> migration/13-152 [013] d..3. 4238.262164: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=13 dest_cpu=14
>> migration/14-62 [014] d..3. 4238.262177: sched_migrate_task: comm=turbostat pid=1185 prio=100 orig_cpu=14 dest_cpu=15
>> yes-1169 [015] d..2. 4238.262182: sched_switch: prev_comm=yes prev_pid=1169 prev_prio=120 prev_state=R+ ==>
> next_comm=turbostat next_pid=1185 next_prio=100
>> turbostat-1185 [015] ..... 4238.262189: __x64_sys_gettimeofday: WHOOPSIE!
>>
>> The time between wakeup and whoopsie 4238.262189-4238.261759 = .000430
>> or 430us, which doesn't seem excessive to me.
>>
>> Let me go read this turbostat code to figure out what exactly the
>> trigger condition signifies. Because I'm not seeing nothing weird here.
>
> I think the anomaly would have been about 1 second ago, on CPU 15,
> and before entering sleep.
> But after the previous call to the time of day stuff.
>
> Somewhere in this code:
>
> delta_platform(&platform_counters_even, &platform_counters_odd);
> compute_average(ODD_COUNTERS);
> format_all_counters(ODD_COUNTERS);
> flush_output_stdout();
>
> Please know that I ran a couple of tests yesterday for a total of about 8 hours
> and never got a measured interval time >= 10 mSec.
> I was using kernel 6.13, which includes your 2 patches, and I tried a slight
> modification to the turbostat command:
>
> sudo ./turbostat --quiet --Summary --show Busy%,Bzy_MHz,IRQ,PkgWatt,PkgTmp,TSC_MHz,Time_Of_Day_Seconds,usec --interval 1 --out
/dev/shm/turbo.log
>
> That allowed me to acquire more than my ssh session history limit of about 9000 lines (seconds) and also eliminated ssh
> communications.
> It was on purpose that I used RAM to write the log file to.
I have run more tests over the last couple of days, totalling over 30 hours.
I simply do not get a measured interval time >= 10mSec using kernel 6.13.
The previous work was kernel 6.13-rc6 + the 2 patches + the tracing stuff.
I never tried kernel 6.13-rc7.
Powered by blists - more mailing lists