Message-ID: <001b01db608a$56d3dc40$047b94c0$@telus.net>
Date: Mon, 6 Jan 2025 14:28:40 -0800
From: "Doug Smythies" <dsmythies@...us.net>
To: "'Peter Zijlstra'" <peterz@...radead.org>
Cc: <linux-kernel@...r.kernel.org>,
<vincent.guittot@...aro.org>,
"Doug Smythies" <dsmythies@...us.net>
Subject: RE: [REGRESSION] Re: [PATCH 00/24] Complete EEVDF
On 2025.01.06 09:05 Peter Zijlstra wrote:
> On Mon, Jan 06, 2025 at 05:59:32PM +0100, Peter Zijlstra wrote:
>> On Mon, Jan 06, 2025 at 07:01:34AM -0800, Doug Smythies wrote:
>>
>>>> What is the easiest 100% load you're seeing this with?
>>>
>>> Lately, and specifically so that I can tell others how to reproduce it, I have been using:
>>>
>>> yes > /dev/null &
>>>
>>> On my Intel i5-10600K, with 6 cores and 2 threads per core, 12 CPUs,
>>> I run 12 of those workloads.
>>
>> On my headless ivb-ep 2 sockets, 10 cores each and 2 threads per core, I
>> do:
>>
>> for ((i=0; i<40; i++)) ; do yes > /dev/null & done
>> tools/power/x86/turbostat/turbostat --quiet --Summary --show Busy%,Bzy_MHz,IRQ,PkgWatt,PkgTmp,TSC_MHz --interval 1
>>
>> But so far, nada :-( I've tried with full preemption and voluntary,
>> HZ=1000.
My HZ=1000 also. And: CONFIG_NO_HZ_FULL=y
>
> And just as I send this, I see these happen:
>
> 100.00 3100 2793 40302 71 195.22
> 100.00 3100 2618 40459 72 183.58
> 100.00 3100 2993 46215 71 209.21
> 100.00 3100 2789 40467 71 195.19
> 99.92 3100 2798 40589 71 195.76
> 100.00 3100 2793 40397 72 195.46
> ...
> 100.00 3100 2844 41906 71 199.43
> 100.00 3100 2779 40468 71 194.51
> 99.96 3100 2320 40933 71 163.23
> 100.00 3100 3529 61823 72 245.70
> 100.00 3100 2793 40493 72 195.45
> 100.00 3100 2793 40462 72 195.56
>
> They look like funny little blips. Nowhere near as bad as you had
> though.
Yes, I get a lot of the lesser magnitude ones.
The large magnitude ones are very much a function of what else is running.
If I just add a 0.5% load at a 73 hertz work/sleep frequency, then over a 2 hour and
31 minute test I got a maximum interval time of 1.68 seconds.
Without that small perturbation I got tons of interval times of 7 seconds,
meaning the regular 1 second interval plus 6 seconds for the CPU migration.
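For reference, that perturbing load is just a periodic work/sleep loop.
A minimal sketch (not my actual load generator, just the 0.5% duty cycle
at 73 hertz described above):

#include <stdint.h>
#include <time.h>

static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

int main(void)
{
	const uint64_t period_ns = 1000000000ULL / 73;	/* 73 hertz, ~13.7 ms period */
	const uint64_t busy_ns = period_ns / 200;	/* 0.5% duty, ~68 us of work */

	for (;;) {
		uint64_t start = now_ns();
		struct timespec sleep_ts = { 0, (long)(period_ns - busy_ns) };

		while (now_ns() - start < busy_ns)
			;			/* the "work": just spin */
		nanosleep(&sleep_ts, NULL);	/* the "sleep" */
	}
	return 0;
}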
Since I cannot seem to function without making a graph, some example graphs
are attached.
By the way, and to make it easier to step away while tests run, I am now using this
turbostat command:
doug@s19:~/kernel/linux/tools/power/x86/turbostat$ sudo ./turbostat --quiet --show Busy%,IRQ,Time_Of_Day_Seconds,CPU,usec --interval 1 | grep "^[1-9]"
6005701 1736201357.014741 - 99.76 12034
177731 1736201386.221771 - 99.76 12034
6003699 1736201393.226740 - 99.76 14167
6003704 1736201422.253743 - 99.76 12040
6005700 1736201447.278740 - 99.76 12030
311699 1736201475.816740 - 99.76 12033
That will show whenever the counter collection took over 10 milliseconds,
which in practice means a CPU migration stalled for that long.
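The 10 millisecond threshold falls out of the print format: the usec column
is printed with "%5ld\t" (see the format_counters() diff below), so values
under 10000 microseconds get a leading space and fail the grep. A trivial
check of just that formatting:

#include <stdio.h>

int main(void)
{
	/* Under 10000 us the 5-wide field gains a leading space, so the
	 * line fails grep "^[1-9]"; from 10 ms up it starts with a digit. */
	printf("%5ld\n", 9999L);	/* " 9999" - filtered out */
	printf("%5ld\n", 10000L);	/* "10000" - matched */
	return 0;
}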
If you want to go further, for example to only display samples that took over
a second and to include the target CPU, then patch turbostat: take the tv_begin
timestamp before the cpu_migrate() call, so each CPU's usec column includes its
migration wait, and widen the field to 7 characters so only values of a second
or more start the line with a digit:
doug@s19:~/kernel/linux/tools/power/x86/turbostat$ git diff
diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
index 58a487c225a7..f8a73cc8fbfc 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -2704,7 +2704,7 @@ int format_counters(struct thread_data *t, struct core_data *c, struct pkg_data *p)
 		struct timeval tv;
 
 		timersub(&t->tv_end, &t->tv_begin, &tv);
-		outp += sprintf(outp, "%5ld\t", tv.tv_sec * 1000000 + tv.tv_usec);
+		outp += sprintf(outp, "%7ld\t", tv.tv_sec * 1000000 + tv.tv_usec);
 	}
 
 	/* Time_Of_Day_Seconds: on each row, print sec.usec last timestamp taken */
@@ -4570,12 +4570,14 @@ int get_counters(struct thread_data *t, struct core_data *c, struct pkg_data *p)
 	int i;
 	int status;
 
+	gettimeofday(&t->tv_begin, (struct timezone *)NULL);	/* doug test */
+
 	if (cpu_migrate(cpu)) {
 		fprintf(outf, "%s: Could not migrate to CPU %d\n", __func__, cpu);
 		return -1;
 	}
 
-	gettimeofday(&t->tv_begin, (struct timezone *)NULL);
+//	gettimeofday(&t->tv_begin, (struct timezone *)NULL);
 
 	if (first_counter_read)
 		get_apic_id(t);
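The point of the move: tv_begin now brackets the affinity change, so a slow
migration lands in that row's usec column. Roughly this shape, as a sketch
rather than the actual turbostat code (its cpu_migrate() is essentially a
sched_setaffinity() wrapper):

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <sys/time.h>

static long collection_us(int cpu)
{
	struct timeval begin, end, delta;
	cpu_set_t mask;

	gettimeofday(&begin, NULL);	/* patched: taken before the migrate */

	CPU_ZERO(&mask);
	CPU_SET(cpu, &mask);
	if (sched_setaffinity(0, sizeof(mask), &mask))	/* the migrate itself */
		return -1;

	/* ... the per-CPU counters would be read here ... */

	gettimeofday(&end, NULL);
	timersub(&end, &begin, &delta);
	return delta.tv_sec * 1000000 + delta.tv_usec;	/* the usec column */
}

int main(void)
{
	printf("collection on CPU 11 took %ld us\n", collection_us(11));
	return 0;
}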
Example output:
sudo ./turbostat --quiet --show Busy%,IRQ,Time_Of_Day_Seconds,CPU,usec --interval 1 | grep "^[1-9]"
1330709 1736202049.987740 - 99.76 12040
1330603 1736202049.987740 11 99.76 1003
6008710 1736202068.008741 - 99.76 12030
6008601 1736202068.008741 11 99.76 1003
2003709 1736202120.936740 - 99.76 12028
2003603 1736202120.936740 11 99.76 1002
6005710 1736202140.956741 - 99.76 12028
6005604 1736202140.956741 11 99.76 1002
In this short example, all captures were for the CPU 5 to 11 migration:
two at 6 seconds, one at 1.33 seconds, and one at 2 seconds.
I'll try, and report on, your test patch from the other email later.
Attachments (image/png): turbostat-sampling-issue-seconds.png,
turbostat-sampling-issue-seconds-detail-a.png,
turbostat-sampling-issue-seconds-detail-b.png