Message-ID: <001b01db608a$56d3dc40$047b94c0$@telus.net>
Date: Mon, 6 Jan 2025 14:28:40 -0800
From: "Doug Smythies" <dsmythies@...us.net>
To: "'Peter Zijlstra'" <peterz@...radead.org>
Cc: <linux-kernel@...r.kernel.org>,
<vincent.guittot@...aro.org>,
"Doug Smythies" <dsmythies@...us.net>
Subject: RE: [REGRESSION] Re: [PATCH 00/24] Complete EEVDF
On 2025.01.06 09:05 Peter Zijlstra wrote:
> On Mon, Jan 06, 2025 at 05:59:32PM +0100, Peter Zijlstra wrote:
>> On Mon, Jan 06, 2025 at 07:01:34AM -0800, Doug Smythies wrote:
>>
>>>> What is the easiest 100% load you're seeing this with?
>>>
>>> Lately, and specifically so that I can tell others how to reproduce it, I have been using:
>>>
>>> yes > /dev/null &
>>>
>>> On my Intel i5-10600K, with 6 cores and 2 threads per core, 12 CPUs,
>>> I run 12 of those workloads.
>>
>> On my headless ivb-ep 2 sockets, 10 cores each and 2 threads per core, I
>> do:
>>
>> for ((i=0; i<40; i++)) ; do yes > /dev/null & done
>> tools/power/x86/turbostat/turbostat --quiet --Summary --show Busy%,Bzy_MHz,IRQ,PkgWatt,PkgTmp,TSC_MHz --interval 1
>>
>> But so far, nada :-( I've tried with full preemption and voluntary,
>> HZ=1000.
My HZ=1000 also. And: CONFIG_NO_HZ_FULL=y
>
> And just as I send this, I see these happen:
>
> 100.00 3100 2793 40302 71 195.22
> 100.00 3100 2618 40459 72 183.58
> 100.00 3100 2993 46215 71 209.21
> 100.00 3100 2789 40467 71 195.19
> 99.92 3100 2798 40589 71 195.76
> 100.00 3100 2793 40397 72 195.46
> ...
> 100.00 3100 2844 41906 71 199.43
> 100.00 3100 2779 40468 71 194.51
> 99.96 3100 2320 40933 71 163.23
> 100.00 3100 3529 61823 72 245.70
> 100.00 3100 2793 40493 72 195.45
> 100.00 3100 2793 40462 72 195.56
>
> They look like funny little blips. Nowhere near as bad as you had
> though.
Yes, I get a lot of the lesser magnitude ones.
The large magnitude ones are very much a function of what else is running.
If I just add a 0.5% load at a 73 hertz work/sleep frequency, then over a 2 hour and
31 minute test I got a maximum interval time of 1.68 seconds.
Without that small perturbation I got tons of interval times of 7 seconds,
meaning the regular 1 second interval plus 6 seconds for the CPU migration.
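For reference, that perturbing load is just a periodic work/sleep loop.
A minimal sketch (not my actual load generator, just the 0.5% duty cycle
at 73 hertz described above):

#include <stdint.h>
#include <time.h>

static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

int main(void)
{
	const uint64_t period_ns = 1000000000ULL / 73;	/* 73 hertz, ~13.7 ms period */
	const uint64_t busy_ns = period_ns / 200;	/* 0.5% duty, ~68 us of work */

	for (;;) {
		uint64_t start = now_ns();
		struct timespec sleep_ts = { 0, (long)(period_ns - busy_ns) };

		while (now_ns() - start < busy_ns)
			;			/* the "work": just spin */
		nanosleep(&sleep_ts, NULL);	/* the "sleep" */
	}
	return 0;
}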
Since I cannot seem to function without making a graph, some example graphs
are attached.
By the way, and to make it easier to step away while tests run, I am now using this
turbostat command:
doug@s19:~/kernel/linux/tools/power/x86/turbostat$ sudo ./turbostat --quiet --show Busy%,IRQ,Time_Of_Day_Seconds,CPU,usec --interval 1 | grep "^[1-9]"
6005701 1736201357.014741 - 99.76 12034
177731 1736201386.221771 - 99.76 12034
6003699 1736201393.226740 - 99.76 14167
6003704 1736201422.253743 - 99.76 12040
6005700 1736201447.278740 - 99.76 12030
311699 1736201475.816740 - 99.76 12033
That will show whenever the counter collection took over 10 milliseconds,
which in practice means a CPU migration stalled for that long.
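The 10 millisecond threshold falls out of the print format: the usec column
is printed with "%5ld\t" (see the format_counters() diff below), so values
under 10000 microseconds get a leading space and fail the grep. A trivial
check of just that formatting:

#include <stdio.h>

int main(void)
{
	/* Under 10000 us the 5-wide field gains a leading space, so the
	 * line fails grep "^[1-9]"; from 10 ms up it starts with a digit. */
	printf("%5ld\n", 9999L);	/* " 9999" - filtered out */
	printf("%5ld\n", 10000L);	/* "10000" - matched */
	return 0;
}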
If you want to go further, for example to only display samples that took over
a second and to include the target CPU, then patch turbostat: take the tv_begin
timestamp before the cpu_migrate() call, so each CPU's usec column includes its
migration wait, and widen the field to 7 characters so only values of a second
or more start the line with a digit:
doug@s19:~/kernel/linux/tools/power/x86/turbostat$ git diff
diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
index 58a487c225a7..f8a73cc8fbfc 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -2704,7 +2704,7 @@ int format_counters(struct thread_data *t, struct core_data *c, struct pkg_data *p)
 		struct timeval tv;
 
 		timersub(&t->tv_end, &t->tv_begin, &tv);
-		outp += sprintf(outp, "%5ld\t", tv.tv_sec * 1000000 + tv.tv_usec);
+		outp += sprintf(outp, "%7ld\t", tv.tv_sec * 1000000 + tv.tv_usec);
 	}
 
 	/* Time_Of_Day_Seconds: on each row, print sec.usec last timestamp taken */
@@ -4570,12 +4570,14 @@ int get_counters(struct thread_data *t, struct core_data *c, struct pkg_data *p)
 	int i;
 	int status;
 
+	gettimeofday(&t->tv_begin, (struct timezone *)NULL);	/* doug test */
+
 	if (cpu_migrate(cpu)) {
 		fprintf(outf, "%s: Could not migrate to CPU %d\n", __func__, cpu);
 		return -1;
 	}
 
-	gettimeofday(&t->tv_begin, (struct timezone *)NULL);
+//	gettimeofday(&t->tv_begin, (struct timezone *)NULL);
 
 	if (first_counter_read)
 		get_apic_id(t);
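The point of the move: tv_begin now brackets the affinity change, so a slow
migration lands in that row's usec column. Roughly this shape, as a sketch
rather than the actual turbostat code (its cpu_migrate() is essentially a
sched_setaffinity() wrapper):

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <sys/time.h>

static long collection_us(int cpu)
{
	struct timeval begin, end, delta;
	cpu_set_t mask;

	gettimeofday(&begin, NULL);	/* patched: taken before the migrate */

	CPU_ZERO(&mask);
	CPU_SET(cpu, &mask);
	if (sched_setaffinity(0, sizeof(mask), &mask))	/* the migrate itself */
		return -1;

	/* ... the per-CPU counters would be read here ... */

	gettimeofday(&end, NULL);
	timersub(&end, &begin, &delta);
	return delta.tv_sec * 1000000 + delta.tv_usec;	/* the usec column */
}

int main(void)
{
	printf("collection on CPU 11 took %ld us\n", collection_us(11));
	return 0;
}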
Example output:
sudo ./turbostat --quiet --show Busy%,IRQ,Time_Of_Day_Seconds,CPU,usec --interval 1 | grep "^[1-9]"
1330709 1736202049.987740 - 99.76 12040
1330603 1736202049.987740 11 99.76 1003
6008710 1736202068.008741 - 99.76 12030
6008601 1736202068.008741 11 99.76 1003
2003709 1736202120.936740 - 99.76 12028
2003603 1736202120.936740 11 99.76 1002
6005710 1736202140.956741 - 99.76 12028
6005604 1736202140.956741 11 99.76 1002
In this short example, all captures were for the CPU 5 to 11 migration:
two at 6 seconds, one at 1.33 seconds, and one at 2 seconds.
I'll try, and report on, your test patch from the other email later.
Attachments (image/png): turbostat-sampling-issue-seconds.png,
turbostat-sampling-issue-seconds-detail-a.png,
turbostat-sampling-issue-seconds-detail-b.png