linux-kernel - RE: [REGRESSION] Re: [PATCH 00/24] Complete EEVDF

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <004901db6a06$59b12050$0d1360f0$@telus.net>
Date: Sat, 18 Jan 2025 16:09:02 -0800
From: "Doug Smythies" <dsmythies@...us.net>
To: "'Peter Zijlstra'" <peterz@...radead.org>
Cc: <linux-kernel@...r.kernel.org>,
	<vincent.guittot@...aro.org>,
	"'Ingo Molnar'" <mingo@...nel.org>,
	<wuyun.abel@...edance.com>,
	"Doug Smythies" <dsmythies@...us.net>
Subject: RE: [REGRESSION] Re: [PATCH 00/24] Complete EEVDF

Hi Peter,

An update.

On 2025.01.14 02:59 Peter Zijlstra wrote:
> On Mon, Jan 13, 2025 at 12:03:12PM +0100, Peter Zijlstra wrote:
>> On Sun, Jan 12, 2025 at 03:14:17PM -0800, Doug Smythies wrote:

>>> means that there were 19 occurrences of turbostat interval times
>>> between 1.016 and 1.016999 seconds.
>> 
>> OK, let me lower my threshold to 10ms and change the turbostat
>> invocation -- see if I can catch me some wabbits :-)
>
> I've had it run overnight and have not caught a single >10ms event :-(

Okay, so both you and I have many many hours of testing and
never see >= 10ms in that area of the turbostat code anymore.

The lingering >= 10ms (but I have never seen more than 25 ms)
is outside of that timing. As previously reported, I thought it might
be in the sampling interval sleep step, but I did a bunch of testing
and it doesn't appear to be there. That leaves:

delta_platform(&platform_counters_even, &platform_counters_odd);
compute_average(ODD_COUNTERS);
format_all_counters(ODD_COUNTERS);
flush_output_stdout();

I modified your tracing trigger thing in turbostat to this:

doug@s19:~/kernel/linux/tools/power/x86/turbostat$ git diff turbostat.c
diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
index 58a487c225a7..777efb64a754 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -67,6 +67,7 @@
 #include <stdbool.h>
 #include <assert.h>
 #include <linux/kernel.h>
+#include <sys/syscall.h>

 #define UNUSED(x) (void)(x)

@@ -2704,7 +2705,7 @@ int format_counters(struct thread_data *t, struct core_data *c, struct pkg_data
                struct timeval tv;

                timersub(&t->tv_end, &t->tv_begin, &tv);
-               outp += sprintf(outp, "%5ld\t", tv.tv_sec * 1000000 + tv.tv_usec);
+               outp += sprintf(outp, "%7ld\t", tv.tv_sec * 1000000 + tv.tv_usec);
        }

        /* Time_Of_Day_Seconds: on each row, print sec.usec last timestamp taken */
@@ -2713,6 +2714,11 @@ int format_counters(struct thread_data *t, struct core_data *c, struct pkg_data

        interval_float = t->tv_delta.tv_sec + t->tv_delta.tv_usec / 1000000.0;

+       double requested_interval = (double) interval_tv.tv_sec + (double) interval_tv.tv_usec / 1000000.0;
+
+       if(interval_float >= (requested_interval + 0.01))  /* was the last interval over by more than 10 mSec? */
+               syscall(__NR_gettimeofday, &tv_delta, (void*)1);
+
        tsc = t->tsc * tsc_tweak;

        /* topo columns, print blanks on 1st (average) line */
@@ -4570,12 +4576,14 @@ int get_counters(struct thread_data *t, struct core_data *c, struct pkg_data *p)
        int i;
        int status;

+       gettimeofday(&t->tv_begin, (struct timezone *)NULL); /* doug test */
+
        if (cpu_migrate(cpu)) {
                fprintf(outf, "%s: Could not migrate to CPU %d\n", __func__, cpu);
                return -1;
        }

-       gettimeofday(&t->tv_begin, (struct timezone *)NULL);
+//     gettimeofday(&t->tv_begin, (struct timezone *)NULL);

        if (first_counter_read)
                get_apic_id(t);

And so that I could prove a correlation with the trace times
and to my graph times I also did not turn off tracing upon a hit:

doug@s19:~/kernel/linux$ git diff kernel/time/time.c
diff --git a/kernel/time/time.c b/kernel/time/time.c
index 1b69caa87480..fb84915159cc 100644
--- a/kernel/time/time.c
+++ b/kernel/time/time.c
@@ -149,6 +149,12 @@ SYSCALL_DEFINE2(gettimeofday, struct __kernel_old_timeval __user *, tv,
                        return -EFAULT;
        }
        if (unlikely(tz != NULL)) {
+               if (tz == (void*)1) {
+                       trace_printk("WHOOPSIE!\n");
+//                     tracing_off();
+                       return 0;
+               }
+
                if (copy_to_user(tz, &sys_tz, sizeof(sys_tz)))
                        return -EFAULT;
        }

I ran a test for about 1 hour and 28 minutes.
The data in the trace correlates with turbostat line by line
TOD differentials. Trace got:

turbostat-1370    [011] .....   751.738151: __x64_sys_gettimeofday: WHOOPSIE!
turbostat-1370    [011] .....   760.763184: __x64_sys_gettimeofday: WHOOPSIE!
turbostat-1370    [011] .....  1362.788298: __x64_sys_gettimeofday: WHOOPSIE!
turbostat-1370    [011] .....  1365.815332: __x64_sys_gettimeofday: WHOOPSIE!
turbostat-1370    [011] .....  1366.836340: __x64_sys_gettimeofday: WHOOPSIE!
turbostat-1370    [011] .....  1367.856355: __x64_sys_gettimeofday: WHOOPSIE!
turbostat-1370    [011] .....  1368.867365: __x64_sys_gettimeofday: WHOOPSIE!
turbostat-1370    [011] .....  1373.893423: __x64_sys_gettimeofday: WHOOPSIE!
turbostat-1370    [011] .....  1374.910439: __x64_sys_gettimeofday: WHOOPSIE!
turbostat-1370    [011] .....  1377.928469: __x64_sys_gettimeofday: WHOOPSIE!
turbostat-1370    [011] .....  1378.941483: __x64_sys_gettimeofday: WHOOPSIE!
turbostat-1370    [011] .....  1379.959490: __x64_sys_gettimeofday: WHOOPSIE!
turbostat-1370    [011] .....  1382.982525: __x64_sys_gettimeofday: WHOOPSIE!
turbostat-1370    [011] .....  1385.005548: __x64_sys_gettimeofday: WHOOPSIE!
turbostat-1370    [011] .....  1386.019561: __x64_sys_gettimeofday: WHOOPSIE!
turbostat-1370    [011] .....  1387.030572: __x64_sys_gettimeofday: WHOOPSIE!
turbostat-1370    [011] .....  1398.097683: __x64_sys_gettimeofday: WHOOPSIE!
turbostat-1370    [011] .....  1620.752963: __x64_sys_gettimeofday: WHOOPSIE!
turbostat-1370    [011] .....  1621.772969: __x64_sys_gettimeofday: WHOOPSIE!
turbostat-1370    [011] .....  1622.788972: __x64_sys_gettimeofday: WHOOPSIE!
turbostat-1370    [011] .....  1697.022098: __x64_sys_gettimeofday: WHOOPSIE!
turbostat-1370    [011] .....  1703.071104: __x64_sys_gettimeofday: WHOOPSIE!
turbostat-1370    [011] .....  1704.088103: __x64_sys_gettimeofday: WHOOPSIE!
turbostat-1370    [011] .....  1705.105107: __x64_sys_gettimeofday: WHOOPSIE!
turbostat-1370    [011] .....  1706.116106: __x64_sys_gettimeofday: WHOOPSIE!
turbostat-1370    [011] .....  1707.126107: __x64_sys_gettimeofday: WHOOPSIE!

Going back to some old test data from when the CPU migration in turbostat
often took up to 6 seconds. If I subtract that migration time from the measured
interval time, I get a lot of samples between 10 and 23 ms.

I am saying there were 2 different issues. The 2nd was hidden by the 1st
because its magnitude was about 260 times less.

I do not know if my trace is any use. I'll compress it and send it to you only, off list.
My trace is as per this older email:

https://lore.kernel.org/all/20240727105030.226163742@infradead.org/T/#m453062b267551ff4786d33a2eb5f326f92241e96