linux-kernel - Re: [REGRESSION] Re: [PATCH 00/24] Complete EEVDF


Message-ID: <20250113110312.GD5388@noisy.programming.kicks-ass.net>
Date: Mon, 13 Jan 2025 12:03:12 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Doug Smythies <dsmythies@...us.net>
Cc: linux-kernel@...r.kernel.org, vincent.guittot@...aro.org,
	'Ingo Molnar' <mingo@...nel.org>, wuyun.abel@...edance.com
Subject: Re: [REGRESSION] Re: [PATCH 00/24] Complete EEVDF

On Sun, Jan 12, 2025 at 03:14:17PM -0800, Doug Smythies wrote:

> I tested the above patch on top of the previous patch.

That was indeed the intention.

> Multiple tests and multiple methods over many hours and 
> I never got any hit at all for a detected CPU migration greater than or
> equal to 10 milliseconds.
> Which is good news.

Right, my current trace threshold is set at 100ms, and I've let it run
with both patches applied over the entire weekend, and so far nothing.

So definitely progress.

> The test I have been running to create some of the graphs I have been
> attaching is a little different, using turbostat with different options:
> 
> turbostat --quiet --Summary --show Busy%,Bzy_MHz,IRQ,PkgWatt,PkgTmp,TSC_MHz,Time_Of_Day_Seconds --interval 1
> 
> And with this test I get intervals that exceed 1 second by more than
> 10 milliseconds. (I referred to this observation in the previous email.)

OK, almost but not quite there it seems.

> Third: Kernel 6.13-rc6+the first patch+the above patch:
> 
> 1.000000, 2034
> 1.001000, 2108
> 1.002000, 2030
> 1.003000, 2492
> 1.004000, 216
> 1.005000, 109
> 1.006000, 23
> 1.007000, 8
> 1.008000, 3
> 1.009000, 9
> 1.010000, 1
> 1.011000, 2
> 1.012000, 2
> 1.014000, 3
> 1.015000, 10
> 1.016000, 19
> 1.017000, 1
> 1.018000, 1
> 
> Total: 9071 : Total >= 10 mSec: 39 ( 0.43 percent)
> 
> Where, for example, this line:
> 
> 1.016000, 19
> 
> means that there were 19 occurrences of turbostat interval times
> between 1.016 and 1.016999 seconds.
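For readers following along, the binning described above can be sketched roughly as below. This is only an illustration of the arithmetic, not the script Doug actually used: the helper names (bin_floor_us, count_late) and the choice of integer microseconds are assumptions. Each interval falls into a 1 ms bin labelled by its lower edge, and the "Total >= 10 mSec" figure counts intervals that overshoot the nominal 1 second by at least 10 ms.

```c
#include <stddef.h>

/* Hypothetical helpers, not taken from the thread. */
#define BIN_WIDTH_US 1000L	/* 1 ms bins, matching the table above */

/*
 * Lower edge (in microseconds) of the bin holding interval_us.
 * Assumes interval_us is non-negative.
 */
static long bin_floor_us(long interval_us)
{
	return (interval_us / BIN_WIDTH_US) * BIN_WIDTH_US;
}

/* Count samples that exceed nominal_us by at least excess_us. */
static size_t count_late(const long *samples_us, size_t n,
			 long nominal_us, long excess_us)
{
	size_t i, late = 0;

	for (i = 0; i < n; i++)
		if (samples_us[i] - nominal_us >= excess_us)
			late++;

	return late;
}
```

With the table above, the bins from 1.010000 upward sum to 1+2+2+3+10+19+1+1 = 39 of 9071 samples, which is the quoted 0.43 percent.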

OK, let me lower my threshold to 10ms and change the turbostat
invocation -- see if I can catch me some wabbits :-)

FWIW, I'm using the below hackery to catch them wabbits.

---
diff --git a/kernel/time/time.c b/kernel/time/time.c
index 1b69caa87480..61ff330e068b 100644
--- a/kernel/time/time.c
+++ b/kernel/time/time.c
@@ -149,6 +149,12 @@ SYSCALL_DEFINE2(gettimeofday, struct __kernel_old_timeval __user *, tv,
 			return -EFAULT;
 	}
 	if (unlikely(tz != NULL)) {
+		if (tz == (void*)1) {
+			trace_printk("WHOOPSIE!\n");
+			tracing_off();
+			return 0;
+		}
+
 		if (copy_to_user(tz, &sys_tz, sizeof(sys_tz)))
 			return -EFAULT;
 	}
diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
index 58a487c225a7..baeac7388be2 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -67,6 +67,7 @@
 #include <stdbool.h>
 #include <assert.h>
 #include <linux/kernel.h>
+#include <sys/syscall.h>
 
 #define UNUSED(x) (void)(x)
 
@@ -2704,7 +2705,7 @@ int format_counters(struct thread_data *t, struct core_data *c, struct pkg_data
 		struct timeval tv;
 
 		timersub(&t->tv_end, &t->tv_begin, &tv);
-		outp += sprintf(outp, "%5ld\t", tv.tv_sec * 1000000 + tv.tv_usec);
+		outp += sprintf(outp, "%7ld\t", tv.tv_sec * 1000000 + tv.tv_usec);
 	}
 
 	/* Time_Of_Day_Seconds: on each row, print sec.usec last timestamp taken */
@@ -4570,12 +4571,14 @@ int get_counters(struct thread_data *t, struct core_data *c, struct pkg_data *p)
 	int i;
 	int status;
 
+	gettimeofday(&t->tv_begin, (struct timezone *)NULL); /* doug test */
+
 	if (cpu_migrate(cpu)) {
 		fprintf(outf, "%s: Could not migrate to CPU %d\n", __func__, cpu);
 		return -1;
 	}
 
-	gettimeofday(&t->tv_begin, (struct timezone *)NULL);
+//	gettimeofday(&t->tv_begin, (struct timezone *)NULL);
 
 	if (first_counter_read)
 		get_apic_id(t);
@@ -4730,6 +4733,15 @@ int get_counters(struct thread_data *t, struct core_data *c, struct pkg_data *p)
 done:
 	gettimeofday(&t->tv_end, (struct timezone *)NULL);
 
+	{
+		struct timeval tv;
+		u64 delta;
+		timersub(&t->tv_end, &t->tv_begin, &tv);
+		delta = tv.tv_sec * 1000000 + tv.tv_usec;
+		if (delta > 100000)
+			syscall(__NR_gettimeofday, &tv, (void*)1);
+	}
+
 	return 0;
 }
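The userspace half of the hack, pulled out of turbostat, looks roughly like the sketch below. The function names (elapsed_us, maybe_stop_trace) and the TRIGGER_THRESHOLD_US macro are made up for illustration; only the timersub() arithmetic and the raw gettimeofday syscall with tz == (void *)1 come from the diff. It assumes an architecture that defines __NR_gettimeofday (x86 does). On a kernel without the time.c hunk the magic call should simply fail with -EFAULT, so the sketch is harmless but useless there.

```c
#define _DEFAULT_SOURCE		/* for timersub() and syscall() */
#include <stdint.h>
#include <sys/time.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical name; 100ms, the same value as in the diff above. */
#define TRIGGER_THRESHOLD_US 100000ULL

/* Microseconds from *begin to *end; assumes end is not before begin. */
static uint64_t elapsed_us(const struct timeval *begin,
			   const struct timeval *end)
{
	struct timeval tv;

	timersub(end, begin, &tv);
	return (uint64_t)tv.tv_sec * 1000000ULL + (uint64_t)tv.tv_usec;
}

/*
 * If the measured span exceeds the threshold, poke the patched
 * gettimeofday() with tz == (void *)1 so the kernel side can log
 * and stop tracing. Returns 1 if the trigger was fired, 0 otherwise.
 */
static int maybe_stop_trace(const struct timeval *begin,
			    const struct timeval *end)
{
	struct timeval tv;

	if (elapsed_us(begin, end) <= TRIGGER_THRESHOLD_US)
		return 0;

	/* Raw syscall, as in the diff, rather than the libc wrapper. */
	syscall(__NR_gettimeofday, &tv, (void *)1);
	return 1;
}
```

To chase the 10ms outliers instead, the threshold would drop to 10000.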