Message-Id: <20240417115446.2908769-3-luogengkun@huaweicloud.com>
Date: Wed, 17 Apr 2024 11:54:46 +0000
From: Luo Gengkun <luogengkun@...weicloud.com>
To: peterz@...radead.org
Cc: mingo@...hat.com,
	acme@...nel.org,
	mark.rutland@....com,
	alexander.shishkin@...ux.intel.com,
	jolsa@...nel.org,
	namhyung@...nel.org,
	irogers@...gle.com,
	adrian.hunter@...el.com,
	linux-perf-users@...r.kernel.org,
	linux-kernel@...r.kernel.org,
	luogengkun@...weicloud.com
Subject: [PATCH RESEND v2 2/2] perf/core: Fix incorrect time diff in tick adjust period

Adrian found that the number of samples can occasionally be unexpectedly
small, which is caused by an unreasonably large sampling period.

 # taskset --cpu 0 perf record -F 1000 -e cs -- taskset --cpu 1 ./test
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.010 MB perf.data (204 samples) ]
 # perf script
 ...
 test   865   265.377846:         16 cs:  ffffffff832e927b schedule+0x2b
 test   865   265.378900:         15 cs:  ffffffff832e927b schedule+0x2b
 test   865   265.379845:         14 cs:  ffffffff832e927b schedule+0x2b
 test   865   265.380770:         14 cs:  ffffffff832e927b schedule+0x2b
 test   865   265.381647:         15 cs:  ffffffff832e927b schedule+0x2b
 test   865   265.382638:         16 cs:  ffffffff832e927b schedule+0x2b
 test   865   265.383647:         16 cs:  ffffffff832e927b schedule+0x2b
 test   865   265.384704:         15 cs:  ffffffff832e927b schedule+0x2b
 test   865   265.385649:         14 cs:  ffffffff832e927b schedule+0x2b
 test   865   265.386578:        152 cs:  ffffffff832e927b schedule+0x2b
 test   865   265.396383:        154 cs:  ffffffff832e927b schedule+0x2b
 test   865   265.406183:        154 cs:  ffffffff832e927b schedule+0x2b
 test   865   265.415839:        154 cs:  ffffffff832e927b schedule+0x2b
 test   865   265.425445:        154 cs:  ffffffff832e927b schedule+0x2b
 test   865   265.435052:        154 cs:  ffffffff832e927b schedule+0x2b
 test   865   265.444708:        154 cs:  ffffffff832e927b schedule+0x2b
 test   865   265.454314:        154 cs:  ffffffff832e927b schedule+0x2b
 test   865   265.463970:        154 cs:  ffffffff832e927b schedule+0x2b
 test   865   265.473577:        154 cs:  ffffffff832e927b schedule+0x2b
 ...

It seems that the timer interrupt does not arrive every TICK_NSEC when
the system is idle. For example, the counter may increase by n during
2 * TICK_NSEC, but perf_adjust_period is called with n and TICK_NSEC, so
the period it calculates is bigger than the expected one. What's more,
if the time between overflows is larger than 2 * TICK_NSEC, the period
cannot be tuned in __perf_event_account_interrupt either. To fix this
problem, perf can measure the tick interval by itself.
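
To illustrate the arithmetic, here is a rough userspace sketch (not
kernel code; the formula is a simplification of perf_calculate_period(),
TICK_NSEC assumes HZ=1000, and the event counts are chosen to match the
output above):

 /*
  * Simplified model of the frequency-mode period calculation:
  *
  *	period ~= count * NSEC_PER_SEC / (nsec * sample_freq)
  */
 #include <stdio.h>
 #include <stdint.h>

 #define NSEC_PER_SEC	1000000000ULL
 #define TICK_NSEC	1000000ULL	/* assuming HZ=1000 */

 static uint64_t calc_period(uint64_t count, uint64_t nsec, uint64_t freq)
 {
 	return count * NSEC_PER_SEC / (nsec * freq);
 }

 int main(void)
 {
 	uint64_t freq = 1000;	/* perf record -F 1000 */
 	uint64_t count = 30;	/* events seen since the last stamp */

 	/*
 	 * The tick arrived late: the 30 events accumulated over
 	 * 2 * TICK_NSEC, but the old code still passes TICK_NSEC.
 	 */
 	printf("period from real elapsed time: %llu\n",	/* prints 15 */
 	       (unsigned long long)calc_period(count, 2 * TICK_NSEC, freq));
 	printf("period assuming TICK_NSEC:     %llu\n",	/* prints 30 */
 	       (unsigned long long)calc_period(count, TICK_NSEC, freq));
 	return 0;
 }

With the real elapsed time the new period comes out as 15, matching the
healthy samples above; assuming TICK_NSEC doubles it, and a longer idle
gap inflates it further (a ~10-tick gap would roughly explain the jump
from ~15 to ~154 seen above).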

Signed-off-by: Luo Gengkun <luogengkun@...weicloud.com>
---
 include/linux/perf_event.h |  1 +
 kernel/events/core.c       | 15 ++++++++++++---
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index afb028c54f33..2708f1d0692c 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -265,6 +265,7 @@ struct hw_perf_event {
 	 * State for freq target events, see __perf_event_overflow() and
 	 * perf_adjust_freq_unthr_context().
 	 */
+	u64				freq_tick_stamp;
 	u64				freq_time_stamp;
 	u64				freq_count_stamp;
 #endif
diff --git a/kernel/events/core.c b/kernel/events/core.c
index cad50d3439f1..0f2025d631aa 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -4112,7 +4112,7 @@ perf_adjust_freq_unthr_context(struct perf_event_context *ctx, bool unthrottle)
 {
 	struct perf_event *event;
 	struct hw_perf_event *hwc;
-	u64 now, period = TICK_NSEC;
+	u64 now, period, tick_stamp;
 	s64 delta;
 
 	/*
@@ -4151,6 +4151,10 @@ perf_adjust_freq_unthr_context(struct perf_event_context *ctx, bool unthrottle)
 		 */
 		event->pmu->stop(event, PERF_EF_UPDATE);
 
+		tick_stamp = perf_clock();
+		period = tick_stamp - hwc->freq_tick_stamp;
+		hwc->freq_tick_stamp = tick_stamp;
+
 		now = local64_read(&event->count);
 		delta = now - hwc->freq_count_stamp;
 		hwc->freq_count_stamp = now;
@@ -4162,8 +4166,13 @@ perf_adjust_freq_unthr_context(struct perf_event_context *ctx, bool unthrottle)
 		 * to perf_adjust_period() to avoid stopping it
 		 * twice.
 		 */
-		if (delta > 0)
-			perf_adjust_period(event, period, delta, false);
+		if (delta > 0) {
+			/*
+			 * Skip the first tick: freq_tick_stamp is 0, so period == tick_stamp
+			 */
+			if (likely(period != tick_stamp))
+				perf_adjust_period(event, period, delta, false);
+		}
 
 		event->pmu->start(event, delta > 0 ? PERF_EF_RELOAD : 0);
 	next:
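
For reference, a minimal illustration of the period != tick_stamp guard
(it assumes hwc->freq_tick_stamp starts out zeroed when the event is
allocated, so the very first tick sees period == tick_stamp):

 #include <assert.h>
 #include <stdint.h>

 int main(void)
 {
 	uint64_t freq_tick_stamp = 0;		/* zeroed at event allocation */
 	uint64_t tick_stamp = 123456789;	/* perf_clock() at the first tick */
 	uint64_t period = tick_stamp - freq_tick_stamp;

 	/*
 	 * On the first tick the computed "period" is the raw clock value,
 	 * not a real interval, so the guard skips perf_adjust_period().
 	 */
 	assert(period == tick_stamp);
 	return 0;
 }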
-- 
2.34.1

