Message-ID: <ZitPTnLnHmIX4NWL@xsang-OptiPlex-9020>
Date: Fri, 26 Apr 2024 14:53:02 +0800
From: Oliver Sang <oliver.sang@...el.com>
To: Anna-Maria Behnsen <anna-maria@...utronix.de>
CC: "oe-lkp@...ts.linux.dev" <oe-lkp@...ts.linux.dev>, lkp <lkp@...el.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, "Thomas
Gleixner" <tglx@...utronix.de>, "Huang, Ying" <ying.huang@...el.com>, "Tang,
Feng" <feng.tang@...el.com>, "Yin, Fengwei" <fengwei.yin@...el.com>, Frederic
Weisbecker <frederic@...nel.org>, "Rafael J. Wysocki" <rafael@...nel.org>,
Daniel Lezcano <daniel.lezcano@...aro.org>, "linux-pm@...r.kernel.org"
<linux-pm@...r.kernel.org>, <oliver.sang@...el.com>
Subject: Re: [linus:master] [timers] 7ee9887703:
stress-ng.uprobe.ops_per_sec -17.1% regression
Hi Anna-Maria,
On Thu, Apr 25, 2024 at 04:23:17PM +0800, Anna-Maria Behnsen wrote:
> Hi,
>
> (adding cpuidle/power people to cc-list)
>
> Oliver Sang <oliver.sang@...el.com> writes:
>
> > hi, Frederic Weisbecker,
> >
> > On Tue, Apr 02, 2024 at 12:46:15AM +0200, Frederic Weisbecker wrote:
> >> Le Wed, Mar 27, 2024 at 04:39:17PM +0800, kernel test robot a écrit :
> >> >
> >> >
> >> > Hello,
> >> >
> >> >
> >> > we reported
> >> > "[tip:timers/core] [timers] 7ee9887703: netperf.Throughput_Mbps -1.2% regression"
> >> > in
> >> > https://lore.kernel.org/all/202403011511.24defbbd-oliver.sang@intel.com/
> >> >
> >> > now we noticed this commit is in mainline and we captured further results.
> >> >
> >> > still include netperf results for complete. below details FYI.
> >> >
> >> >
> >> > kernel test robot noticed a -17.1% regression of stress-ng.uprobe.ops_per_sec
> >> > on:
> >>
> >> The good news is that I can reproduce.
> >> It has made me spot something already:
> >>
> >> https://lore.kernel.org/lkml/ZgsynV536q1L17IS@pavilion.home/T/#m28c37a943fdbcbadf0332cf9c32c350c74c403b0
> >>
> >> But that's not enough to fix the regression. Investigation continues...
> >
> > Thanks a lot for information! if you want us test any patch, please let us know.
>
> Oliver, I would be happy to see, whether the patch at the end of the
> message restores the original behaviour also in your test setup. I
> applied it on 6.9-rc4. This patch is not a fix - it is just a pointer to
> the kernel path, that might cause the regression. I know, it is
> probable, that a warning in tick_sched is triggered. This happens when
> the first timer is already in the past. I didn't add an extra check when
> creating the 'defacto' timer thingy. But existing code handles this
> problem already properly. So the warning could be ignored here.

Yes, the patch restores the original behaviour in our test setup.
And right, we saw a WARNING:at_kernel/time/tick-sched.c:#tick_nohz_next_event.

I also applied the patch on top of 6.9-rc4, then built 6.9-rc4 and
6.9-rc4+patch with the same config (attached). Running the same test as in
our original report, we got the data in [1] below.

From (a) in [1], the v6.9-rc4 data is very similar to the 7ee9887703 data
in our original report, and the v6.9-rc4+patch data is very similar to that
of 57e95a5c41 (the parent of 7ee9887703).
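
As a rough sanity check on (a), here is a small sketch of the arithmetic
(not part of the lkp tooling, function names are made up for illustration).
If the -17.1% regression were fully undone, the gain measured from the
regressed side would be about 1/(1 - 0.171) - 1, which can be compared with
the +21.3% in [1]:

```python
def pct_change(base, new):
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


def recovery_needed(regression_pct):
    """Gain (in percent, measured from the regressed value) that would
    exactly undo a regression given as a negative percentage."""
    return (1.0 / (1.0 + regression_pct / 100.0) - 1.0) * 100.0


# stress-ng.uprobe.ops_per_sec values copied from row (a) in [1]
ops_rc4 = 16600
ops_patched = 20132

print(round(pct_change(ops_rc4, ops_patched), 1))  # 21.3
print(round(recovery_needed(-17.1), 1))            # 20.6
```

The two numbers are within run-to-run noise of each other, which fits the
observation that the patched kernel is back at the parent commit's level.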

Though you said the warning could be ignored, I still attach one dmesg in
case you want to have a look. (BTW, the WARNING happened twice in this dmesg.)

[1]
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-spr-2sp4/uprobe/stress-ng/60s

commit:
  v6.9-rc4
  v6.9-rc4+patch

        v6.9-rc4 afc95ee83a86426924f100321bd
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
   8450322            +17.4%    9923989        cpuidle..usage
      0.50 ±  6%       +0.1         0.60 ±  8%  mpstat.cpu.all.sys%
      7588             +2.5%       7774        vmstat.system.cs
    143938            +15.6%     166345        vmstat.system.in
    222151            +13.5%     252196        time.minor_page_faults
    100.50 ±  6%      +23.7%     124.33 ±  9%  time.percent_of_cpu_this_job_got
     60.65 ±  6%      +23.4%      74.87 ±  9%  time.system_time
    133973             +2.6%     137487        time.voluntary_context_switches
    222151            +13.5%     252196        stress-ng.time.minor_page_faults
    100.50 ±  6%      +23.7%     124.33 ±  9%  stress-ng.time.percent_of_cpu_this_job_got
     60.65 ±  6%      +23.4%      74.87 ±  9%  stress-ng.time.system_time
    133973             +2.6%     137487        stress-ng.time.voluntary_context_switches
    996193            +21.3%    1208081        stress-ng.uprobe.ops
     16600            +21.3%      20132        stress-ng.uprobe.ops_per_sec    <----- (a)
      8542 ±  2%       +4.4%       8920 ±  3%  proc-vmstat.nr_active_anon
      8542 ±  2%       +4.4%       8920 ±  3%  proc-vmstat.nr_zone_active_anon
   1387019             +6.2%    1473416        proc-vmstat.numa_hit
   1060772             +7.1%    1135960        proc-vmstat.numa_local
    326227             +3.4%     337389        proc-vmstat.numa_other
   1457285             +6.0%    1545091        proc-vmstat.pgalloc_normal
    700003             +4.9%     734444        proc-vmstat.pgfault
   1268538             +7.8%    1367139        proc-vmstat.pgfree
 9.152e+08             +6.3%  9.728e+08 ±  2%  perf-stat.i.branch-instructions
      2.60 ±  2%       -0.1         2.46        perf-stat.i.branch-miss-rate%
     12.07 ±  2%       +0.9        12.96 ±  2%  perf-stat.i.cache-miss-rate%
   4068158            +12.1%    4559133        perf-stat.i.cache-misses
  30326543             +7.8%   32700896        perf-stat.i.cache-references
 7.997e+09 ±  3%      +14.3%  9.138e+09 ±  4%  perf-stat.i.cpu-cycles
 4.453e+09             +6.1%  4.724e+09 ±  2%  perf-stat.i.instructions
      0.51             -7.4%       0.47        perf-stat.i.ipc
      0.91 ±  2%       +5.7%       0.96 ±  3%  perf-stat.overall.MPKI
      3.74             -0.2         3.53 ±  2%  perf-stat.overall.branch-miss-rate%
     13.36 ±  2%       +0.5        13.89 ±  2%  perf-stat.overall.cache-miss-rate%
      1.80             +7.7%       1.93 ±  2%  perf-stat.overall.cpi
      0.56             -7.1%       0.52 ±  2%  perf-stat.overall.ipc
 8.993e+08             +6.3%  9.563e+08 ±  2%  perf-stat.ps.branch-instructions
   3983131 ±  2%      +12.2%    4467972        perf-stat.ps.cache-misses
  29818286             +7.9%   32162191        perf-stat.ps.cache-references
 7.857e+09 ±  3%      +14.3%  8.983e+09 ±  4%  perf-stat.ps.cpu-cycles
 4.376e+09             +6.1%  4.645e+09 ±  2%  perf-stat.ps.instructions
 2.684e+11             +6.4%  2.856e+11 ±  2%  perf-stat.total.instructions
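
As a cross-check (again not output from the lkp tools), the
perf-stat.overall rows can be roughly recomputed from the perf-stat.ps rows.
This sketch assumes the usual definitions: cpi = cycles / instructions,
ipc = 1 / cpi, MPKI = cache misses per 1000 instructions, and
cache-miss-rate = misses / references. The lkp scripts may compute them
differently, and the helper name is made up for illustration.

```python
def derived(cycles, instructions, cache_misses, cache_refs):
    """Recompute the derived perf metrics from raw per-second counters.

    The formulas are the commonly used ones and are an assumption here,
    not taken from the lkp-tests sources.
    """
    cpi = cycles / instructions
    return {
        "cpi": round(cpi, 2),
        "ipc": round(1.0 / cpi, 2),
        "MPKI": round(cache_misses / instructions * 1000.0, 2),
        "cache-miss-rate%": round(cache_misses / cache_refs * 100.0, 2),
    }


# perf-stat.ps.* values copied from the table (rounded as shown there)
rc4 = derived(7.857e9, 4.376e9, 3983131, 29818286)
patched = derived(8.983e9, 4.645e9, 4467972, 32162191)

print("v6.9-rc4      ", rc4)
print("v6.9-rc4+patch", patched)
```

With the rounded inputs from the table, the results land on the
perf-stat.overall values for both kernels: the patched kernel retires more
instructions but spends proportionally more cycles doing so.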
View attachment "config-6.9.0-rc4-00001-gafc95ee83a86" of type "text/plain" (191428 bytes)
Download attachment "dmesg.xz" of type "application/x-xz" (37528 bytes)