Message-ID: <20241128103236.22777-1-cpru@amazon.com>
Date: Thu, 28 Nov 2024 04:32:36 -0600
From: Cristian Prundeanu <cpru@...zon.com>
To: <cpru@...zon.com>
CC: <abuehaze@...zon.com>, <alisaidi@...zon.com>, <benh@...nel.crashing.org>,
<blakgeof@...zon.com>, <csabac@...zon.com>, <dietmar.eggemann@....com>,
<doebel@...zon.com>, <gautham.shenoy@....com>, <joseph.salisbury@...cle.com>,
<kprateek.nayak@....com>, <linux-arm-kernel@...ts.infradead.org>,
<linux-kernel@...r.kernel.org>, <linux-tip-commits@...r.kernel.org>,
<mingo@...hat.com>, <peterz@...radead.org>, <x86@...nel.org>
Subject: Re: [PATCH 0/2] [tip: sched/core] sched: Disable PLACE_LAG and RUN_TO_PARITY and move them to sysctl

On 2024-11-26, K Prateek Nayak wrote:
> Would it be possible to use the perf-tool built there to collect
> the scheduling stats for MySQL benchmark runs on both v6.5 and v6.8 and
> share the output of "perf sched stats diff" and the two perf.data files
> recorded?

I'll add this to the list of my next tests. Thank you for mentioning it!

On 2024-11-26, Dietmar Eggemann wrote:
> SUT kernel arm64 (mysql-8.4.0)
> (2) 6.12.0-rc4 -12.9%
> (3) 6.12.0-rc4 NO_PLACE_LAG +6.4%
> (4) v6.12-rc4 SCHED_BATCH +10.8%

This is very interesting; our setups are close, yet I have not seen any
feature or policy combination that performs above the 6.5 CFS baseline.
I look forward to seeing your results with the repro when it's ready.
Did you only use NO_PLACE_LAG or was it together with NO_RUN_TO_PARITY?
Was SCHED_BATCH used with the default feature set (all enabled)?
Which distro/version did you use for the SUT?

> Maybe a difference in our test setup can explain the different test results:
>
> I use:
>
> HammerDB Load Generator <-> MySQL SUT
> 192 VCPUs <-> 16 VCPUs
>
> Virtual users: 256
> Warehouse count: 64
> 3 min rampup
> 10 min test run time
> performance data: NOPM (New Operations Per Minute)
>
> So I have 256 'connection' tasks running on the 16 SUT VCPUS.

My setup:
SUT - 16 vCPUs, 32 GB RAM
Loadgen - 64 vCPUs, 128 GB RAM (anything large enough to not be a
bottleneck should work)
Virtual users: 4 x SUT vCPUs = 64
Warehouses: 24
Rampup: 5 min
Test runtime: 20 min per run, 10 runs on each of 4 different SUT/Loadgen pairs
Value recorded: geometric_mean(NOPM)
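
For reference, a minimal sketch of how the recorded value and a percentage
difference against a baseline could be computed. It assumes the per-run NOPM
numbers have already been collected into lists; the function names are
hypothetical and the sample numbers are placeholders, not measurements.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("need a non-empty sequence of positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def percent_change(baseline, candidate):
    """Relative difference of candidate vs. baseline, in percent."""
    return (candidate - baseline) / baseline * 100.0


# Placeholder NOPM values for illustration only (not real results).
baseline_runs = [100000.0, 110000.0, 105000.0]
candidate_runs = [90000.0, 99000.0, 94500.0]

delta = percent_change(geometric_mean(baseline_runs),
                       geometric_mean(candidate_runs))
print(f"candidate vs. baseline: {delta:+.1f}%")
```

The geometric mean is less sensitive to a single outlier run than the
arithmetic mean, which is one reason to prefer it when aggregating throughput
across several SUT/Loadgen pairs.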