Message-ID: <9aa93862-932c-4a17-a3ba-f6335649e555@arm.com>
Date: Tue, 26 Nov 2024 16:12:04 +0100
From: Dietmar Eggemann <dietmar.eggemann@....com>
To: Cristian Prundeanu <cpru@...zon.com>
Cc: kprateek.nayak@....com, abuehaze@...zon.com, alisaidi@...zon.com,
benh@...nel.crashing.org, blakgeof@...zon.com, csabac@...zon.com,
doebel@...zon.com, gautham.shenoy@....com, joseph.salisbury@...cle.com,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
linux-tip-commits@...r.kernel.org, mingo@...hat.com, peterz@...radead.org,
x86@...nel.org
Subject: Re: [PATCH 0/2] [tip: sched/core] sched: Disable PLACE_LAG and
RUN_TO_PARITY and move them to sysctl
On 25/11/2024 12:35, Cristian Prundeanu wrote:
> Here are more results with recent 6.12 code, and also using SCHED_BATCH.
> The control tests were run anew on Ubuntu 22.04 with the current pre-built
> kernels 6.5 (baseline) and 6.8 (regression out of the box).
>
> When updating mysql from 8.0.30 to 8.4.2, the regression grew even larger.
> Disabling PLACE_LAG and RUN_TO_PARITY improved the results more than
> using SCHED_BATCH.
>
> Kernel   | default  | NO_PLACE_LAG and | SCHED_BATCH | mysql
>          | config   | NO_RUN_TO_PARITY |             | version
> ---------+----------+------------------+-------------+---------
> 6.8      | -15.3%   |                  |             | 8.0.30
> 6.12-rc7 | -11.4%   | -9.2%            | -11.6%      | 8.0.30
>          |          |                  |             |
> 6.8      | -18.1%   |                  |             | 8.4.2
> 6.12-rc7 | -14.0%   | -10.2%           | -12.7%      | 8.4.2
> ---------+----------+------------------+-------------+---------
>
> Confidence intervals for all tests are smaller than +/- 0.5%.
>
> I expect to have the repro package ready by the end of the week. Thank you
> for your collective patience and efforts to confirm these results.
The results I got look different:
SUT kernel arm64 (mysql-8.4.0)

(1) 6.5.13                     baseline
(2) 6.12.0-rc4                 -12.9%
(3) 6.12.0-rc4 NO_PLACE_LAG    +6.4%
(4) v6.12-rc4 SCHED_BATCH      +10.8%

5 test runs each: 95% confidence interval <= ±0.56%
(2) is still in line with your results, but (3) and (4) look way better for me.
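
For readers who want to reproduce a run like (3), here is a minimal sketch of toggling a scheduler feature at runtime. It assumes debugfs is mounted at /sys/kernel/debug, that the kernel exposes /sys/kernel/debug/sched/features, and that the caller is root. The helper names (feature_token, set_sched_feature, enabled_features) are made up for illustration and are not taken from this thread.

```python
from pathlib import Path

# Assumption: debugfs is mounted here and scheduler debugging is enabled.
SCHED_FEATURES = Path("/sys/kernel/debug/sched/features")


def feature_token(name: str, enable: bool) -> str:
    """Return the token to write: 'PLACE_LAG' enables, 'NO_PLACE_LAG' disables."""
    name = name.removeprefix("NO_")
    return name if enable else f"NO_{name}"


def set_sched_feature(name: str, enable: bool, path: Path = SCHED_FEATURES) -> None:
    """Write one feature token to the features file (needs root)."""
    path.write_text(feature_token(name, enable))


def enabled_features(text: str) -> set[str]:
    """Parse the content of the features file into the set of enabled features."""
    return {tok for tok in text.split() if not tok.startswith("NO_")}
```

With these helpers, set_sched_feature("PLACE_LAG", False) would write NO_PLACE_LAG, and enabled_features(SCHED_FEATURES.read_text()) can be used to check the result. The setting does not survive a reboot.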
Maybe a difference in our test setup can explain the different test results:
I use:

HammerDB Load Generator <-> MySQL SUT
192 VCPUs               <-> 16 VCPUs

Virtual users:    256
Warehouse count:  64
3 min rampup
10 min test run time
performance data: NOPM (New Orders Per Minute)
So I have 256 'connection' tasks running on the 16 SUT VCPUs.
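
As a rough illustration of how per-kernel percentages and a "±x%" bound can be derived from repeated NOPM runs, here is a small sketch. It assumes the percentage is the relative change of the mean NOPM versus the baseline mean, and that the bound is a two-sided 95% Student-t interval over 5 runs expressed as a percentage of the mean. Neither assumption is confirmed in this thread, and the function names are hypothetical.

```python
import math
import statistics

# Two-sided 95% Student-t critical value for 4 degrees of freedom (5 runs).
# Assumption: use a different value for a different number of runs.
T_CRIT_95_DF4 = 2.776


def relative_change_percent(baseline_runs, test_runs):
    """Relative change of the mean of test_runs versus the baseline mean, in %."""
    base = statistics.mean(baseline_runs)
    return 100.0 * (statistics.mean(test_runs) - base) / base


def ci_half_width_percent(runs, t_crit=T_CRIT_95_DF4):
    """Half-width of the confidence interval as a percentage of the mean."""
    mean = statistics.mean(runs)
    sem = statistics.stdev(runs) / math.sqrt(len(runs))
    return 100.0 * t_crit * sem / mean
```

Both functions take plain lists of NOPM values, one per test run.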
> On 2024-11-01, Peter Zijlstra wrote:
>
>>> (At the risk of stating the obvious, using SCHED_BATCH only to get back to
>>> the default CFS performance is still only a workaround,
>>
>> It is not really -- it is impossible to schedule all the various
>> workloads without them telling us what they really like. The quest is to
>> find interfaces that make sense and are implementable. But fundamentally
>> tasks will have to start telling us what they need. We've long since ran
>> out of crystal balls.
>
> Completely agree that the best performance is obtained when the tasks are
> individually tuned to the scheduler and explicitly set running parameters.
> This isn't different from before.
>
> But shouldn't our gold standard for default performance be CFS? There is a
> significant regression out of the box when using EEVDF; how is seeking
> additional tuning just to recover the lost performance not a workaround?
>
> (Not to mention that this additional tuning means shifting the burden on
> many users who may not be familiar enough with scheduler functionality.
> We're essentially asking everyone to spend considerable effort to maintain
> status quo from kernel 6.5.)
>
>
> On 2024-11-14, Joseph Salisbury wrote:
>
>> This is a confirmation that we are also seeing a 9% performance
>> regression with the TPCC benchmark after v6.6-rc1. We narrowed down the
>> regression was caused due to commit:
>> 86bfbb7ce4f6 ("sched/fair: Add lag based placement")
>>
>> This regression was reported via this thread:
>> https://lore.kernel.org/lkml/1c447727-92ed-416c-bca1-a7ca0974f0df@oracle.com/
>>
>> Phil Auld suggested to try turning off the PLACE_LAG sched feature. We
>> tested with NO_PLACE_LAG and can confirm it brought back 5% of the
>> performance loss. We do not yet know what effect NO_PLACE_LAG will have
>> on other benchmarks, but it indeed helps TPCC.
>
> Thank you for confirming the regression. I've been monitoring performance
> on the v6.12-rcX tags since this thread started, and the results have been
> largely constant.
>
> I've also tested other benchmarks to verify whether (1) the regression
> exists and (2) the patch proposed in this thread negatively affects them.
> On postgresql and wordpress/nginx there is a regression which is improved
> when applying the patch; on mongo and mariadb no regression manifested, and
> the patch did not make their performance worse.
>
>
> On 2024-11-19, Dietmar Eggemann wrote:
>
>> #cat /etc/systemd/system/mysql.service
>>
>> [Service]
>> CPUSchedulingPolicy=batch
>> ExecStart=/usr/local/mysql/bin/mysqld_safe
>
> This is the approach I used as well to get the results above.
OK.
>> My hunch is that this is due to the 'connection' threads (1 per virtual
>> user) running in SCHED_BATCH. I yet have to confirm this by only
>> changing the 'connection' tasks to SCHED_BATCH.
>
> Did you have a chance to run with this scenario?
Yeah, I did. The results were worse than running all mysqld threads in
SCHED_BATCH but still better than the baseline.

(5) v6.12-rc4 'connection' tasks in SCHED_BATCH +6.8%
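
One possible way to set up a run like (5) is sketched below. It is not necessarily how the numbers above were obtained. It assumes the per-user threads show up with the comm name "connection" under /proc/<pid>/task, that the script runs with enough privileges to change the policy of the mysqld threads, and that a one-shot pass is sufficient. The function names are hypothetical.

```python
import os
from pathlib import Path


def thread_names(pid: int) -> dict[int, str]:
    """Map each thread ID of `pid` to its comm name, read from /proc."""
    names = {}
    for task in Path(f"/proc/{pid}/task").iterdir():
        try:
            names[int(task.name)] = (task / "comm").read_text().strip()
        except OSError:
            continue  # the thread exited while we were scanning
    return names


def select_tids(names: dict[int, str], wanted: str) -> list[int]:
    """Return the sorted thread IDs whose comm name equals `wanted`."""
    return sorted(tid for tid, comm in names.items() if comm == wanted)


def set_batch(tids) -> None:
    """Switch each given thread to SCHED_BATCH (static priority must be 0)."""
    for tid in tids:
        os.sched_setscheduler(tid, os.SCHED_BATCH, os.sched_param(0))
```

Usage would look like set_batch(select_tids(thread_names(mysqld_pid), "connection")), where mysqld_pid is the PID of the running server. Threads that are created after this pass are not touched by it, so it would have to be repeated once all virtual users are connected.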