Message-ID: <20250212093721.GA24784@noisy.programming.kicks-ass.net>
Date: Wed, 12 Feb 2025 10:37:21 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Cristian Prundeanu <cpru@...zon.com>
Cc: K Prateek Nayak <kprateek.nayak@....com>,
Hazem Mohamed Abuelfotoh <abuehaze@...zon.com>,
Ali Saidi <alisaidi@...zon.com>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Geoff Blake <blakgeof@...zon.com>, Csaba Csoma <csabac@...zon.com>,
Bjoern Doebel <doebel@...zon.com>,
Gautham Shenoy <gautham.shenoy@....com>,
Joseph Salisbury <joseph.salisbury@...cle.com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Ingo Molnar <mingo@...hat.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Borislav Petkov <bp@...en8.de>,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
linux-tip-commits@...r.kernel.org, x86@...nel.org
Subject: Re: [PATCH v2] [tip: sched/core] sched: Move PLACE_LAG and RUN_TO_PARITY to sysctl

On Wed, Feb 12, 2025 at 10:17:11AM +0100, Peter Zijlstra wrote:
> On Tue, Feb 11, 2025 at 11:36:44PM -0600, Cristian Prundeanu wrote:
> > Replacing CFS with the EEVDF scheduler in kernel 6.6 introduced
> > significant performance degradation in multiple database-oriented
> > workloads. This degradation manifests in all kernel versions using EEVDF,
> > across multiple Linux distributions, hardware architectures (x86_64,
> > aarch64, amd64), and CPU generations.
> >
> > Testing combinations of available scheduler features showed that the
> > largest improvement (short of disabling all EEVDF features) came from
> > disabling both PLACE_LAG and RUN_TO_PARITY.
> >
> > Moving PLACE_LAG and RUN_TO_PARITY to sysctl will allow users to override
> > their default values and persist them with established mechanisms.
>
> Nope -- you have knobs in debugfs, and that's where they'll stay. Esp.
> PLACE_LAG is super dodgy and should not get elevated to anything
> remotely official.

Just to clarify, the problem with NO_PLACE_LAG is that by discarding
lag, a task can game the system to 'gain' time. It fundamentally breaks
fairness, and the only reason I implemented it at all was because it is
one of the 'official' placement strategies in the original paper.

But ideally it should just go; it is not a sound strategy and relies on
tasks behaving themselves.

That is, assuming your tasks behave like the traditional periodic or
sporadic tasks, then it works, but only because the tasks are limited by
the constraints of the task model.

If the tasks are unconstrained / aperiodic, this goes out the window and
the placement strategy becomes unsound. And given we must assume
userspace to be malicious / hostile / misbehaving, the whole thing is
just not good.
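
To make the 'gain' concrete, here is a toy model in Python. It is a sketch,
not kernel code, and everything in it is an assumption made for
illustration: equal-weight tasks, a pick-lowest-vruntime rule standing in
for EEVDF, hog tasks that are preempted every tick, and one 'gamer' task
that runs a burst of 3 ticks and then does a zero-length sleep. With
place_lag=True the gamer is re-inserted with the lag it left with; with
place_lag=False it is re-inserted at the average vruntime, that is, with
its lag discarded.

```python
from fractions import Fraction


def simulate(place_lag, n_hogs=3, burst=3, ticks=600):
    """Toy fair-queue model; returns the gamer's share of CPU time.

    All names and parameters here are hypothetical, chosen for the example.
    """
    hogs = [Fraction(0)] * n_hogs  # vruntime of the always-runnable tasks
    gamer = Fraction(0)            # vruntime of the task that naps
    gamer_time = 0
    t = 0
    while t < ticks:
        lowest = min(range(n_hogs), key=lambda i: hogs[i])
        if hogs[lowest] <= gamer:
            # A hog has the lowest vruntime (ties go to the hogs): run one tick.
            hogs[lowest] += 1
            t += 1
            continue

        # The gamer has the lowest vruntime: it runs its whole burst.
        gamer += burst
        gamer_time += burst
        t += burst

        # It then sleeps for zero time: dequeue and immediately re-enqueue.
        v_rest = sum(hogs) / n_hogs                 # average without the gamer
        v_all = (sum(hogs) + gamer) / (n_hogs + 1)  # average with the gamer
        lag = v_all - gamer                         # negative: it ran ahead
        if place_lag:
            # Keep the lag. The scale factor compensates for the shift in the
            # average caused by adding the task back to the queue.
            gamer = v_rest - lag * Fraction(n_hogs + 1, n_hogs)
        else:
            # Discard the lag: the debt from the burst is forgiven.
            gamer = v_rest
    return Fraction(gamer_time, t)


if __name__ == "__main__":
    print("lag kept:     ", float(simulate(place_lag=True)))
    print("lag discarded:", float(simulate(place_lag=False)))
```

In this toy setup the gamer ends up with half of the CPU when lag is
discarded, against roughly the quarter it is entitled to among four equal
tasks when lag is kept.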

It is for this same reason that SCHED_DEADLINE has a constant bandwidth
server on top of the earliest deadline first policy. Pure EDF is only
sound for periodic / sporadic tasks, but we cannot assume userspace will
behave itself, so we have to put in guard-rails, CBS in this case.
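
The guard-rail idea can be sketched with the two textbook CBS rules, again
in Python and again only as an illustration, not the SCHED_DEADLINE
implementation. The class and method names are made up for the example,
and the relative deadline is assumed to equal the period.

```python
from dataclasses import dataclass


@dataclass
class CbsServer:
    """Minimal constant bandwidth server (hypothetical names, integer time)."""
    runtime: int       # budget granted per period
    period: int        # replenishment period, also used as relative deadline
    budget: int = 0    # budget left in the current server period
    deadline: int = 0  # current absolute scheduling deadline

    def wakeup(self, now):
        """Wakeup rule: keep the old (budget, deadline) pair only if using it
        cannot exceed the reserved bandwidth runtime / period."""
        window = self.deadline - now
        if window <= 0 or self.budget * self.period > window * self.runtime:
            self.deadline = now + self.period
            self.budget = self.runtime

    def consume(self, ran):
        """Charge `ran` time units; return how often the deadline was pushed."""
        self.budget -= ran
        postponed = 0
        while self.budget <= 0:
            # Budget exhausted: replenish, but push the deadline out by a
            # full period so an overrunning task loses EDF priority.
            self.deadline += self.period
            self.budget += self.runtime
            postponed += 1
        return postponed


if __name__ == "__main__":
    s = CbsServer(runtime=2, period=10)
    s.wakeup(now=0)
    print(s.consume(5), s.deadline)  # an overrun postpones the deadline
```

Whatever the task does, the pair (budget, deadline) handed to EDF never
represents more than runtime / period of the CPU, which is what makes the
policy safe against tasks that do not follow the periodic / sporadic model.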