[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Y5haDh3sYUFcXkBx@hirez.programming.kicks-ass.net>
Date: Tue, 13 Dec 2022 11:55:10 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Tejun Heo <tj@...nel.org>
Cc: torvalds@...ux-foundation.org, mingo@...hat.com,
juri.lelli@...hat.com, vincent.guittot@...aro.org,
dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
mgorman@...e.de, bristot@...hat.com, vschneid@...hat.com,
ast@...nel.org, daniel@...earbox.net, andrii@...nel.org,
martin.lau@...nel.org, joshdon@...gle.com, brho@...gle.com,
pjt@...gle.com, derkling@...gle.com, haoluo@...gle.com,
dvernet@...a.com, dschatzberg@...a.com, dskarlat@...cmu.edu,
riel@...riel.com, linux-kernel@...r.kernel.org,
bpf@...r.kernel.org, kernel-team@...a.com
Subject: Re: [PATCH 14/31] sched_ext: Implement BPF extensible scheduler class
On Mon, Dec 12, 2022 at 11:33:12AM -1000, Tejun Heo wrote:
> > But this.. afaict that means that:
> >
> > - the whole EXT thing is incompatible with SCHED_CORE
>
> Can you expand on why this would be? I didn't test against SCHED_CORE, so am
> sure things might be broken but can't think of a reason why it'd be
> fundamentally incompatible.
For starters, SCHED_CORE doesn't use __pick_next_task() (much). But I
think you're going to have more trouble with prio_less() (which is the
3rd implementation of the scheduling function :/)
> > - the whole EXT thing can be trivially starved by the presence of a
> > single CFS/BATCH/IDLE task.
>
> It's a simliar situation w/ RT vs. CFS, which is resolved via RT having
> starvation avoidance.
That is a horrible situation as is, FIFO/RR are very crap scheduling
policies for a number of reasons but we're stuck with them due to
history and POSIX :-(, that is not something you should justify anything
with.
In fact, it should be an example of what to avoid.
Specifically, FIFO/RR fail at the fundamentals of OS
abstractions -- they provide neither resource distribution nor
isolation.
> Here, the way it's handled is a bit different, SCX has
> a watchdog mechanism implemented in "[PATCH 18/31] sched_ext: Implement
> runnable task stall watchdog", so if SCX tasks hang for whatever reason
> including being starved by CFS, it will get aborted and all tasks will be
> handed back to CFS. IOW, it's treated like any other BPF scheduler errors
> that can lead to stalls and recovered the same way.
That all sounds quite terrible.. :/
When the scheduler isn't available it should be an error to switch a
task to the policy, when there are tasks in the policy, it must not go
away.
The policy itself should never cause policy changes.
Powered by blists - more mailing lists