Message-ID: <ZnRptXC-ONl-PAyX@slm.duckdns.org>
Date: Thu, 20 Jun 2024 07:41:09 -1000
From: Tejun Heo <tj@...nel.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Thomas Gleixner <tglx@...utronix.de>, mingo@...hat.com,
peterz@...radead.org, juri.lelli@...hat.com,
vincent.guittot@...aro.org, dietmar.eggemann@....com,
rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
bristot@...hat.com, vschneid@...hat.com, ast@...nel.org,
daniel@...earbox.net, andrii@...nel.org, martin.lau@...nel.org,
joshdon@...gle.com, brho@...gle.com, pjt@...gle.com,
derkling@...gle.com, haoluo@...gle.com, dvernet@...a.com,
dschatzberg@...a.com, dskarlat@...cmu.edu, riel@...riel.com,
changwoo@...lia.com, himadrics@...ia.fr, memxor@...il.com,
andrea.righi@...onical.com, joel@...lfernandes.org,
linux-kernel@...r.kernel.org, bpf@...r.kernel.org,
kernel-team@...a.com
Subject: Re: [PATCHSET v6] sched: Implement BPF extensible scheduler class

Hello, Linus.

On Thu, Jun 20, 2024 at 10:11:49AM -0700, Linus Torvalds wrote:
> On Wed, 19 Jun 2024 at 22:07, Linus Torvalds
> <torvalds@...ux-foundation.org> wrote:
> >
> > And scx_next_task_picked() isn't pretty - as far as I understand, it's
> > because there's only a "class X picked" callback ("pick_next_task()"),
> > and no way to tell other classes they weren't picked.
>
> I guess that could be a class callback, something like this:
>
>                 p = class->pick_next_task(rq);
>                 if (p) {
> -                       scx_next_task_picked(rq, p, class);
> +                       struct sched_class *prev = last->sched_class;
> +                       if (class != prev && prev->switch_class)
> +                               prev->switch_class(rq);
>                         return p;
>                 }
>
> and that would be arguably much prettier. But maybe I've
> mis-understood the reason for that scx_next_task_picked() thing.

Yes, this would work. The callback is there to notify the BPF scheduler when
the SCX class is preempted by one of the higher priority classes so that,
e.g., the BPF scheduler can punt the task(s) that were running on or waiting
for the CPU to other CPUs. I'll prep a patch to make it a sched_class callback.
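
For illustration only, here is a minimal user-space sketch of the idea in the
quoted diff: the pick loop notifies the previously running class through an
optional switch_class() callback when a different class wins the CPU. The
struct layouts, the two mock classes, the rt_next/ext_next fields and the
ext_preempted counter are all assumptions made up for this example; this is
not kernel code and not the eventual patch.

```c
#include <stddef.h>

struct rq;
struct task_struct;

/* Hypothetical, pared-down stand-in for the kernel's struct sched_class. */
struct sched_class {
	const char *name;
	struct task_struct *(*pick_next_task)(struct rq *rq);
	/* Optional: called on the previous class when another class is picked. */
	void (*switch_class)(struct rq *rq);
};

struct task_struct {
	const struct sched_class *sched_class;
};

/* Mock runqueue: one candidate task per class plus a notification counter. */
struct rq {
	struct task_struct *curr;     /* task that was running before the pick */
	struct task_struct *rt_next;  /* next runnable task of the mock rt class */
	struct task_struct *ext_next; /* next runnable task of the mock ext class */
	int ext_preempted;            /* times the ext class was told it lost the CPU */
};

static struct task_struct *rt_pick(struct rq *rq) { return rq->rt_next; }
static struct task_struct *ext_pick(struct rq *rq) { return rq->ext_next; }
static void ext_switch_class(struct rq *rq) { rq->ext_preempted++; }

static const struct sched_class rt_class = { "rt", rt_pick, NULL };
static const struct sched_class ext_class = { "ext", ext_pick, ext_switch_class };

/* Highest priority first, mirroring how the core walks the classes. */
static const struct sched_class *const classes[] = { &rt_class, &ext_class };

static struct task_struct *pick_next_task(struct rq *rq)
{
	const struct sched_class *prev = rq->curr ? rq->curr->sched_class : NULL;
	size_t i;

	for (i = 0; i < sizeof(classes) / sizeof(classes[0]); i++) {
		const struct sched_class *class = classes[i];
		struct task_struct *p = class->pick_next_task(rq);

		if (p) {
			/* Tell the previous class only if a different class won. */
			if (prev && class != prev && prev->switch_class)
				prev->switch_class(rq);
			return p;
		}
	}
	return NULL;
}
```

With an ext task current and no rt task runnable, pick_next_task() returns the
ext task and the counter stays untouched; once rt_next is set, the rt task is
picked and ext_switch_class() runs exactly once.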

There are other hooks which are trickier, e.g. scx_tick() wants to be called
regardless of the class of the current task for the watchdog, and
scx_rq_[de]activate() are there for two reasons: 1. sched core doesn't
distinguish actual CPU hotplugs from sched domain updates, but the latter
don't translate well to BPF schedulers; 2. it's nice to give sleepable
context to the BPF scheduler. The fork hooks are in a similar boat, as SCX
just needs more synchronization and sleepable context where other classes
don't and likely won't.
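
As a rough illustration of why a tick hook that must run "regardless of the
class of the current task" doesn't fit a per-class callback, here is a small
user-space sketch. A per-class task_tick() only fires for the class of the
running task, while the watchdog hook is called unconditionally. Everything
here (the struct fields, ext_watchdog_tick(), the counters) is hypothetical
and only mimics the shape of the problem; it is not how scx_tick() is
implemented.

```c
#include <stddef.h>

struct rq;

/* Hypothetical, pared-down class with only a per-class tick callback. */
struct sched_class {
	void (*task_tick)(struct rq *rq);
};

struct rq {
	const struct sched_class *curr_class; /* class of the running task */
	unsigned long ext_class_ticks;        /* ticks seen via the class callback */
	unsigned long watchdog_ticks;         /* ticks seen via the unconditional hook */
};

static void ext_task_tick(struct rq *rq) { rq->ext_class_ticks++; }

/* Stand-in for a watchdog that has to see every tick on the CPU. */
static void ext_watchdog_tick(struct rq *rq) { rq->watchdog_ticks++; }

static const struct sched_class fair_class = { NULL };
static const struct sched_class ext_class = { ext_task_tick };

static void scheduler_tick(struct rq *rq)
{
	/* Per-class callback: runs only for the current task's class. */
	if (rq->curr_class->task_tick)
		rq->curr_class->task_tick(rq);

	/* Unconditional hook: runs no matter which class owns the CPU. */
	ext_watchdog_tick(rq);
}
```

If the watchdog were driven from ext_task_tick() alone, it would stop seeing
ticks whenever a task from another class occupied the CPU, which is the
situation a watchdog most needs to observe.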

I can make all of them callbacks, but I'm not sure that'd be all that useful
for other classes, and the semantics would be different from other callbacks,
so it's unclear that'd be an overall win.

Thanks.

--
tejun