Message-ID: <aUMUH846NzfQYjZO@gpd4>
Date: Wed, 17 Dec 2025 21:35:43 +0100
From: Andrea Righi <arighi@...dia.com>
To: Juri Lelli <juri.lelli@...hat.com>
Cc: Ingo Molnar <mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Valentin Schneider <vschneid@...hat.com>, Tejun Heo <tj@...nel.org>,
David Vernet <void@...ifault.com>,
Changwoo Min <changwoo@...lia.com>, Shuah Khan <shuah@...nel.org>,
Joel Fernandes <joelagnelf@...dia.com>,
Christian Loehle <christian.loehle@....com>,
Emil Tsalapatis <emil@...alapatis.com>, sched-ext@...ts.linux.dev,
bpf@...r.kernel.org, linux-kselftest@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 4/7] sched_ext: Add a DL server for sched_ext tasks

Hi Juri,

On Wed, Dec 17, 2025 at 04:49:02PM +0100, Juri Lelli wrote:
> Hi!
>
> On 17/12/25 10:35, Andrea Righi wrote:
> > sched_ext currently suffers starvation due to RT. The same workload when
> > converted to EXT can get zero runtime if RT is 100% running, causing EXT
> > processes to stall. Fix it by adding a DL server for EXT.
>
> ...
>
> > v4: - initialize EXT server bandwidth reservation at init time and
> > always keep it active (Andrea Righi)
> > - check for rq->nr_running == 1 to determine when to account idle
> > time (Juri Lelli)
> > v3: - clarify that fair is not the only dl_server (Juri Lelli)
> > - remove explicit stop to reduce timer reprogramming overhead
> > (Juri Lelli)
> > - do not restart pick_task() when it's invoked by the dl_server
> > (Tejun Heo)
> > - depend on CONFIG_SCHED_CLASS_EXT (Andrea Righi)
> > v2: - drop ->balance() now that pick_task() has an rf argument
> > (Andrea Righi)
> >
> > Tested-by: Christian Loehle <christian.loehle@....com>
> > Co-developed-by: Joel Fernandes <joelagnelf@...dia.com>
> > Signed-off-by: Joel Fernandes <joelagnelf@...dia.com>
> > Signed-off-by: Andrea Righi <arighi@...dia.com>
> > ---
>
> ...
>
> > @@ -3090,6 +3123,15 @@ static void switching_to_scx(struct rq *rq, struct task_struct *p)
> >  static void switched_from_scx(struct rq *rq, struct task_struct *p)
> >  {
> >  	scx_disable_task(p);
> > +
> > +	/*
> > +	 * After class switch, if the DL server is still active, restart it so
> > +	 * that DL timers will be queued, in case SCX switched to higher class.
> > +	 */
> > +	if (dl_server_active(&rq->ext_server)) {
> > +		dl_server_stop(&rq->ext_server);
> > +		dl_server_start(&rq->ext_server);
> > +	}
> >  }
>
> We might have discussed this already, in that case I forgot, sorry. But,
> why we do need to start the server again if switched from scx? Couldn't
> make sense of the comment that is already present.

The intention was to restart the DL timers, but thinking more about it,
this appears more harmful than helpful, as it may actually disrupt
accounting.

I did a quick test without the restart and everything seems to work. I'll
run more tests and I'll send an updated patch if everything works well
without the restart.

Thanks!
-Andrea