Message-ID: <aUMUH846NzfQYjZO@gpd4>
Date: Wed, 17 Dec 2025 21:35:43 +0100
From: Andrea Righi <arighi@...dia.com>
To: Juri Lelli <juri.lelli@...hat.com>
Cc: Ingo Molnar <mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Dietmar Eggemann <dietmar.eggemann@....com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
	Valentin Schneider <vschneid@...hat.com>, Tejun Heo <tj@...nel.org>,
	David Vernet <void@...ifault.com>,
	Changwoo Min <changwoo@...lia.com>, Shuah Khan <shuah@...nel.org>,
	Joel Fernandes <joelagnelf@...dia.com>,
	Christian Loehle <christian.loehle@....com>,
	Emil Tsalapatis <emil@...alapatis.com>, sched-ext@...ts.linux.dev,
	bpf@...r.kernel.org, linux-kselftest@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 4/7] sched_ext: Add a DL server for sched_ext tasks

Hi Juri,

On Wed, Dec 17, 2025 at 04:49:02PM +0100, Juri Lelli wrote:
> Hi!
> 
> On 17/12/25 10:35, Andrea Righi wrote:
> > sched_ext currently suffers starvation due to RT. The same workload when
> > converted to EXT can get zero runtime if RT is 100% running, causing EXT
> > processes to stall. Fix it by adding a DL server for EXT.
> 
> ...
> 
> > v4: - initialize EXT server bandwidth reservation at init time and
> >       always keep it active (Andrea Righi)
> >     - check for rq->nr_running == 1 to determine when to account idle
> >       time (Juri Lelli)
> > v3: - clarify that fair is not the only dl_server (Juri Lelli)
> >     - remove explicit stop to reduce timer reprogramming overhead
> >       (Juri Lelli)
> >     - do not restart pick_task() when it's invoked by the dl_server
> >       (Tejun Heo)
> >     - depend on CONFIG_SCHED_CLASS_EXT (Andrea Righi)
> > v2: - drop ->balance() now that pick_task() has an rf argument
> >       (Andrea Righi)
> > 
> > Tested-by: Christian Loehle <christian.loehle@....com>
> > Co-developed-by: Joel Fernandes <joelagnelf@...dia.com>
> > Signed-off-by: Joel Fernandes <joelagnelf@...dia.com>
> > Signed-off-by: Andrea Righi <arighi@...dia.com>
> > ---
> 
> ...
> 
> > @@ -3090,6 +3123,15 @@ static void switching_to_scx(struct rq *rq, struct task_struct *p)
> >  static void switched_from_scx(struct rq *rq, struct task_struct *p)
> >  {
> >  	scx_disable_task(p);
> > +
> > +	/*
> > +	 * After class switch, if the DL server is still active, restart it so
> > +	 * that DL timers will be queued, in case SCX switched to higher class.
> > +	 */
> > +	if (dl_server_active(&rq->ext_server)) {
> > +		dl_server_stop(&rq->ext_server);
> > +		dl_server_start(&rq->ext_server);
> > +	}
> >  }
> 
> We might have discussed this already; if so, I forgot, sorry. But why do
> we need to start the server again if switched from scx? I couldn't make
> sense of the comment that is already present.

The intention was to restart the DL timers after the class switch, but
thinking more about it, the stop/start cycle appears more harmful than
helpful, as it may actually disrupt accounting.

I did a quick test without the restart and everything seems to work. I'll
run more tests and send an updated patch if everything still works well
without the restart.
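
As a rough illustration of how a stop/start cycle could disrupt accounting,
here is a toy userspace model, not kernel code. All names (toy_server,
toy_start, toy_stop, toy_run, toy_consumed_in_period) are hypothetical, and
the model assumes that stopping a server clears its throttled state and that
starting it grants a fresh budget; the real dl_server code may well behave
differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of a bandwidth server; NOT the kernel's dl_server. */
struct toy_server {
	bool active;
	bool throttled;
	long budget;	/* runtime granted per period (arbitrary units) */
	long runtime;	/* runtime left in the current period */
};

/* Assumption: starting the server hands out a fresh budget. */
static void toy_start(struct toy_server *s)
{
	s->active = true;
	s->runtime = s->budget;
}

/* Assumption: stopping the server forgets that it was throttled. */
static void toy_stop(struct toy_server *s)
{
	s->active = false;
	s->throttled = false;
}

/* Try to run for delta units; return how much runtime was actually used. */
static long toy_run(struct toy_server *s, long delta)
{
	assert(delta >= 0);

	if (!s->active || s->throttled)
		return 0;

	if (delta >= s->runtime) {
		/* Budget exhausted: throttle until the next period. */
		delta = s->runtime;
		s->throttled = true;
	}
	s->runtime -= delta;
	return delta;
}

/*
 * Runtime consumed within a single period by a server with a budget of 50
 * units, with or without a stop/start cycle in the middle of the period.
 */
static long toy_consumed_in_period(bool restart)
{
	struct toy_server s = { .budget = 50 };
	long used = 0;

	toy_start(&s);
	used += toy_run(&s, 50);	/* exhausts the budget, gets throttled */

	if (restart && s.active) {
		toy_stop(&s);
		toy_start(&s);
	}

	used += toy_run(&s, 50);	/* should be denied while throttled */
	return used;
}
```

Under those assumptions, toy_consumed_in_period(false) stays within the
50-unit budget, while toy_consumed_in_period(true) lets the server consume
100 units in the same period, because the restart discards the throttled
state.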

Thanks!
-Andrea
