linux-kernel - Re: [PATCH 07/24] sched/fair: Re-organize dequeue_task


Message-ID: <20240810221723.GJ11646@noisy.programming.kicks-ass.net>
Date: Sun, 11 Aug 2024 00:17:23 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Valentin Schneider <vschneid@...hat.com>
Cc: mingo@...hat.com, juri.lelli@...hat.com, vincent.guittot@...aro.org,
	dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
	mgorman@...e.de, linux-kernel@...r.kernel.org,
	kprateek.nayak@....com, wuyun.abel@...edance.com,
	youssefesmat@...omium.org, tglx@...utronix.de, efault@....de
Subject: Re: [PATCH 07/24] sched/fair: Re-organize dequeue_task_fair()

On Fri, Aug 09, 2024 at 06:53:30PM +0200, Valentin Schneider wrote:
> On 27/07/24 12:27, Peter Zijlstra wrote:
> > Working towards delaying dequeue, notably also inside the hierarchy,
> > rework dequeue_task_fair() such that it can 'resume' an interrupted
> > hierarchy walk.
> >
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
> > ---
> >  kernel/sched/fair.c |   61 ++++++++++++++++++++++++++++++++++------------------
> >  1 file changed, 40 insertions(+), 21 deletions(-)
> >
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -6861,34 +6861,43 @@ enqueue_task_fair(struct rq *rq, struct
> >  static void set_next_buddy(struct sched_entity *se);
> >
> >  /*
> > - * The dequeue_task method is called before nr_running is
> > - * decreased. We remove the task from the rbtree and
> > - * update the fair scheduling stats:
> > + * Basically dequeue_task_fair(), except it can deal with dequeue_entity()
> > + * failing half-way through and resume the dequeue later.
> > + *
> > + * Returns:
> > + * -1 - dequeue delayed
> > + *  0 - dequeue throttled
> > + *  1 - dequeue complete
> >   */
> > -static bool dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
> > +static int dequeue_entities(struct rq *rq, struct sched_entity *se, int flags)
> >  {
> > -	struct cfs_rq *cfs_rq;
> > -	struct sched_entity *se = &p->se;
> > -	int task_sleep = flags & DEQUEUE_SLEEP;
> > -	int idle_h_nr_running = task_has_idle_policy(p);
> >       bool was_sched_idle = sched_idle_rq(rq);
> >       int rq_h_nr_running = rq->cfs.h_nr_running;
> > +	bool task_sleep = flags & DEQUEUE_SLEEP;
> > +	struct task_struct *p = NULL;
> > +	int idle_h_nr_running = 0;
> > +	int h_nr_running = 0;
> > +	struct cfs_rq *cfs_rq;
> >
> > -	util_est_dequeue(&rq->cfs, p);
> > +	if (entity_is_task(se)) {
> > +		p = task_of(se);
> > +		h_nr_running = 1;
> > +		idle_h_nr_running = task_has_idle_policy(p);
> > +	}
> >
> 
> This leaves the *h_nr_running counters at 0 for non-task entities. IIUC this makes
> sense for ->sched_delayed entities (they should be empty of tasks), not so
> sure for the other case. However, this only ends up being used for non-task
> entities in:
> - pick_next_entity(), if se->sched_delayed
> - unregister_fair_sched_group()
> 
> IIRC unregister_fair_sched_group() can only happen after the group has been
> drained, so it would then indeed be empty of tasks, but I reckon this could
> do with a comment/assert in dequeue_entities(), no? Or did I get too
> confused by cgroups again?
> 

Yeah, so I did have me a patch that made all this work for cfs bandwidth
control as well. And then we need all this for throttled cgroup entries
as well.

Anyway... I had the patch, it worked, but then I remembered you were
going to rewrite all that anyway and I was making a terrible mess of
things, so I made it go away again.
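
For illustration only, and not part of the posted patch: the comment/assert
suggested in the quoted review could look roughly like the user-space sketch
below. Every toy_* name is hypothetical and merely stands in for the kernel
structures; the invariant it checks (a non-task entity is only dequeued once
its group holds no tasks) is the assumption under discussion, and the reply
above suggests it would not cover throttled cgroup entities as-is.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; these are NOT the kernel definitions. */
struct toy_cfs_rq {
	int h_nr_running;        /* tasks queued anywhere below this group */
};

struct toy_entity {
	bool is_task;            /* models entity_is_task(se) */
	bool idle_policy;        /* models task_has_idle_policy(task_of(se)) */
	struct toy_cfs_rq *my_q; /* the group's own runqueue, NULL for a task */
};

struct toy_counts {
	int h_nr_running;
	int idle_h_nr_running;
};

/* Assumed invariant: a group entity being dequeued contains no tasks. */
static bool toy_group_is_drained(const struct toy_entity *se)
{
	return se->is_task || (se->my_q && se->my_q->h_nr_running == 0);
}

/* Mirrors the counter setup at the top of dequeue_entities() in the patch. */
static struct toy_counts toy_dequeue_counts(const struct toy_entity *se)
{
	struct toy_counts c = { 0, 0 };

	if (se->is_task) {
		c.h_nr_running = 1;
		c.idle_h_nr_running = se->idle_policy ? 1 : 0;
	} else {
		/* The suggested check: leaving both counts at 0 is only valid here. */
		assert(toy_group_is_drained(se));
	}
	return c;
}
```

In this model a task entity contributes a count of 1 (plus 1 to the idle
count if it has the idle policy), while a group entity contributes 0 and
trips the assertion if its runqueue still holds tasks.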