lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <yo47ub22k6gbql7nlujsvtefodpqkbf4sbl6fs3p4vr2ejbq7d@x63q5a7krfco>
Date: Fri, 31 Oct 2025 09:46:19 +0000
From: Mel Gorman <mgorman@...hsingularity.net>
To: Peter Zijlstra <peterz@...radead.org>
Cc: linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...hat.com>, 
	Juri Lelli <juri.lelli@...hat.com>, Dietmar Eggemann <dietmar.eggemann@....com>, 
	Valentin Schneider <vschneid@...hat.com>, Chris Mason <clm@...a.com>
Subject: Re: [PATCH 2/2] sched/fair: Reimplement NEXT_BUDDY to align with
 EEVDF goals

On Tue, Oct 28, 2025 at 04:05:45PM +0100, Peter Zijlstra wrote:
> On Mon, Oct 27, 2025 at 01:39:15PM +0000, Mel Gorman wrote:
> >  kernel/sched/fair.c | 137 +++++++++++++++++++++++++++++++++++++-------
> >  1 file changed, 117 insertions(+), 20 deletions(-)
> > 
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index bc0b7ce8a65d..158e0430449b 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -1193,6 +1203,91 @@ static s64 update_se(struct rq *rq, struct sched_entity *se)
> >  	return delta_exec;
> >  }
> >  
> > +enum preempt_wakeup_action {
> > +	PREEMPT_WAKEUP_NONE,		/* No action on the buddy */
> > +	PREEMPT_WAKEUP_NEXT,		/* Check next is most eligible
> > +					 * before rescheduling.
> > +					 */
> > +	PREEMPT_WAKEUP_RESCHED,		/* Plain reschedule */
> > +};
> > +
> > <SNIP>
> 
> All this seems weirdly placed inside the file. Is there a reason this is
> placed so far away from its only caller?
> 

No, I don't recall why I placed it there and whether it was simply
required by an early prototype.

> >  /*
> >   * Used by other classes to account runtime.
> >   */
> 
> > @@ -7028,8 +7113,6 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
> >  	hrtick_update(rq);
> >  }
> >  
> > -static void set_next_buddy(struct sched_entity *se);
> > -
> >  /*
> >   * Basically dequeue_task_fair(), except it can deal with dequeue_entity()
> >   * failing half-way through and resume the dequeue later.
> > @@ -8734,7 +8817,7 @@ static void check_preempt_wakeup_fair(struct rq *rq, struct task_struct *p, int
> >  	struct sched_entity *se = &donor->se, *pse = &p->se;
> >  	struct cfs_rq *cfs_rq = task_cfs_rq(donor);
> >  	int cse_is_idle, pse_is_idle;
> > -	bool do_preempt_short = false;
> > +	enum preempt_wakeup_action do_preempt_short = PREEMPT_WAKEUP_NONE;
> 
> naming seems off; I'm not sure what this still has to do with short.
> Perhaps just preempt_action or whatever?
> 

Good as name as any, may change it to a more meaningful name once I go
through the rest of the review.

> >  
> >  	if (unlikely(se == pse))
> >  		return;
> > @@ -8748,10 +8831,6 @@ static void check_preempt_wakeup_fair(struct rq *rq, struct task_struct *p, int
> >  	if (task_is_throttled(p))
> >  		return;
> >  
> > -	if (sched_feat(NEXT_BUDDY) && !(wake_flags & WF_FORK) && !pse->sched_delayed) {
> > -		set_next_buddy(pse);
> > -	}
> > -
> >  	/*
> >  	 * We can come here with TIF_NEED_RESCHED already set from new task
> >  	 * wake up path.
> > @@ -8783,7 +8862,7 @@ static void check_preempt_wakeup_fair(struct rq *rq, struct task_struct *p, int
> >  		 * When non-idle entity preempt an idle entity,
> >  		 * don't give idle entity slice protection.
> >  		 */
> > -		do_preempt_short = true;
> > +		do_preempt_short = PREEMPT_WAKEUP_NEXT;
> >  		goto preempt;
> >  	}
> >  
> > @@ -8802,7 +8881,25 @@ static void check_preempt_wakeup_fair(struct rq *rq, struct task_struct *p, int
> >  	 * If @p has a shorter slice than current and @p is eligible, override
> >  	 * current's slice protection in order to allow preemption.
> >  	 */
> > -	do_preempt_short = sched_feat(PREEMPT_SHORT) && (pse->slice < se->slice);
> > +	if (sched_feat(PREEMPT_SHORT) && (pse->slice < se->slice)) {
> > +		do_preempt_short = PREEMPT_WAKEUP_NEXT;
> > +	} else {
> > +		/*
> > +		 * If @p potentially is completing work required by current then
> > +		 * consider preemption.
> > +		 */
> > +		do_preempt_short = __do_preempt_buddy(rq, cfs_rq, wake_flags,
> > +						      pse, se);
> > +	}
> > +
> > +	switch (do_preempt_short) {
> > +	case PREEMPT_WAKEUP_NONE:
> > +		return;
> > +	case PREEMPT_WAKEUP_RESCHED:
> > +		goto preempt;
> > +	case PREEMPT_WAKEUP_NEXT:
> > +		break;
> > +	}
> >  
> >  	/*
> >  	 * If @p has become the most eligible task, force preemption.
> > @@ -8810,7 +8907,7 @@ static void check_preempt_wakeup_fair(struct rq *rq, struct task_struct *p, int
> >  	if (__pick_eevdf(cfs_rq, !do_preempt_short) == pse)
> >  		goto preempt;
> >  
> > -	if (sched_feat(RUN_TO_PARITY) && do_preempt_short)
> > +	if (sched_feat(RUN_TO_PARITY) && do_preempt_short != PREEMPT_WAKEUP_NONE)
> >  		update_protect_slice(cfs_rq, se);
> 
> WAKEUP_NONE did a return above, I don't think you can get here with
> WAKEUP_NONE, making the above condition always true.

Correct. This was not always the case during development but now the rename
alone highlights issues beyond this oddity.

-- 
Mel Gorman
SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ