Message-ID: <q3krlyukweyfrabk2soxryx74mjl6yljqfm7nhfrhudbv47q4p@62unggrnbydk>
Date: Thu, 13 Nov 2025 09:04:38 +0000
From: Mel Gorman <mgorman@...hsingularity.net>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Ingo Molnar <mingo@...hat.com>, Juri Lelli <juri.lelli@...hat.com>, 
	Dietmar Eggemann <dietmar.eggemann@....com>, Valentin Schneider <vschneid@...hat.com>, 
	Chris Mason <clm@...a.com>, Madadi Vineeth Reddy <vineethr@...ux.ibm.com>, 
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] sched/fair: Reimplement NEXT_BUDDY to align with
 EEVDF goals

On Wed, Nov 12, 2025 at 03:48:23PM +0100, Peter Zijlstra wrote:
> On Wed, Nov 12, 2025 at 12:25:21PM +0000, Mel Gorman wrote:
> 
> > +	/* Prefer picking wakee soon if appropriate. */
> > +	if (sched_feat(NEXT_BUDDY) &&
> > +	    set_preempt_buddy(cfs_rq, wake_flags, pse, se)) {
> > +
> > +		/*
> > +		 * Decide whether to obey WF_SYNC hint for a new buddy. Old
> > +		 * buddies are ignored as they may not be relevant to the
> > +		 * waker and less likely to be cache hot.
> > +		 */
> > +		if (wake_flags & WF_SYNC)
> > +			preempt_action = preempt_sync(rq, wake_flags, pse, se);
> > +	}
> 
> Why only do preempt_sync() when NEXT_BUDDY? Nothing there seems to
> depend on buddies.

There isn't a direct relation, but there is an indirect one. I know from
your previous review that you separated out the WF_SYNC handling but, after
a while, I could not find a good reason to separate it completely from
NEXT_BUDDY.

NEXT_BUDDY updates cfs_rq->next, if appropriate, to indicate that there is a
waker/wakee relationship between two tasks and that they potentially share
data which may still be cache resident after a context switch. WF_SYNC
indicates there may be a strict relationship between those two tasks: the
waker may need the wakee to do some work before it can make progress. If
NEXT_BUDDY does not set cfs_rq->next in the current waking context then the
wakee may only be picked next by coincidence under normal EEVDF rules.

WF_SYNC could still reschedule if the wakee is not selected as a buddy but
the benefit, if any, would be marginal. If the waker does not go to sleep
then the WF_SYNC contract is violated, and if the wakee is delayed after the
wakeup then the shared data may already have been evicted from cache.
With NEXT_BUDDY, there is a chance that the cost of a reschedule and/or
a context switch will be offset by reduced overall latency (e.g. fewer
cache misses). Without NEXT_BUDDY, WF_SYNC may only incur costs due to
context switching.

I considered the possibility of WF_SYNC being applied if pse is already a
buddy due to yield or some other factor but there is no reason to assume
any shared data is still cache resident and it's not easy to reason about. I
considered applying WF_SYNC if pse was already set and using it as a two-pass
filter but again, there is no obvious benefit, nor an obvious reason why the
second wakeup is more important than the first wakeup. I considered WF_SYNC
being applied if any buddy is set but it's not clear why a SYNC wakeup
between tasks A,B should instead pick C to run ASAP outside of the normal
EEVDF rules.

I think it's straight-forward if the logic is

	o If NEXT_BUDDY sets the wakee as cfs_rq->next then
	  schedule the wakee soon
	o If the wakee is to be selected soon and WF_SYNC is also set then
	  pick the wakee ASAP

but less straight-forward if

	o If WF_SYNC is set, reschedule now and maybe the wakee will be
	  picked, maybe the waker will run again, maybe something else
	  will run and sometimes it'll be a gain overall.
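
As a rough illustration of the first ordering above (buddy first, then
WF_SYNC), here is a minimal standalone sketch. It is not the patch and not
kernel code: every sketch_* name and the SKETCH_WF_SYNC value are
hypothetical stand-ins, and it assumes the WF_SYNC hint is always obeyed
once the wakee is the buddy, whereas preempt_sync() in the quoted hunk makes
that decision itself and may decline.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's WF_SYNC wake flag. */
#define SKETCH_WF_SYNC 0x1

/* Hypothetical outcomes; the real patch uses its own preempt_action values. */
enum sketch_preempt_action {
	SKETCH_PREEMPT_NONE,	/* no buddy preference, normal EEVDF rules apply */
	SKETCH_PREEMPT_SOON,	/* wakee is cfs_rq->next, schedule it soon */
	SKETCH_PREEMPT_ASAP,	/* wakee is cfs_rq->next and the wakeup is sync */
};

/*
 * wakee_became_buddy models the result of set_preempt_buddy() for the
 * current wakeup only. An older buddy is deliberately not considered.
 */
static enum sketch_preempt_action
sketch_wakeup_action(bool next_buddy_enabled, bool wakee_became_buddy,
		     int wake_flags)
{
	/* Without a new buddy, WF_SYNC alone changes nothing in this model. */
	if (!next_buddy_enabled || !wakee_became_buddy)
		return SKETCH_PREEMPT_NONE;

	/* New buddy plus a sync hint: pick the wakee as soon as possible. */
	if (wake_flags & SKETCH_WF_SYNC)
		return SKETCH_PREEMPT_ASAP;

	/* New buddy without a sync hint: pick the wakee soon. */
	return SKETCH_PREEMPT_SOON;
}
```

In this model WF_SYNC only ever upgrades "soon" to "ASAP". It never
triggers a reschedule by itself, which is the difference from the second,
less straight-forward ordering.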

-- 
Mel Gorman
SUSE Labs
