Message-ID: <CAKfTPtC1MryXBOFd8=YJe=MUq75mYM5AK+V+rvhL1rAeAGNKRA@mail.gmail.com>
Date: Fri, 14 Nov 2025 14:36:19 +0100
From: Vincent Guittot <vincent.guittot@...aro.org>
To: Shubhang Kaushik OS <Shubhang@...amperecomputing.com>
Cc: Ingo Molnar <mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>, 
	Juri Lelli <juri.lelli@...hat.com>, Dietmar Eggemann <dietmar.eggemann@....com>, 
	Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>, 
	Valentin Schneider <vschneid@...hat.com>, Shubhang Kaushik <sh@...two.org>, 
	Shijie Huang <Shijie.Huang@...erecomputing.com>, Frank Wang <zwang@...erecomputing.com>, 
	Christopher Lameter <cl@...two.org>, Adam Li <adam.li@...erecomputing.com>, 
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2] sched/fair: Prefer cache locality for EAS wakeup

On Thu, 13 Nov 2025 at 01:26, Shubhang Kaushik OS
<Shubhang@...amperecomputing.com> wrote:
>
> > From your previous answer on v1, I don't think that you use
> > heterogeneous system so eas will not be enabled in your case and even
> > when used find_energy_efficient_cpu() will be called before
>
> I agree that the EAS centric approach in the current patch is misplaced for our homogeneous systems.
>
> > Otherwise you might want to check in wake_affine() where we decide
> > between local cpu and previous cpu which one should be the target.
> > This can have an impact especially if there are not in the same LLC
>
> While wake_affine() modifications seem logical, I see that they cause performance regressions across the board due to the inherent trade-offs in altering that critical initial decision point.
> We might need to solve the non-idle fallback within `select_idle_sibling` to ring fence the impact for preserving locality effectively.

I'm confused about your topology. Could you share your scheduling topology?

Also, I'm not sure what problem you are trying to solve.
select_idle_sibling() is all about finding an idle CPU that shares
cache with target, where target is either the local cpu or the prev cpu.

If target is prev cpu, we select an idle CPU that shares cache with
prev, so cache locality should be preserved (at least at L3 level,
or even at cluster level if you have one).

If target is local cpu and it shares cache with prev, the result
should be similar to the above at L3 level, but maybe not at cluster level.

If target is local cpu and it doesn't share cache with prev, then
select_idle_sibling() is not the right place and you should look at
wake_affine() to favor prev cpu in some cases, to be defined.
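
As a rough illustration of the three cases above, here is a user-space
sketch, not kernel code. The toy topology tables and every name in it
(toy_llc, toy_cluster, toy_share_llc, toy_share_cluster, classify_wakeup)
are hypothetical and exist only for this example; the real checks go
through cpus_share_cache() and the sched domains.

```c
#include <stdbool.h>

/*
 * Hypothetical 8-CPU machine, assumed only for illustration:
 * two LLCs of four CPUs each, and two-CPU clusters inside each LLC.
 */
#define NR_TOY_CPUS 8
static const int toy_llc[NR_TOY_CPUS]     = { 0, 0, 0, 0, 1, 1, 1, 1 };
static const int toy_cluster[NR_TOY_CPUS] = { 0, 0, 1, 1, 2, 2, 3, 3 };

enum wake_case {
	TARGET_IS_PREV,     /* search around prev: LLC/cluster locality kept */
	TARGET_SHARES_LLC,  /* local cpu in prev's LLC: L3 locality kept */
	TARGET_REMOTE_LLC,  /* no shared cache: decided earlier, in wake_affine() */
};

/* Toy stand-in for "the two CPUs share a last-level cache". */
static bool toy_share_llc(int a, int b)
{
	return toy_llc[a] == toy_llc[b];
}

/* Toy stand-in for "the two CPUs sit in the same cluster". */
static bool toy_share_cluster(int a, int b)
{
	return toy_cluster[a] == toy_cluster[b];
}

/* Map a (prev, target) pair onto one of the three cases. */
static enum wake_case classify_wakeup(int prev, int target)
{
	if (target == prev)
		return TARGET_IS_PREV;
	if (toy_share_llc(prev, target))
		return TARGET_SHARES_LLC;
	return TARGET_REMOTE_LLC;
}
```

With these assumed tables, prev = 0 and target = 3 fall in the second
case (same LLC, different cluster), while prev = 0 and target = 5 fall
in the third, where nothing done inside select_idle_sibling() can bring
the task back to prev's cache.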

Thanks,
Vincent

>
> Thanks,
> Shubhang Kaushik
>
> ________________________________________
> From: Vincent Guittot <vincent.guittot@...aro.org>
> Sent: Monday, November 3, 2025 1:04 AM
> To: Shubhang Kaushik OS
> Cc: Ingo Molnar; Peter Zijlstra; Juri Lelli; Dietmar Eggemann; Steven Rostedt; Ben Segall; Mel Gorman; Valentin Schneider; Shubhang Kaushik; Shijie Huang; Frank Wang; Christopher Lameter; Adam Li; linux-kernel@...r.kernel.org
> Subject: Re: [PATCH v2] sched/fair: Prefer cache locality for EAS wakeup
>
> On Thu, 30 Oct 2025 at 20:19, Shubhang Kaushik via B4 Relay
> <devnull+shubhang.os.amperecomputing.com@...nel.org> wrote:
> >
> > From: Shubhang Kaushik <shubhang@...amperecomputing.com>
> >
> > When Energy Aware Scheduling (EAS) is enabled, a task waking up on a
> > sibling CPU might migrate away from its previous CPU even if that CPU
> > is not overutilized. This sacrifices cache locality and introduces
> > unnecessary migration overhead.
> >
> > This patch refines the wakeup heuristic in `select_idle_sibling()`. If
> > EAS is active and the task's previous CPU (`prev`) is not overutilized,
> > the scheduler will prioritize waking the task on `prev`, avoiding an
> > unneeded migration and preserving cache-hotness.
> >
> > ---
> > v2:
> > - Addressed reviewer comments to handle this special condition
> >   within the selection logic, prioritizing the
> >   previous CPU if not overutilized for EAS.
> > - Link to v1: https://lore.kernel.org/all/20251017-b4-sched-cfs-refactor-propagate-v1-1-1eb0dc5b19b3@os.amperecomputing.com/
> >
> > Signed-off-by: Shubhang Kaushik <shubhang@...amperecomputing.com>
> > ---
> >  kernel/sched/fair.c | 12 +++++++++---
> >  1 file changed, 9 insertions(+), 3 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 25970dbbb27959bc130d288d5f80677f75f8db8b..ac94463627778f09522fb5420f67b903a694ad4d 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -7847,9 +7847,7 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
> >             asym_fits_cpu(task_util, util_min, util_max, target))
> >                 return target;
> >
> > -       /*
> > -        * If the previous CPU is cache affine and idle, don't be stupid:
> > -        */
> > +       /* Reschedule on an idle, cache-sharing sibling to preserve affinity: */
> >         if (prev != target && cpus_share_cache(prev, target) &&
> >             (available_idle_cpu(prev) || sched_idle_cpu(prev)) &&
> >             asym_fits_cpu(task_util, util_min, util_max, prev)) {
> > @@ -7861,6 +7859,14 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
> >                 prev_aff = prev;
> >         }
> >
> > +       /*
> > +        * If the previous CPU is not overutilized, prefer it for cache locality.
> > +        * This prevents migration away from a cache-hot CPU that can still
> > +        * handle the task without causing an overload.
> > +        */
> > +       if (sched_energy_enabled() && !cpu_overutilized(prev))
>
> From your previous answer on v1, I don't think that you use
> heterogeneous system so eas will not be enabled in your case and even
> when used find_energy_efficient_cpu() will be called before
>
> select_idle_sibling looks for an idle cpu that shares the cache with
> target, Isn't such migration inside the same LLC good in your case ?
>
> Otherwise you might want to check in wake_affine() where we decide
> between local cpu and previous cpu which one should be the target.
> This can have an impact especially if there are not in the same LLC
>
> > +               return prev;
> > +
> >         /*
> >          * Allow a per-cpu kthread to stack with the wakee if the
> >          * kworker thread and the tasks previous CPUs are the same.
> >
> > ---
> > base-commit: e53642b87a4f4b03a8d7e5f8507fc3cd0c595ea6
> > change-id: 20251030-b4-follow-up-ff03b4533a2d
> >
> > Best regards,
> > --
> > Shubhang Kaushik <shubhang@...amperecomputing.com>
> >
> >
