lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANDhNCosvML9XKH8HV1QBXKzxt3zMWxzsPSUfYxoyPCORf6Krw@mail.gmail.com>
Date: Wed, 9 Apr 2025 19:41:03 -0700
From: John Stultz <jstultz@...gle.com>
To: hupu <hupu.gm@...il.com>
Cc: linux-kernel@...r.kernel.org, juri.lelli@...hat.com, peterz@...radead.org, 
	vschneid@...hat.com, mingo@...hat.com, vincent.guittot@...aro.org, 
	dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com, 
	mgorman@...e.de, hupu@...nssion.com
Subject: Re: [RFC 1/1] sched: Skip redundant operations for proxy tasks
 needing return migration

On Mon, Apr 7, 2025 at 6:47 AM hupu <hupu.gm@...il.com> wrote:
>
> Move the proxy_needs_return() check earlier in ttwu_runnable() to avoid
> unnecessary scheduling operations when a proxy task requires return
> migration to its original CPU.
>
> The current implementation performs several operations (rq clock update,
> enqueue, and wakeup preemption checks) before checking for return
> migration needs. This is inefficient because:
>
> 1. For tasks needing return migration, these operations are redundant
>    since the task will be dequeued from current rq anyway
> 2. The task may not even be allowed to run on current CPU due to
>    possible affinity changes during blocking
> 3. The proper CPU selection will be handled by select_task_rq() in
>    the subsequent try_to_wake_up() logic
>
> By moving the proxy_needs_return() check to the beginning, we:
> - Avoid unnecessary rq clock updates
> - Skip redundant enqueue operations
> - Eliminate meaningless wakeup preemption checks
> - Let the normal wakeup path handle proper CPU selection
>
> This optimization is particularly valuable in proxy execution scenarios
> where tasks frequently migrate between CPUs.

Hey hupu! Thanks again for your review of the proxy series and your
sharing of this improvement!

The proposal sounds attractive, however...

> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index ca4ca739eb85..ebb4bc1800e3 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -4162,6 +4162,10 @@ static int ttwu_runnable(struct task_struct *p, int wake_flags)
>
>         rq = __task_rq_lock(p, &rf);
>         if (task_on_rq_queued(p)) {
> +               if (proxy_needs_return(rq, p)) {
> +                       _trace_sched_pe_return_migration(p);
> +                       goto out;
> +               }
>                 update_rq_clock(rq);
>                 if (p->se.sched_delayed) {
>                         proxy_remove_from_sleeping_owner(p);
> @@ -4174,10 +4178,6 @@ static int ttwu_runnable(struct task_struct *p, int wake_flags)
>                          */
>                         wakeup_preempt(rq, p, wake_flags);
>                 }
> -               if (proxy_needs_return(rq, p)) {
> -                       _trace_sched_pe_return_migration(p);
> -                       goto out;
> -               }
>                 ttwu_do_wakeup(p);
>                 ret = 1;
>         }
> --

Unfortunately this patch crashes pretty quickly in my testing. The
first issue was proxy_needs_return() calls deactivate_task() w/
DEQUEUE_NOCLOCK, which causes warnings when the update_rq_clock()
hasn't been called. Preserving the update_rq_clock() line before
checking proxy_needs_return() avoided that issue, but then I saw hangs
during bootup, which I suspect is due to us shortcutting over the
sched_delayed case.

Moving the proxy_needs_return above the if(task_on_cpu())
wakeup_preempt() logic booted ok, but I'm still a little hesitant of
what side-effects that might cause.

Part of the approach with proxy_needs_return() from ttwu_runnable() is
making sure everything is ready for the queued task to run, but in the
case where we need return migration, to dequeue the task and cause
ttwu_runnable() to fail, forcing the rest of try_to_wake_up() to
complete.  I do really appreciate your thoughts here though, as I
agree that the potential enqueue/dequeue/enqueue could be wasteful.

I'll take a bit more time to see how we might be able to avoid this
extra work without skipping needed steps. If you have further ideas,
I'd love to hear them!

thanks again!
-john

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ