[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20231020090606.GA31411@noisy.programming.kicks-ass.net>
Date: Fri, 20 Oct 2023 11:06:06 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: K Prateek Nayak <kprateek.nayak@....com>
Cc: Anna-Maria Behnsen <anna-maria@...utronix.de>,
linux-kernel@...r.kernel.org, John Stultz <jstultz@...gle.com>,
Thomas Gleixner <tglx@...utronix.de>,
Eric Dumazet <edumazet@...gle.com>,
"Rafael J . Wysocki" <rafael.j.wysocki@...el.com>,
Arjan van de Ven <arjan@...radead.org>,
"Paul E . McKenney" <paulmck@...nel.org>,
Frederic Weisbecker <frederic@...nel.org>,
Rik van Riel <riel@...riel.com>,
Steven Rostedt <rostedt@...dmis.org>,
Sebastian Siewior <bigeasy@...utronix.de>,
Giovanni Gherdovich <ggherdovich@...e.cz>,
Lukasz Luba <lukasz.luba@....com>,
"Gautham R . Shenoy" <gautham.shenoy@....com>,
Srinivas Pandruvada <srinivas.pandruvada@...el.com>
Subject: Re: [PATCH v8 00/25] timer: Move from a push remote at enqueue to a
pull at expiry model
On Fri, Oct 06, 2023 at 10:35:43AM +0530, K Prateek Nayak wrote:
> Hello Anna-Maria,
>
> On 10/4/2023 6:04 PM, Anna-Maria Behnsen wrote:
> > [..snip..]
> >
> > Ping Pong Oberservation
> > ^^^^^^^^^^^^^^^^^^^^^^^
> >
> > During testing on a mostly idle machine a ping pong game could be observed:
> > a process_timeout timer is expired remotely on a non idle CPU. Then the CPU
> > where the schedule_timeout() was executed to enqueue the timer comes out of
> > idle and restarts the timer using schedule_timeout() and goes back to idle
> > again. This is due to the fair scheduler which tries to keep the task on
> > the CPU which it previously executed on.
>
> Regarding above, are you referring to "wake_up_process(timeout->task)" in
> "process_timeout()" ends up waking the task on an idle CPU instead of the
> CPU where process_timeout() ran?
>
> In which case, have you tried using the "WF_CURRENT_CPU" flag for the
> wakeup? (landed upstream in v6.6-rc1) It is only used by wait queues in
> kernel/sched/wait.c currently but perhaps we can have a
> "wake_up_process_on_current_cpu()" that process_timeout() can call.
>
> Something along the lines of:
>
> int wake_up_process_on_current_cpu(struct task_struct *p)
> {
> return try_to_wake_up(p, TASK_NORMAL, WF_CURRENT_CPU);
> }
> EXPORT_SYMBOL(wake_up_process_on_current_cpu);
>
> Thoughts?
Yeah, we should definitely not export such a function. Also,
WF_CURRENT_CPU should be used sparingly.
So restarting the task on the previous CPU is done because of cache
affinity and is typically the right thing to do if you care about
performance.
The first question to ask is where this schedule_timeout() is coming
from. Is this some daft userspace that is polling on something that
eventually ends up in schedule_timeout() ?
Can we fix the userspace to not do silly things like that?
The alternative is adding heuristics and we all know where that ends :/
Powered by blists - more mailing lists