Message-ID: <ZMD5xyxPUkKCDlVQ@localhost.localdomain>
Date: Wed, 26 Jul 2023 12:47:35 +0200
From: Frederic Weisbecker <frederic@...nel.org>
To: Anna-Maria Behnsen <anna-maria@...utronix.de>
Cc: "Rafael J. Wysocki" <rafael@...nel.org>,
Vincent Guittot <vincent.guittot@...aro.org>,
Peter Zijlstra <peterz@...radead.org>,
linux-kernel@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
"Gautham R. Shenoy" <gautham.shenoy@....com>,
Ingo Molnar <mingo@...hat.com>,
Juri Lelli <juri.lelli@...hat.com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
Valentin Schneider <vschneid@...hat.com>
Subject: Re: Stopping the tick on a fully loaded system
On Tue, Jul 25, 2023 at 03:07:05PM +0200, Anna-Maria Behnsen wrote:
> The worst case scenario will not happen, because remote timer expiry only
> happens when CPU is not active in the hierarchy. And with your proposal
> this is valid after tick_nohz_stop_tick().
>
> Nevertheless, I see some problems with this. But this also depends on whether
> there is a need to change the current idle behavior or not. Right now, these
> are my concerns:
>
> - The determinism of tick_nohz_next_event() will break: The return value of
> tick_nohz_next_event() will not take into account whether it is the last CPU
> going idle and therefore has to take care of remote timers. So the first timer
> of the CPU (regardless of global or local) has to be handed back even if
> it could be handled by the hierarchy.
Bah, of course...
>
> - When moving the tmigr_cpu_deactivate() to tick_nohz_stop_tick() and the
> return value of tmigr_cpu_deactivate() is before the ts->next_tick, the
> expiry has to be modified in tick_nohz_stop_tick().
>
> - The load is simply moved to a later place - tick_nohz_stop_tick() is
> never called without a preceding tick_nohz_next_event() call. Yes,
> tick_nohz_next_event() is called under load ~8% more than
> tick_nohz_stop_tick(), but the 'quality' of the return value of
> tick_nohz_next_event() is getting worse.
>
> - timer migration hierarchy is not a standalone timer infrastructure. It
> only makes sense to handle it in combination with the existing timer
> wheel. When the timer base is idle, the timer migration hierarchy with
> the migrators will do the job for global timers. So, I'm not sure about
> the impact of the changed locking - but I'm pretty sure changing that
> increases the probability for ugly races hidden somewhere between the
> lines.
Sure thing, and this won't be pretty.
>
> Thanks,
>
> Anna-Maria
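
To make the second concern above concrete, here is a minimal sketch of the
expiry adjustment that tick_nohz_stop_tick() would need if the deactivation
moved there. It is not kernel code: sketch_ktime_t and sketch_adjust_expiry()
are hypothetical names, and it assumes that a tmigr_cpu_deactivate()-like call
returns the earliest expiry this CPU must still handle, or a "max" value when
there is none.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's ktime_t and KTIME_MAX. */
typedef int64_t sketch_ktime_t;
#define SKETCH_KTIME_MAX INT64_MAX

/*
 * Assumption: next_tick is the expiry already computed by a
 * tick_nohz_next_event()-like step, and hierarchy_expiry is what a
 * tmigr_cpu_deactivate()-like call hands back afterwards
 * (SKETCH_KTIME_MAX when the hierarchy has nothing for this CPU).
 * If the hierarchy expiry is earlier, the tick must be programmed
 * for it instead of the previously computed value.
 */
sketch_ktime_t sketch_adjust_expiry(sketch_ktime_t next_tick,
                                    sketch_ktime_t hierarchy_expiry)
{
	return hierarchy_expiry < next_tick ? hierarchy_expiry : next_tick;
}
```

With these assumptions, the value computed earlier is only an upper bound on
the final expiry, which is the loss of determinism described in the first
concern.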