Message-ID: <aBtRSSCxyHcypo4b@localhost.localdomain>
Date: Wed, 7 May 2025 14:25:45 +0200
From: Frederic Weisbecker <frederic@...nel.org>
To: Gabriele Monaco <gmonaco@...hat.com>
Cc: linux-kernel@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
Waiman Long <longman@...hat.com>
Subject: Re: [PATCH v4 2/5] timers: Add the available mask in timer migration
On Wed, May 07, 2025 at 09:57:38AM +0200, Gabriele Monaco wrote:
>
>
> On Tue, 2025-05-06 at 18:07 +0200, Frederic Weisbecker wrote:
> > On Tue, May 06, 2025 at 11:15:37AM +0200, Gabriele Monaco wrote:
> > > Keep track of the CPUs available for timer migration in a cpumask.
> > > This
> > > prepares the ground to generalise the concept of unavailable CPUs.
> > >
> > > Signed-off-by: Gabriele Monaco <gmonaco@...hat.com>
> > > ---
> > > kernel/time/timer_migration.c | 12 +++++++++++-
> > > 1 file changed, 11 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/kernel/time/timer_migration.c b/kernel/time/timer_migration.c
> > > index 7efd897c7959..25439f961ccf 100644
> > > --- a/kernel/time/timer_migration.c
> > > +++ b/kernel/time/timer_migration.c
> > > @@ -422,6 +422,9 @@ static unsigned int tmigr_crossnode_level __read_mostly;
> > >
> > > static DEFINE_PER_CPU(struct tmigr_cpu, tmigr_cpu);
> > >
> > > +/* CPUs available for timer migration */
> > > +static cpumask_var_t tmigr_available_cpumask;
> > > +
> > > #define TMIGR_NONE 0xFF
> > > #define BIT_CNT 8
> > >
> > > @@ -1449,6 +1452,7 @@ static int tmigr_cpu_unavailable(unsigned int cpu)
> > > raw_spin_lock_irq(&tmc->lock);
> > > tmc->available = false;
> > > WRITE_ONCE(tmc->wakeup, KTIME_MAX);
> > > + cpumask_clear_cpu(cpu, tmigr_available_cpumask);
> > >
> > > /*
> > > * CPU has to handle the local events on his own, when on the way to
> > > @@ -1459,7 +1463,7 @@ static int tmigr_cpu_unavailable(unsigned int cpu)
> > > raw_spin_unlock_irq(&tmc->lock);
> > >
> > > if (firstexp != KTIME_MAX) {
> > > - migrator = cpumask_any_but(cpu_online_mask, cpu);
> > > + migrator = cpumask_any(tmigr_available_cpumask);
> >
> > Considering that nohz_full CPUs should still be available:
> >
> > I don't think there is anything ensuring that, in nohz_full mode,
> > there must be at least one housekeeping CPU that is not domain
> > isolated.
> >
> > For example, if we have two CPUs, with CPU 0 being domain isolated
> > and CPU 1 being nohz_full, then there is no migrator left to handle
> > CPU 1's global timers.
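> >
> > To make that concrete, hypothetical boot options for such a two-CPU
> > box would be:
> >
> > 	isolcpus=domain,0 nohz_full=1
> >
> > Assuming the rest of this series drops domain isolated CPUs from
> > tmigr_available_cpumask, only the nohz_full CPU 1 is left in the
> > mask, and it cannot be relied upon to handle its own global timers
> > while running tickless.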
> >
>
> Mmh, good point, I didn't think about the domain isolated and
> nohz_full masks being disjoint...
>
> When that's really the case, how do you think we should fall back?
>
> In the situation you describe, no one is going to be able to handle
> global timers on the nohz_full CPUs, right?
>
> When this situation really occurs, we could keep one of the domain
> isolated CPUs in the hierarchy.
> Now, I see that on x86 CPU0 cannot be offlined and is not added to
> nohz_full, which would make things considerably easier, but ARM
> doesn't seem to work the same way.
>
> We could elect a lucky winner (e.g. the first or last CPU becoming
> domain isolated) and swap it whenever it goes offline, until we
> actually run out of candidates (no non-nohz_full CPU is left
> online), but I believe this shouldn't happen...
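>
> A rough, untested sketch of that idea (helper name made up, not part
> of this series):
>
> 	/*
> 	 * Hypothetical "lucky winner" fallback: refuse to clear a CPU
> 	 * from the available mask when it is the last potential
> 	 * migrator left.
> 	 */
> 	static bool tmigr_clear_available(unsigned int cpu)
> 	{
> 		if (cpumask_any_but(tmigr_available_cpumask, cpu) >= nr_cpu_ids)
> 			return false;	/* keep it as fallback migrator */
> 		cpumask_clear_cpu(cpu, tmigr_available_cpumask);
> 		return true;
> 	}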
>
> Does this make sense to you?
Well, nohz_full= and isolcpus=, when they are passed together, must contain the
same set of CPUs. And if there is no housekeeping CPU left, one is forced, so
that case is already handled at boot.
But if nohz_full= is passed on boot and cpusets later create an isolated
partition which spans the housekeeping set, then the isolated partition must
be rejected.
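
A rough sketch of the kind of check I mean (names assumed, this is not
the actual cpuset code; HK_TYPE_KERNEL_NOISE stands for the nohz_full
housekeeping mask):

	/*
	 * Reject an isolated partition that would swallow every
	 * housekeeping CPU, leaving no migrator for the global
	 * timers of nohz_full CPUs.
	 */
	if (cpumask_subset(housekeeping_cpumask(HK_TYPE_KERNEL_NOISE),
			   new_isolated_cpus))
		return -EINVAL;
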
Thanks.
--
Frederic Weisbecker
SUSE Labs