lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 12 Jun 2023 17:57:00 +0200 (CEST)
From:   Anna-Maria Behnsen <anna-maria@...utronix.de>
To:     Frederic Weisbecker <frederic@...nel.org>
cc:     linux-kernel@...r.kernel.org,
        Peter Zijlstra <peterz@...radead.org>,
        John Stultz <jstultz@...gle.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Eric Dumazet <edumazet@...gle.com>,
        "Rafael J . Wysocki" <rafael.j.wysocki@...el.com>,
        Arjan van de Ven <arjan@...radead.org>,
        "Paul E . McKenney" <paulmck@...nel.org>,
        Frederic Weisbecker <fweisbec@...il.com>,
        Rik van Riel <riel@...riel.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Sebastian Siewior <bigeasy@...utronix.de>,
        Giovanni Gherdovich <ggherdovich@...e.cz>,
        Lukasz Luba <lukasz.luba@....com>,
        "Gautham R . Shenoy" <gautham.shenoy@....com>
Subject: Re: [PATCH v7 19/21] timer: Implement the hierarchical pull model

On Mon, 12 Jun 2023, Frederic Weisbecker wrote:

> Le Wed, May 24, 2023 at 09:06:27AM +0200, Anna-Maria Behnsen a écrit :
> > +void tmigr_handle_remote(void)
> > +{
> > +	struct tmigr_cpu *tmc = this_cpu_ptr(&tmigr_cpu);
> > +	struct tmigr_remote_data data;
> > +
> > +	if (!is_tmigr_enabled() || !tmc->tmgroup || !tmc->online)
> > +		return;
> > +
> > +	/*
> > +	 * NOTE: This is a doubled check because migrator test will be done
> > +	 * in tmigr_handle_remote_up() anyway. Keep this check to fasten
> > +	 * the return when nothing has to be done.
> > +	 */
> > +	if (!tmigr_check_migrator(tmc->tmgroup, tmc->childmask))
> > +		return;
> > +
> > +	data.now = get_jiffies_update(&data.basej);
> > +	data.childmask = tmc->childmask;
> > +	data.wakeup = KTIME_MAX;
> > +
> > +	__walk_groups(&tmigr_handle_remote_up, &data, tmc);
> > +
> > +	raw_spin_lock_irq(&tmc->lock);
> > +	if (tmc->idle)
> > +		tmc->wakeup = data.wakeup;
> 
> So this assumes there will be a timer re-eavluation right after
> in the idle path but this is not true in the case of nohz_full.
> 
> If the softirq happens at the tail of the timer interrupt, you're
> good because the dynticks is re-evaluated right after. But if the
> softirq happens in ksoftirqd, you'll have no timer re-evaluation.

Ok. This problem is around only for nohz_full and not for nohz idle?

> The crude trick used by nohz_full in order to re-evaluate the dynticks when
> a timer is queued from the timer softirq is to launch a self IPI (from
> trigger_dyntick_cpu()). Here you would need something like that but
> that's not something we wish either.
> 
> In fact we don't want any nohz_full CPU to perform an idle migrator job.
> And the problem here is that whenever a timer interrupt occurs while
> tmc->idle == 1 (and that _might_ happen in nohz_full), it will go up the
> hierarchy as long as there is no active migrator on a given level and
> check for remote expiry. And if something expired it will not only perform
> the remote callbacks handling but also reprogram the next tick to fire in
> the next expiry. That's a potential disturbance on a nohz_full CPU.
> 
> There is always an active CPU in nohz_full so there is always an active
> migrator at least at the top level. Therefore you can spare concurrent
> idle migrators if they are nohz_full.
>

As long as the top level group is not/never idle, the wakeup value will be
KTIME_MAX and so it is no problem for nohz_full cpus - or? The check for
handling remote expiry is still a problem but please read my proposal for
this below.

> My suggestion is to make tmigr_requires_handle_remote() return 0 if
> tick_nohz_full_cpu(smp_processor_id()), so that we don't even bother
> raising the softirq. But if the timer softirq happens nevertheless, due
> to some local timer to process, also make tmigr_handle_remote() to
> return early.

Regarding this problem and also the two things you mentioned in the two
earlier review remarks (timer IRQ which fires too early, IPI when CPU goes
offline), I would propose to use the tmc->wakeup value slightly different
as it is used right now:

 - Whenever a wakeup value is required, because top level group is
   completely idle, the value is set in per CPU tmc struct. It could be
   then reevaluated in idle code in irq exit path.

 - For checking whether remote expiry is required, the wakeup value could
   also be used.

 - For nohz_full CPUs wakeup will be always KTIME_MAX - under the
   assumption that there is alwasy an active CPU in top level group.

What do you think about this? Do I miss something else here?

Thanks,

	Anna-Maria

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ