lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZIeMCguf4dZsHwB9@localhost.localdomain>
Date:   Mon, 12 Jun 2023 23:20:10 +0200
From:   Frederic Weisbecker <frederic@...nel.org>
To:     Anna-Maria Behnsen <anna-maria@...utronix.de>
Cc:     linux-kernel@...r.kernel.org,
        Peter Zijlstra <peterz@...radead.org>,
        John Stultz <jstultz@...gle.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Eric Dumazet <edumazet@...gle.com>,
        "Rafael J . Wysocki" <rafael.j.wysocki@...el.com>,
        Arjan van de Ven <arjan@...radead.org>,
        "Paul E . McKenney" <paulmck@...nel.org>,
        Frederic Weisbecker <fweisbec@...il.com>,
        Rik van Riel <riel@...riel.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Sebastian Siewior <bigeasy@...utronix.de>,
        Giovanni Gherdovich <ggherdovich@...e.cz>,
        Lukasz Luba <lukasz.luba@....com>,
        "Gautham R . Shenoy" <gautham.shenoy@....com>
Subject: Re: [PATCH v7 19/21] timer: Implement the hierarchical pull model

Le Mon, Jun 12, 2023 at 05:57:00PM +0200, Anna-Maria Behnsen a écrit :
> Ok. This problem is around only for nohz_full and not for nohz idle?

Only nohz_full right.

> 
> > The crude trick used by nohz_full in order to re-evaluate the dynticks when
> > a timer is queued from the timer softirq is to launch a self IPI (from
> > trigger_dyntick_cpu()). Here you would need something like that but
> > that's not something we wish either.
> > 
> > In fact we don't want any nohz_full CPU to perform an idle migrator job.
> > And the problem here is that whenever a timer interrupt occurs while
> > tmc->idle == 1 (and that _might_ happen in nohz_full), it will go up the
> > hierarchy as long as there is no active migrator on a given level and
> > check for remote expiry. And if something expired it will not only perform
> > the remote callbacks handling but also reprogram the next tick to fire in
> > the next expiry. That's a potential disturbance on a nohz_full CPU.
> > 
> > There is always an active CPU in nohz_full so there is always an active
> > migrator at least at the top level. Therefore you can spare concurrent
> > idle migrators if they are nohz_full.
> >
> 
> As long as the top level group is not/never idle, the wakeup value will be
> KTIME_MAX and so it is no problem for nohz_full cpus - or? The check for
> handling remote expiry is still a problem but please read my proposal for
> this below.

Good point, I overlooked the fact that data->nextevt is only set if the
top level has no migrator.

So the only issue is that a nohz_full CPU may accidentally check/do the
remote expiry as a loose idle migrator. Which can add noise, etc...

> 
> > My suggestion is to make tmigr_requires_handle_remote() return 0 if
> > tick_nohz_full_cpu(smp_processor_id()), so that we don't even bother
> > raising the softirq. But if the timer softirq happens nevertheless, due
> > to some local timer to process, also make tmigr_handle_remote() to
> > return early.
> 
> Regarding this problem and also the two things you mentioned in the two
> earlier review remarks (timer IRQ which fires too early, IPI when CPU goes
> offline), I would propose to use the tmc->wakeup value slightly different
> as it is used right now:
> 
>  - Whenever a wakeup value is required, because top level group is
>    completely idle, the value is set in per CPU tmc struct. It could be
>    then reevaluated in idle code in irq exit path.

So you want to force reevaluation of tmc->wakeup unconditionally on
deactivate time through tmigr_new_timer(), right? That would indeed work
in any case. At the cost of some more overhead in the idle interrupt path,
but perhaps hardly measurable...

The alternative would be to reset tmc->wakeup only when that deadline is
reached in tmigr_requires_handle_remote(). And then have a special case
in the offlining case. That's less pretty of course.

> 
>  - For checking whether remote expiry is required, the wakeup value could
>    also be used.

Right.

> 
>  - For nohz_full CPUs wakeup will be always KTIME_MAX - under the
>    assumption that there is alwasy an active CPU in top level group.

Sounds good!

Thanks.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ