linux-kernel - Re: [PATCH v6 19/21] timer: Implement the hierarchical pull model

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <ZGNJq_ZITmZ5YciL@localhost.localdomain>
Date:   Tue, 16 May 2023 11:15:23 +0200
From:   Frederic Weisbecker <frederic@...nel.org>
To:     Sebastian Siewior <bigeasy@...utronix.de>
Cc:     Anna-Maria Behnsen <anna-maria@...utronix.de>,
        linux-kernel@...r.kernel.org,
        Peter Zijlstra <peterz@...radead.org>,
        John Stultz <jstultz@...gle.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Eric Dumazet <edumazet@...gle.com>,
        "Rafael J . Wysocki" <rafael.j.wysocki@...el.com>,
        Arjan van de Ven <arjan@...radead.org>,
        "Paul E . McKenney" <paulmck@...nel.org>,
        Frederic Weisbecker <fweisbec@...il.com>,
        Rik van Riel <riel@...riel.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Giovanni Gherdovich <ggherdovich@...e.cz>,
        Lukasz Luba <lukasz.luba@....com>,
        "Gautham R . Shenoy" <gautham.shenoy@....com>
Subject: Re: [PATCH v6 19/21] timer: Implement the hierarchical pull model

Le Mon, May 15, 2023 at 12:19:36PM +0200, Sebastian Siewior a écrit :
> On 2023-05-10 12:32:53 [+0200], Frederic Weisbecker wrote:
> > In the case of !PREEMPT_RT, I suppose it's impossible for the target
> > CPU to be offline. You checked above tmc->online and in-between the
> > call to timer_lock_remote_bases(), the path is BH-disabled, this prevents
> > stop_machine from running and from setting the CPU as offline.
> 
> I think you refer to the last one invoked from takedown_cpu(). This does
> not matter, see below.
> 
> What bothers me is that _current_ CPU is check for cpu_is_offline() and
> not the variable 'cpu'. Before the check timer_expire_remote() is
> invoked on 'cpu' and not on current.

Oh right!

> 
> > However in PREEMPT_RT, ksoftirqd (or timersd) is preemptible, so it seems
> > that it could happen in theory. And that could create a locking imbalance.
> 
> The ksoftirqd thread is part of smpboot_park_threads(). They have to
> stop running and clean up before the machinery continues bringing down
> the CPU (that is before takedown_cpu()). On the way down we have:
> - tmigr_cpu_offline() followed by
> - smpboot_park_threads().
> 
> So ksoftirqd (preempted or not) finishes before. This is for the
> _target_ CPU.

Ok I forgot about the smpboot cleanup part.

> 
> After the "tmc->online" check the lock is dropped and this is invoked
> from run_timer_softirq(). That means that _this_ CPU could get preempted
> (by an IRQ for instance) at this point, and once the CPU gets back here,
> the remote CPU (as specified in `cpu') can already be offline by the
> time timer_lock_remote_bases() is invoked.
> 
> So RT or not, this is racy.

Well, all CPUs must schedule to stop machine on take_cpu_down().

So:

  //CPU 1                            //CPU 2
  softirq()
     tmigr_handle_remote_cpu()
        LOCK(tmc->lock)
           if (!tmc->online)
              UNLOCK(tmc->lock)
              return;
                                     cpu_down()
                                        tmigr_cpu_offline()
                                           LOCK(tmc->lock)
                                           tmc->online = 0
                                           UNLOCK(tmc->lock)
                                        stop_machine()
                                           //wait for CPU 1  
                                           poll on MULTI_STOP_PREPARE
        if (cpu_is_offline(2))
           //not possible
   //end of softirq
   stop_machine()
                                           set MULTI_STOP_PREPARE
                                           ...
                                           set_cpu_online(0)


Things should be fine on !RT but I may easily be missing something.

As for RT it should be fine as well as you pointed out since CPU 1
can be preempted but the CPU still needs to park the kthreads before joining
the stop machine party.

> 
> > My suggestion would be to unconditionally lock the bases, you already checked if
> > !tmc->online before. The remote CPU may have gone down since then because the
> > tmc lock has been relaxed but it should be rare enough that you don't care
> > about optimizing with a lockless check. So you can just lock the bases,
> > lock the tmc and check again if tmc->online. If not then you can just ignore
> > the tmigr_new_timer_up call and propagation.
> 
> Regardless the previous point, this still looks odd as you pointed out.
> The return code is ignored and the two functions perform lock + unlock
> depending on it.

Agreed!