Message-ID: <20190625020506.GQ26519@linux.ibm.com>
Date:   Mon, 24 Jun 2019 19:05:06 -0700
From:   "Paul E. McKenney" <paulmck@...ux.ibm.com>
To:     Frederic Weisbecker <frederic@...nel.org>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        linux-kernel@...r.kernel.org, mingo@...hat.com, tglx@...utronix.de
Subject: Re: [PATCH] time/tick-broadcast: Fix tick_broadcast_offline()
 lockdep complaint

On Tue, Jun 25, 2019 at 02:43:00AM +0200, Frederic Weisbecker wrote:
> On Mon, Jun 24, 2019 at 04:44:22PM -0700, Paul E. McKenney wrote:
> > On Tue, Jun 25, 2019 at 01:12:23AM +0200, Frederic Weisbecker wrote:
> > > On Fri, Jun 21, 2019 at 04:46:02PM -0700, Paul E. McKenney wrote:
> > > > @@ -3097,13 +3126,21 @@ static void sched_tick_remote(struct work_struct *work)
> > > >  	/*
> > > >  	 * Run the remote tick once per second (1Hz). This arbitrary
> > > >  	 * frequency is large enough to avoid overload but short enough
> > > > -	 * to keep scheduler internal stats reasonably up to date.
> > > > +	 * to keep scheduler internal stats reasonably up to date.  But
> > > > +	 * first update state to reflect hotplug activity if required.
> > > >  	 */
> > > > +	os = atomic_read(&twork->state);
> > > > +	if (os) {
> > > > +		WARN_ON_ONCE(os != TICK_SCHED_REMOTE_OFFLINING);
> > > > +		if (atomic_inc_not_zero(&twork->state))
> > > > +			return;
> > > 
> > > Using inc makes me a bit nervous here. If we do so, we should somehow
> > > make sure that we never exceed TICK_SCHED_REMOTE_OFFLINE
> > > by accident.
> > > 
> > > atomic_xchg() is probably a bit costlier but also safer as it allows
> > > us to check both the old and the new value. That path shouldn't be critically fast
> > > after all.
> > 
> > It would need to be cmpxchg() to avoid messing with the state if
> > the state were somehow TICK_SCHED_REMOTE_RUNNING, right?
> 
> Ah indeed! Never mind, let's keep things as they are then.
> 
> > > > +	}
> > > >  	queue_delayed_work(system_unbound_wq, dwork, HZ);
> > > >  }
> > > >  
> > > >  static void sched_tick_start(int cpu)
> > > >  {
> > > > +	int os;
> > > >  	struct tick_work *twork;
> > > >  
> > > >  	if (housekeeping_cpu(cpu, HK_FLAG_TICK))
> > > > @@ -3112,15 +3149,20 @@ static void sched_tick_start(int cpu)
> > > >  	WARN_ON_ONCE(!tick_work_cpu);
> > > >  
> > > >  	twork = per_cpu_ptr(tick_work_cpu, cpu);
> > > > -	twork->cpu = cpu;
> > > > -	INIT_DELAYED_WORK(&twork->work, sched_tick_remote);
> > > > -	queue_delayed_work(system_unbound_wq, &twork->work, HZ);
> > > > +	os = atomic_xchg(&twork->state, TICK_SCHED_REMOTE_RUNNING);
> > > > +	WARN_ON_ONCE(os == TICK_SCHED_REMOTE_RUNNING);
> > > 
> > > See, if we use atomic_inc(), we would also need to WARN(os > TICK_SCHED_REMOTE_OFFLINE).
> > 
> > How about if I put that WARN() between the atomic_inc_not_zero() and
> > the return, presumably also adding braces?
> 
> Yeah, unfortunately there is no atomic_add_not_zero_return().
> I guess we can live with a check using atomic_read(). In the best
> case it returns the fresh increment, otherwise it should be REMOTE_RUNNING.
> 
> In any case the (os > TICK_SCHED_REMOTE_OFFLINE) check applies.

True, so with high probability a warning would be emitted.  Fair enough?

							Thanx, Paul
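
For readers following the inc-versus-cmpxchg discussion above, here is a rough userspace sketch in C11 atomics, not kernel code. The numeric encoding (RUNNING = 0, OFFLINING = 1, OFFLINE = 2) is an assumption inferred from how the thread uses atomic_inc_not_zero() and the "os > TICK_SCHED_REMOTE_OFFLINE" check; it is not quoted from the patch. inc_not_zero(), offlining_to_offline() and state_out_of_range() are hypothetical helper names that only approximate the kernel primitives being discussed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Assumed encoding, inferred from the thread; not taken from the patch. */
enum {
	TICK_SCHED_REMOTE_RUNNING   = 0,
	TICK_SCHED_REMOTE_OFFLINING = 1,
	TICK_SCHED_REMOTE_OFFLINE   = 2,
};

/*
 * Userspace stand-in for atomic_inc_not_zero(): increment the value
 * unless it is zero.  Returns true if the increment happened.
 */
static bool inc_not_zero(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;	/* value was zero (RUNNING): left untouched */
}

/*
 * Hypothetical cmpxchg-style alternative: move OFFLINING -> OFFLINE and
 * nothing else.  Returns the value observed before the attempt, so the
 * caller can check both the old and (implied) new state.
 */
static int offlining_to_offline(atomic_int *v)
{
	int expected = TICK_SCHED_REMOTE_OFFLINING;

	atomic_compare_exchange_strong(v, &expected, TICK_SCHED_REMOTE_OFFLINE);
	return expected;
}

/*
 * The follow-up check discussed above: after an increment, a plain read
 * either sees the fresh value or a later one, so it can only catch an
 * overshoot with high probability, not with certainty.
 */
static bool state_out_of_range(atomic_int *v)
{
	return atomic_load(v) > TICK_SCHED_REMOTE_OFFLINE;
}
```

With this encoding, the increment variant leaves a RUNNING state alone but would happily push OFFLINE to 3, which is why the extra range check comes up; the compare-and-exchange variant cannot overshoot, at the cost of spelling out the expected old state.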