Message-ID: <20150730215527.GQ25159@twins.programming.kicks-ass.net>
Date: Thu, 30 Jul 2015 23:55:27 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Oleg Nesterov <oleg@...hat.com>
Cc: Ingo Molnar <mingo@...nel.org>, Rik van Riel <riel@...hat.com>,
Tejun Heo <tj@...nel.org>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 6/6] stop_machine: kill stop_cpus_lock and lg_double_lock/unlock()
On Tue, Jul 21, 2015 at 09:22:47PM +0200, Oleg Nesterov wrote:
> +static int cpu_stop_queue_two_works(int cpu1, struct cpu_stop_work *work1,
> + int cpu2, struct cpu_stop_work *work2)
> +{
> + struct cpu_stopper *stopper1 = per_cpu_ptr(&cpu_stopper, cpu1);
> + struct cpu_stopper *stopper2 = per_cpu_ptr(&cpu_stopper, cpu2);
> + int err;
> +retry:
> + spin_lock_irq(&stopper1->lock);
> + spin_lock_nested(&stopper2->lock, SINGLE_DEPTH_NESTING);
> + /*
> + * If we observe both CPUs active we know _cpu_down() cannot yet have
> + * queued its stop_machine works and therefore ours will get executed
> + * first. Or its not either one of our CPUs that's getting unplugged,
> + * in which case we don't care.
> + */
> + err = -ENOENT;
> + if (!cpu_active(cpu1) || !cpu_active(cpu2))
> + goto unlock;
> +
> + WARN_ON(!stopper1->enabled || !stopper2->enabled);
> + /*
> + * Ensure that if we race with stop_cpus() the stoppers won't
> + * get queued up in reverse order, leading to system deadlock.
> + */
> + err = -EDEADLK;
> + if (stop_work_pending(stopper1) != stop_work_pending(stopper2))
> + goto unlock;
You could DoS this (or at least trigger a false positive) by running
stop_one_cpu() in a loop, thereby 'always' having work pending on one
stopper but not the other. (Doing so is obviously daft for other reasons.)
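To make the false positive concrete, here is a minimal userspace sketch of
the check and the retry loop. It is not kernel code: struct model_stopper,
model_queue_two_works() and model_stop_two_cpus() are hypothetical names,
and stop_work_pending() is assumed to reduce to a single "has work queued"
flag per stopper.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct cpu_stopper; only "pending" is modeled. */
struct model_stopper {
	bool pending;
};

/* Mirrors the quoted check: refuse to queue when only one side has work. */
static int model_queue_two_works(struct model_stopper *s1,
				 struct model_stopper *s2)
{
	if (s1->pending != s2->pending)
		return -EDEADLK;

	s1->pending = true;
	s2->pending = true;
	return 0;
}

/*
 * Models the retry loop. With hog_cpu1 set, some other task is assumed to
 * call stop_one_cpu() on cpu1 in a loop, so cpu1 always has work pending
 * by the time we look. Returns the number of attempts used, or -1 if
 * max_attempts ran out without queueing anything.
 */
static int model_stop_two_cpus(bool hog_cpu1, int max_attempts)
{
	struct model_stopper s1 = { .pending = false };
	struct model_stopper s2 = { .pending = false };
	int attempt;

	assert(max_attempts > 0);

	for (attempt = 1; attempt <= max_attempts; attempt++) {
		if (hog_cpu1)
			s1.pending = true;

		if (model_queue_two_works(&s1, &s2) == 0)
			return attempt;
		/* The patch would cond_resched() and goto retry here. */
	}
	return -1;
}
```

In this model, model_stop_two_cpus(false, 1000) queues on the first attempt,
while model_stop_two_cpus(true, 1000) never gets past the -EDEADLK check, no
matter how large the attempt budget is.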
> +
> + err = 0;
> + __cpu_stop_queue_work(stopper1, work1);
> + __cpu_stop_queue_work(stopper2, work2);
> +unlock:
> + spin_unlock(&stopper2->lock);
> + spin_unlock_irq(&stopper1->lock);
> +
> + if (unlikely(err == -EDEADLK)) {
> + cond_resched();
> + goto retry;
And this just gives me -rt nightmares.
> + }
> + return err;
> +}

As it is, -rt does horrible things to stop_machine, and I would very
much like to make it such that we don't need to do that.

Now, obviously, stop_cpus() is _BAD_ for -rt, and we try real hard to
make sure that doesn't happen, but stop_one_cpu() and stop_two_cpus()
should not be a problem.

Exclusion between stop_{one,two}_cpu{,s}() and stop_cpus() makes this
trivially go away.

Paul's RCU branch already kills try_stop_cpus() dead, so that wart is
also gone. But we're still stuck with stop_machine_from_inactive_cpu()
which does a spin-wait for exclusive state. So I suppose we'll have to
keep stop_cpus_mutex :/
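
One way to picture the exclusion is a shared/exclusive gate:
stop_{one,two}_cpu() enter shared, stop_cpus() enters exclusive, so the two
can never interleave and the pending comparison with its retry loop has
nothing left to guard against. The sketch below is a userspace model using
C11 atomics, not a proposal for the actual kernel primitive; struct
stop_gate and the gate_*() helpers are hypothetical names. Both entry points
are try-operations, so a caller that cannot sleep could spin on them.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical gate. state == -1: held exclusively (the stop_cpus() side).
 * state == n >= 0: n shared holders (the stop_{one,two}_cpu() side).
 */
struct stop_gate {
	atomic_int state;
};

/* Shared entry succeeds unless the gate is held exclusively. */
static bool gate_try_shared(struct stop_gate *g)
{
	int s = atomic_load(&g->state);

	while (s >= 0) {
		/* On failure the CAS reloads s, and the loop re-checks it. */
		if (atomic_compare_exchange_weak(&g->state, &s, s + 1))
			return true;
	}
	return false;
}

static void gate_release_shared(struct stop_gate *g)
{
	assert(atomic_load(&g->state) > 0);
	atomic_fetch_sub(&g->state, 1);
}

/* Exclusive entry succeeds only when nobody holds the gate at all. */
static bool gate_try_exclusive(struct stop_gate *g)
{
	int expected = 0;

	return atomic_compare_exchange_strong(&g->state, &expected, -1);
}

static void gate_release_exclusive(struct stop_gate *g)
{
	assert(atomic_load(&g->state) == -1);
	atomic_store(&g->state, 0);
}
```

While any shared holder is inside, gate_try_exclusive() fails, and while the
exclusive holder is inside, gate_try_shared() fails. This model does nothing
about fairness or priority inheritance, which is exactly where -rt would
care.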