[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20150626204612.GA14573@redhat.com>
Date: Fri, 26 Jun 2015 22:46:12 +0200
From: Oleg Nesterov <oleg@...hat.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: paulmck@...ux.vnet.ibm.com, tj@...nel.org, mingo@...hat.com,
der.herr@...r.at, dave@...olabs.net, riel@...hat.com,
viro@...IV.linux.org.uk, torvalds@...ux-foundation.org,
linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 0/6] stop_machine: kill stop_cpus_mutex and
stop_cpus_lock
On 06/26, Peter Zijlstra wrote:
>
> On Fri, Jun 26, 2015 at 04:14:55AM +0200, Oleg Nesterov wrote:
> > Not sure.
> >
> > And note that this series kills stop_cpus_mutex, so that multiple
> > stop_cpus()'s / stop_machine()'s can run in parallel if cpumask's
> > do not overlap.
> >
> > Note also the changelog in 6/6, we can simplify + optimize this code
> > a bit more.
> >
> > What do you think?
>
> The problem I have with this is that it makes the better operation
> (stop_two_cpus) slower while improving the worse operation (stop_cpus).
>
> I would much prefer to keep stop_two_cpus() as proposed with taking two
> cpu_stopper::lock instances and replacing the stop_cpu_mutex with a
> percpu-rwsem.
OK, lets avoid cpumask in stop_two_cpus,
int stop_two_cpus(unsigned int cpu1, unsigned int cpu2, cpu_stop_fn_t fn, void *arg)
{
struct multi_stop_data msdata;
struct cpu_stop_done done;
struct cpu_stop_work *work1, *work2;
msdata = (struct multi_stop_data){
.fn = fn,
.data = arg,
.num_threads = 2,
.active_cpus = cpumask_of(cpu1),
};
cpu_stop_init_done(&done, 2);
set_state(&msdata, MULTI_STOP_PREPARE);
if (cpu1 > cpu2)
swap(cpu1, cpu2);
work1 = stop_work_alloc_one(cpu1, true);
work2 = stop_work_alloc_one(cpu1, true);
*work1 = *work2 = (struct cpu_stop_work) {
.fn = multi_cpu_stop,
.arg = &msdata,
.done = &done
};
preempt_disable();
cpu_stop_queue_work(cpu1, work1);
cpu_stop_queue_work(cpu2, work2);
preempt_enable();
wait_for_completion(&done.completion);
stop_work_free_one(cpu1);
stop_work_free_one(cpu2);
wake_up(&stop_work_wq);
return done.executed ? done.ret : -ENOENT;
}
2 cmpxchg()'s vs 2 spin_lock()'s. Plus wake_up(), but we can check
waitqueue_active().
Do you think thi will be noticeably slower?
Of course, if it races with another stop_two_cpus/stop_cpus it will
sleep, but in this case we need to wait anyway.
And I don't think that percpu-rwsem instead of stop_cpu_mutex makes
sense. at least I don't understand how can it help. OK, stop_two_cpus()
can use percpu_down_read() to avoid the deadlock with stop_cpus(), but
you still need double-lock... So I don't think this will make it faster,
this will just penalize stop_cpus(). Or I misunderstood.
So I am still not convinced... But probably I am too biased ;)
Btw. I can't understand the cpu_active() checks in stop_two_cpus().
Do we really need them?
Oleg.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists