Message-ID: <20121210172410.GA28479@redhat.com>
Date: Mon, 10 Dec 2012 18:24:10 +0100
From: Oleg Nesterov <oleg@...hat.com>
To: "Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com>
Cc: tglx@...utronix.de, peterz@...radead.org,
paulmck@...ux.vnet.ibm.com, rusty@...tcorp.com.au,
mingo@...nel.org, akpm@...ux-foundation.org, namhyung@...nel.org,
vincent.guittot@...aro.org, tj@...nel.org, sbw@....edu,
amit.kucheria@...aro.org, rostedt@...dmis.org, rjw@...k.pl,
wangyun@...ux.vnet.ibm.com, xiaoguangrong@...ux.vnet.ibm.com,
nikunj@...ux.vnet.ibm.com, linux-pm@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH v3 1/9] CPU hotplug: Provide APIs to prevent CPU
offline from atomic context
On 12/10, Srivatsa S. Bhat wrote:
>
> On 12/10/2012 01:52 AM, Oleg Nesterov wrote:
> > On 12/10, Srivatsa S. Bhat wrote:
> >>
> >> On 12/10/2012 12:44 AM, Oleg Nesterov wrote:
> >>
> >>> But yes, it is easy to blame somebody else's code ;) And I can't suggest
> >>> something better at least right now. If I understand correctly, we can not
> >>> use, say, synchronize_sched() in _cpu_down() path
> >>
> >> We can't sleep in that code.. so that's a no-go.
> >
> > But we can?
> >
> > Note that I meant _cpu_down(), not get_online_cpus_atomic() or take_cpu_down().
> >
>
> Maybe I'm missing something, but how would it help if we did a
> synchronize_sched() so early (in _cpu_down())? Another bunch of preempt_disable()
> sections could start immediately after our call to synchronize_sched() no?
> How would we deal with that?
Sorry for the confusion. Of course synchronize_sched() alone is not enough.
But we can use it to synchronize with the preempt-disabled sections and avoid
the barriers/atomics in the fast path.
For example,
	bool writer_pending;
	DEFINE_RWLOCK(writer_rwlock);
	DEFINE_PER_CPU(int, reader_ctr);

	void get_online_cpus_atomic(void)
	{
		preempt_disable();

		if (likely(!writer_pending) || __this_cpu_read(reader_ctr)) {
			__this_cpu_inc(reader_ctr);
			return;
		}

		read_lock(&writer_rwlock);
		__this_cpu_inc(reader_ctr);
		read_unlock(&writer_rwlock);
	}

	// lacks release semantics, but we don't care
	void put_online_cpus_atomic(void)
	{
		__this_cpu_dec(reader_ctr);
		preempt_enable();
	}
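Just to illustrate the intended usage on the reader side, a hypothetical
caller (not taken from the patch set, the name/body is only for illustration)
which needs a stable view of the online CPUs from atomic context could do

	// hypothetical reader: IPI every online CPU without racing with
	// CPU offline
	static void ipi_all_online(void)
	{
		int cpu;

		get_online_cpus_atomic();
		for_each_online_cpu(cpu)
			smp_send_reschedule(cpu);	/* CPUs can't go away here */
		put_online_cpus_atomic();
	}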
Now, _cpu_down() does
	writer_pending = true;
	synchronize_sched();
before stop_one_cpu(). When synchronize_sched() returns, we know that
every get_online_cpus_atomic() must see writer_pending == T. And, if
any CPU has incremented its reader_ctr, we must see that it is not zero.
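IOW, on the writer side the sequence could look like this (only a sketch;
error handling is omitted and the argument passed to stop_one_cpu() is just
a placeholder):

	static int _cpu_down(unsigned int cpu, int tasks_frozen)
	{
		int err;
		...
		/* make the pending writer visible to all future readers */
		writer_pending = true;

		/*
		 * wait for every preempt-disabled region which could have
		 * missed the store above; any get_online_cpus_atomic() which
		 * runs after this must see writer_pending == T
		 */
		synchronize_sched();

		/* take_cpu_down() runs on the CPU going down */
		err = stop_one_cpu(cpu, take_cpu_down, &tcd_param);
		...
		return err;
	}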
take_cpu_down() does
	write_lock(&writer_rwlock);

	for_each_online_cpu(cpu) {
		while (per_cpu(reader_ctr, cpu))
			cpu_relax();
	}
and takes the lock.
However, this can lead to the deadlock we already discussed. So
take_cpu_down() should do
	retry:
		write_lock(&writer_rwlock);

		for_each_online_cpu(cpu) {
			if (per_cpu(reader_ctr, cpu)) {
				write_unlock(&writer_rwlock);
				goto retry;
			}
		}
to take the lock. But this is livelockable, and I do not think it is
possible to avoid the livelock.
Just in case: the code above is only for illustration; perhaps it is not
100% correct and perhaps we can do it better. cpu_hotplug.active_writer
is ignored for simplicity; get/put should check current == active_writer.
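For completeness, the current == active_writer check I mean would be
something like this (again only a sketch; it assumes the writer sets
cpu_hotplug.active_writer = current, like cpu_hotplug_begin() does, before
it sets writer_pending):

	void get_online_cpus_atomic(void)
	{
		preempt_disable();

		/* the writer itself must not block on writer_rwlock */
		if (cpu_hotplug.active_writer == current)
			return;

		if (likely(!writer_pending) || __this_cpu_read(reader_ctr)) {
			__this_cpu_inc(reader_ctr);
			return;
		}

		read_lock(&writer_rwlock);
		__this_cpu_inc(reader_ctr);
		read_unlock(&writer_rwlock);
	}

	void put_online_cpus_atomic(void)
	{
		/* the writer did not touch reader_ctr in get */
		if (cpu_hotplug.active_writer != current)
			__this_cpu_dec(reader_ctr);
		preempt_enable();
	}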
Oleg.