Message-ID: <50D61561.7090805@linux.vnet.ibm.com>
Date: Sun, 23 Dec 2012 01:47:37 +0530
From: "Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com>
To: Oleg Nesterov <oleg@...hat.com>
CC: tglx@...utronix.de, peterz@...radead.org,
paulmck@...ux.vnet.ibm.com, rusty@...tcorp.com.au,
mingo@...nel.org, akpm@...ux-foundation.org, namhyung@...nel.org,
vincent.guittot@...aro.org, tj@...nel.org, sbw@....edu,
amit.kucheria@...aro.org, rostedt@...dmis.org, rjw@...k.pl,
wangyun@...ux.vnet.ibm.com, xiaoguangrong@...ux.vnet.ibm.com,
nikunj@...ux.vnet.ibm.com, linux-pm@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH v4 1/9] CPU hotplug: Provide APIs to prevent CPU offline
from atomic context
On 12/20/2012 07:12 PM, Oleg Nesterov wrote:
> On 12/20, Srivatsa S. Bhat wrote:
>>
>> On 12/20/2012 12:44 AM, Oleg Nesterov wrote:
>>>
>>> We need 2 helpers for writer, the 1st one does synchronize_sched() and the
>>> 2nd one takes rwlock. A generic percpu_write_lock() simply calls them both.
>>>
>>
>> Ah, that's the problem no? Users of reader-writer locks expect to run in
>> atomic context (i.e., they don't want to sleep).
>
> Ah, I misunderstood.
>
> Sure, percpu_write_lock() should be might_sleep(), and this is not
> symmetric to percpu_read_lock().
>
>> We can't expose an API that
>> can make the task go to sleep under the covers!
>
> Why? Just this should be documented. However I would not worry until we
> find another user. Until then we do not even need to add percpu_write_lock
> or try to generalize this code too much.
>
>>> To me, the main question is: can we use synchronize_sched() in cpu_down?
>>> It is slow.
>>>
>>
>> Haha :-) So we don't want smp_mb() in the reader,
>
> We need mb() + rmb(). Plus cli/sti unless this arch has optimized
> this_cpu_add() like x86 (as you pointed out).
>
Hey, IIUC, we actually don't need mb() in the reader!! Just an rmb() will do.
This is the reader code I have so far:

#define reader_nested_percpu() \
	(__this_cpu_read(reader_percpu_refcnt) & READER_REFCNT_MASK)

#define writer_active() \
	(__this_cpu_read(writer_signal))

#define READER_PRESENT		(1UL << 16)
#define READER_REFCNT_MASK	(READER_PRESENT - 1)

void get_online_cpus_atomic(void)
{
	preempt_disable();

	/*
	 * First and foremost, make your presence known to the writer.
	 */
	this_cpu_add(reader_percpu_refcnt, READER_PRESENT);

	/*
	 * If we are already using per-cpu refcounts, it is not safe to switch
	 * the synchronization scheme. So continue using the refcounts.
	 */
	if (reader_nested_percpu()) {
		this_cpu_inc(reader_percpu_refcnt);
	} else {
		smp_rmb();
		if (unlikely(writer_active())) {
			... //take hotplug_rwlock
		}
	}

	...

	/* Prevent reordering of any subsequent reads of cpu_online_mask. */
	smp_rmb();
}

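As a side note on the counter layout used above: READER_PRESENT and READER_REFCNT_MASK split one per-CPU word into two fields, so a reader can announce itself and test for nesting with the same variable. The user-space sketch below models only that arithmetic on a plain unsigned long. The model_*() helpers are hypothetical names introduced for illustration; they are not kernel APIs, and the sketch says nothing about memory ordering or about the parts of the function elided with "...". It also assumes, purely for the example, that an outer reader on the same CPU already holds one refcount.

```c
#include <assert.h>

/* Same constants as in the reader code above. */
#define READER_PRESENT      (1UL << 16)
#define READER_REFCNT_MASK  (READER_PRESENT - 1)

/* Hypothetical helpers, for illustration only (not kernel APIs). */

/* Low 16 bits: nesting depth of readers using the per-CPU refcount. */
static unsigned long model_refcnt(unsigned long counter)
{
	return counter & READER_REFCNT_MASK;
}

/* Bits 16 and up: how many readers have announced their presence. */
static unsigned long model_present(unsigned long counter)
{
	return counter >> 16;
}

/* Models this_cpu_add(reader_percpu_refcnt, READER_PRESENT). */
static unsigned long model_announce(unsigned long counter)
{
	return counter + READER_PRESENT;
}

/* Models this_cpu_inc(reader_percpu_refcnt). */
static unsigned long model_inc(unsigned long counter)
{
	return counter + 1;
}

/* Walks through a first-time reader followed by a nested one. */
static void model_self_check(void)
{
	unsigned long c = 0;

	/* First reader: presence is visible, refcount field is untouched. */
	c = model_announce(c);
	assert(model_present(c) == 1);
	assert(model_refcnt(c) == 0);	/* reader_nested_percpu() would be false */

	/* Assumption: this outer reader ends up holding one refcount. */
	c = model_inc(c);

	/* Nested reader on the same CPU: the mask test now sees nesting. */
	c = model_announce(c);
	assert(model_present(c) == 2);
	assert(model_refcnt(c) == 1);	/* reader_nested_percpu() would be true */
}
```

Calling model_self_check() from a small test program exercises both cases. Adding READER_PRESENT never disturbs the low 16 bits (as long as the refcount field itself does not overflow), which is why the nesting test can follow the this_cpu_add() directly.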
The smp_rmb() before writer_active() ensures that LOAD(writer_signal) follows
LOAD(reader_percpu_refcnt) (at the 'if' condition). And in turn, that load is
automatically going to follow the STORE(reader_percpu_refcnt) (at this_cpu_add())
due to the data dependency. So it is something like a transitive relation.

So the result is that we mark ourselves as active in reader_percpu_refcnt before
we check writer_signal. This is exactly what we wanted to do, right?

And luckily, due to the dependency, we can achieve it without using the heavy
smp_mb(). And we can't crib about the smp_rmb() because it is unavoidable anyway
(because we want to prevent reordering of the reads to cpu_online_mask, as you
pointed out earlier).

I hope I'm not missing anything...

Regards,
Srivatsa S. Bhat