Date: Wed, 12 Dec 2012 19:48:49 +0100
From: Oleg Nesterov <oleg@...hat.com>
To: "Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com>
Cc: tglx@...utronix.de, peterz@...radead.org,
paulmck@...ux.vnet.ibm.com, rusty@...tcorp.com.au,
mingo@...nel.org, akpm@...ux-foundation.org, namhyung@...nel.org,
vincent.guittot@...aro.org, tj@...nel.org, sbw@....edu,
amit.kucheria@...aro.org, rostedt@...dmis.org, rjw@...k.pl,
wangyun@...ux.vnet.ibm.com, xiaoguangrong@...ux.vnet.ibm.com,
nikunj@...ux.vnet.ibm.com, linux-pm@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH v4 1/9] CPU hotplug: Provide APIs to prevent CPU
offline from atomic context

On 12/13, Srivatsa S. Bhat wrote:
>
> On 12/12/2012 11:32 PM, Oleg Nesterov wrote:
> > And _perhaps_ get_ can avoid it too?
> >
> > I didn't really try to think, probably this is not right, but can't
> > something like this work?
> >
> > #define XXXX	(1 << 16)
> > #define MASK	(XXXX - 1)
> >
> > void get_online_cpus_atomic(void)
> > {
> > 	preempt_disable();
> >
> > 	// only for writer
> > 	__this_cpu_add(reader_percpu_refcnt, XXXX);
> >
> > 	if (__this_cpu_read(reader_percpu_refcnt) & MASK) {
> > 		__this_cpu_inc(reader_percpu_refcnt);
> > 	} else {
> > 		smp_wmb();
> > 		if (writer_active()) {
> > 			...
> > 		}
> > 	}
> >
> > 	__this_cpu_sub(reader_percpu_refcnt, XXXX);
> > }
> >
> >
>
> Sorry, maybe I'm too blind to see, but I didn't understand the logic
> of how the mask helps us avoid disabling interrupts.

Why do we need cli/sti at all? We should prevent the following race:

	- the writer already holds hotplug_rwlock, so get_ must not
	  succeed.

	- the new reader comes, it increments reader_percpu_refcnt,
	  but before it checks writer_active() ...

	- irq handler does get_online_cpus_atomic() and sees
	  reader_nested_percpu() == true, so it simply increments
	  reader_percpu_refcnt and succeeds.
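
To make the race concrete, here is a minimal single-threaded user-space sketch. It is not the kernel code: every name in it (naive_refcnt, naive_writer_active, naive_irq_pending, naive_get) is a hypothetical stand-in, the "irq" is just a nested function call placed in the window, and the slow path is reduced to backing off and returning.

```c
#include <stdbool.h>

enum naive_path { NAIVE_NONE, NAIVE_NESTED, NAIVE_NEW_READER, NAIVE_SLOW };

/* All names below are hypothetical stand-ins, not kernel symbols. */
static unsigned int naive_refcnt;       /* one CPU's reader_percpu_refcnt */
static bool naive_writer_active;        /* what writer_active() would report */
static bool naive_irq_pending;          /* fire a simulated irq in the window */
static enum naive_path naive_irq_path;  /* path taken by the simulated irq */

static enum naive_path naive_get(void)
{
	if (naive_refcnt != 0) {
		/* Looks like a nested reader: just bump the count. */
		naive_refcnt++;
		return NAIVE_NESTED;
	}

	/* New reader: announce ourselves before looking for a writer. */
	naive_refcnt++;

	/* The race window: an irq arrives before the writer check. */
	if (naive_irq_pending) {
		naive_irq_pending = false;
		naive_irq_path = naive_get();
	}

	if (naive_writer_active) {
		/* Back off; a real slow path would wait for the writer. */
		naive_refcnt--;
		return NAIVE_SLOW;
	}
	return NAIVE_NEW_READER;
}
```

With naive_writer_active and naive_irq_pending both set, the outer call correctly backs off, but the simulated irq sees a non-zero count, takes the nested path and "succeeds" while the writer is active. That is the outcome the list above says must be prevented.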

OTOH, why do we need to increment reader_percpu_refcnt in advance?
To ensure that either we see writer_active() or the writer sees
reader_percpu_refcnt != 0 (and that is why the two sides should
write/read in reverse order: the reader writes the counter and then
checks writer_active(), the writer sets its state and then reads the
counter).
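
The "either we see the writer, or the writer sees us" guarantee can be sketched in user space with C11 atomics. This is only an illustration under assumptions: demo_refcnt and demo_writer_flag are hypothetical stand-ins for one CPU's reader_percpu_refcnt and for the state behind writer_active(), both sides return false instead of blocking, and the sequentially consistent fence is a conservative choice for the sketch rather than a statement about which barrier the patch needs.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins, not kernel symbols. */
static atomic_uint demo_refcnt;       /* models reader_percpu_refcnt */
static atomic_bool demo_writer_flag;  /* models writer_active() */

/* Reader: write the counter first, then read the writer's flag. */
static bool demo_reader_try_enter(void)
{
	atomic_fetch_add_explicit(&demo_refcnt, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);

	if (atomic_load_explicit(&demo_writer_flag, memory_order_relaxed)) {
		/* Writer is active: undo the increment and back off. */
		atomic_fetch_sub_explicit(&demo_refcnt, 1, memory_order_relaxed);
		return false;
	}
	return true;
}

static void demo_reader_exit(void)
{
	atomic_fetch_sub_explicit(&demo_refcnt, 1, memory_order_release);
}

/* Writer: write the flag first, then read the counter (reverse order). */
static bool demo_writer_try_enter(void)
{
	atomic_store_explicit(&demo_writer_flag, true, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);

	if (atomic_load_explicit(&demo_refcnt, memory_order_relaxed) != 0) {
		/* A reader got in first: give up for now. */
		atomic_store_explicit(&demo_writer_flag, false, memory_order_relaxed);
		return false;
	}
	return true;
}
```

Because each side stores its own variable before loading the other's, with a full fence in between, a reader and a writer running at the same time cannot both miss each other.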

The code above tries to avoid this race using the lower 16 bits
as a "nested-counter", and the upper bits to avoid the race with
the writer.

	// only for writer
	__this_cpu_add(reader_percpu_refcnt, XXXX);

If an irq comes and does get_online_cpus_atomic(), it won't be
confused by __this_cpu_add(XXXX): it will check the lower bits and
switch to the "slow path".
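
The same kind of single-threaded user-space model can show why the split counter helps. Again this is a sketch under assumptions, not the proposed kernel code: SPLIT_BIAS and SPLIT_MASK play the role of XXXX and MASK, the other names are hypothetical, the "irq" is a nested call placed right after the bias is added, the slow path is reduced to returning, and the low-bit increment for a new reader with no writer present is my guess at what the elided part would do.

```c
#include <stdbool.h>

#define SPLIT_BIAS (1u << 16)        /* XXXX in the mail */
#define SPLIT_MASK (SPLIT_BIAS - 1)  /* MASK in the mail */

enum split_path { SPLIT_NONE, SPLIT_NESTED, SPLIT_NEW_READER, SPLIT_SLOW };

/* Hypothetical stand-ins, not kernel symbols. */
static unsigned int split_refcnt;       /* one CPU's reader_percpu_refcnt */
static bool split_writer_active;        /* what writer_active() would report */
static bool split_irq_pending;          /* fire a simulated irq in the window */
static enum split_path split_irq_path;  /* path taken by the simulated irq */

static enum split_path split_get(void)
{
	enum split_path path;

	/* Upper bits: make the count non-zero for the writer only. */
	split_refcnt += SPLIT_BIAS;

	/* The race window: an irq arrives before the checks below. */
	if (split_irq_pending) {
		split_irq_pending = false;
		split_irq_path = split_get();
	}

	if (split_refcnt & SPLIT_MASK) {
		/* Lower bits set: a reader really is nested on this CPU. */
		split_refcnt++;
		path = SPLIT_NESTED;
	} else if (split_writer_active) {
		/* Real slow path elided; the caller would wait here. */
		path = SPLIT_SLOW;
	} else {
		/* Assumption: a new reader records itself in the low bits. */
		split_refcnt++;
		path = SPLIT_NEW_READER;
	}

	/* Drop the temporary bias again. */
	split_refcnt -= SPLIT_BIAS;
	return path;
}
```

In this model an irq that lands in the window sees only the bias in the upper bits, finds the lower bits zero, and so checks for the writer itself instead of taking the nested fast path.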

But once again, so far I haven't really tried to think this through.
It is quite possible I missed something.

Oleg.