Message-ID: <20121102161815.GA24256@redhat.com>
Date: Fri, 2 Nov 2012 17:18:15 +0100
From: Oleg Nesterov <oleg@...hat.com>
To: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc: Mikulas Patocka <mpatocka@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Ingo Molnar <mingo@...e.hu>,
Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
Ananth N Mavinakayanahalli <ananth@...ibm.com>,
Anton Arapov <anton@...hat.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/1] percpu_rw_semaphore: reimplement to not block the readers unnecessarily

On 11/01, Oleg Nesterov wrote:
>
> On 11/01, Paul E. McKenney wrote:
> >
> > OK, so it looks to me that this code relies on synchronize_sched()
> > forcing a memory barrier on each CPU executing in the kernel.
>
> No, the patch tries to avoid this assumption, but probably I missed
> something.
>
> > 1. A task running on CPU 0 currently write-holds the lock.
> >
> > 2. CPU 1 is running in the kernel, executing a longer-than-average
> > loop of normal instructions (no atomic instructions or memory
> > barriers).
> >
> > 3. CPU 0 invokes percpu_up_write(), calling up_write(),
> > synchronize_sched(), and finally mutex_unlock().
>
> And my expectation was, this should be enough because ...
>
> > 4. CPU 1 executes percpu_down_read(), which calls update_fast_ctr(),
>
> since update_fast_ctr does preempt_disable/enable it should see all
> modifications done by CPU 0.
>
> IOW. Suppose that the writer (CPU 0) does
>
> percpu_down_write();
> STORE;
> percpu_up_write();
>
> This means
>
> STORE;
> synchronize_sched();
> mutex_unlock();
>
> Now. Do you mean that the next preempt_disable/enable can see the
> result of mutex_unlock() but not STORE?

So far I think this is not possible, so the code doesn't need the
additional wstate/barriers.

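The shape of the question above (can a reader see the unlock but not the
earlier STORE?) can be sketched in userspace C11. This is only an analogy
under stated assumptions, not kernel code: writer_locked, shared_data and
both function names are hypothetical, and a release store / acquire load
pair is swapped in for the synchronize_sched() + mutex_unlock() on the
writer side and the preempt_disable() + mutex_is_locked() check on the
reader side. That pairing is stronger than what the patch relies on; it
only illustrates the ordering the reader needs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins, not kernel code. */
static atomic_bool writer_locked = true; /* models writer_mutex being held */
static int shared_data;                  /* target of the plain STORE */

/* Writer side: STORE, then make the unlock visible. */
static void writer_unlock_side(void)
{
	shared_data = 1;	/* STORE */
	/*
	 * Assumption: a release store stands in for
	 * synchronize_sched() + mutex_unlock(). It orders the STORE
	 * above before the unlock becomes visible.
	 */
	atomic_store_explicit(&writer_locked, false, memory_order_release);
}

/*
 * Reader side: returns -1 if the lock still looks held (slow path),
 * otherwise the value of shared_data it observed.
 */
static int reader_fast_path(void)
{
	if (atomic_load_explicit(&writer_locked, memory_order_acquire))
		return -1;
	return shared_data;
}

/* Single-threaded walk-through of both sides, in program order. */
static void demo_sequential(void)
{
	assert(reader_fast_path() == -1);	/* writer still holds the lock */
	writer_unlock_side();
	assert(reader_fast_path() == 1);	/* unlock seen, so STORE seen */
}
```

With the two sides on different threads, a reader that observes
writer_locked == false is guaranteed by the release/acquire pair to
observe shared_data == 1; whether synchronize_sched() gives the
equivalent guarantee in the kernel is exactly what is being discussed.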
> > +static bool update_fast_ctr(struct percpu_rw_semaphore *brw, int val)
> > +{
> > + bool success = false;
>
> int state;
>
> > +
> > + preempt_disable();
> > + if (likely(!mutex_is_locked(&brw->writer_mutex))) {
>
> state = ACCESS_ONCE(brw->wstate);
> if (likely(!state)) {
>
> > + __this_cpu_add(*brw->fast_read_ctr, val);
> > + success = true;
>
> } else if (state & WSTATE_NEED_MB) {
> __this_cpu_add(*brw->fast_read_ctr, val);
> smp_mb(); /* Order increment against critical section. */
> success = true;
> }
...
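
For reference, the decision logic of the wstate variant quoted above can
be modeled in userspace roughly as follows. This is a sketch under
assumptions: struct brw_model, update_fast_ctr_model() and the barriers
counter are hypothetical, the numeric value of WSTATE_NEED_MB is not shown
in this thread and is picked arbitrarily, a plain long stands in for the
per-CPU counter, and a C11 seq_cst fence stands in for smp_mb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define WSTATE_NEED_MB 1	/* assumed value, for illustration only */

struct brw_model {
	atomic_int wstate;	/* 0: no writer activity */
	long fast_read_ctr;	/* stands in for the per-CPU counter */
	int barriers;		/* how often the NEED_MB branch ran */
};

/* Returns true if the fast path was taken, false for the slow path. */
static bool update_fast_ctr_model(struct brw_model *brw, int val)
{
	int state;

	assert(brw);
	state = atomic_load_explicit(&brw->wstate, memory_order_relaxed);

	if (state == 0) {
		/* No writer: plain fast path, no barrier. */
		brw->fast_read_ctr += val;
		return true;
	}
	if (state & WSTATE_NEED_MB) {
		/* Writer is finishing: fast path, but with a full barrier. */
		brw->fast_read_ctr += val;
		atomic_thread_fence(memory_order_seq_cst);
		brw->barriers++;
		return true;
	}
	/* Any other state: the reader must take the slow path. */
	return false;
}
```

The model only shows which branch runs for a given wstate; it says
nothing about whether a reader can actually observe WSTATE_NEED_MB, which
is the point argued below.
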
> > +void percpu_up_write(struct percpu_rw_semaphore *brw)
> > +{
> > + /* allow the new readers, but only the slow-path */
> > + up_write(&brw->rw_sem);
>
> ACCESS_ONCE(brw->wstate) = WSTATE_NEED_MB;
>
> > +
> > + /* insert the barrier before the next fast-path in down_read */
> > + synchronize_sched();

But update_fast_ctr() should see mutex_is_locked() return true; obviously
down_write() must ensure this.

So update_fast_ctr() can execute the WSTATE_NEED_MB code only if it
races with

> ACCESS_ONCE(brw->wstate) = 0;
>
> > + mutex_unlock(&brw->writer_mutex);

these 2 stores and sees them in reverse order.

I guess that mutex_is_locked() in update_fast_ctr() looks a bit confusing.
It means no-fast-path for the reader; we could use ->state instead.

And even ->writer_mutex should go away if we want to optimize the
write-contended case, but I think this needs another patch on top of
this initial implementation.

Oleg.