Message-ID: <5151BC78.3030306@surriel.com>
Date: Tue, 26 Mar 2013 11:19:20 -0400
From: Rik van Riel <riel@...riel.com>
To: Peter Zijlstra <peterz@...radead.org>
CC: Michel Lespinasse <walken@...gle.com>,
Sasha Levin <sasha.levin@...cle.com>,
torvalds@...ux-foundation.org, davidlohr.bueso@...com,
linux-kernel@...r.kernel.org, akpm@...ux-foundation.org,
hhuang@...hat.com, jason.low2@...com, lwoodman@...hat.com,
chegu_vinod@...com, Dave Jones <davej@...hat.com>,
benisty.e@...il.com, Ingo Molnar <mingo@...hat.com>
Subject: Re: [PATCH -mm -next] ipc,sem: fix lockdep false positive
On 03/26/2013 10:27 AM, Peter Zijlstra wrote:
> On Tue, 2013-03-26 at 06:40 -0700, Michel Lespinasse wrote:
>
>> sem_nsems is user provided as the array size in some semget system
>> call. It's the size of an ipc semaphore array.
>
> So we're basically adding a random (big) number to preempt_count
> (obviously while preemption is disabled), seems rather costly and
> undesirable.
>
>> complex semop operations take the array's lock plus every semaphore's
>> lock; simple semop operations (operating on a single semaphore) only
>> take that one semaphore's lock.
>
> Right, standard global/local lock like stuff. Is there a way we can add
> a r/o test to the 'local' lock operation and avoid doing the above?

That makes me wonder: how did mm_take_all_locks work before we
turned the anon_vma lock into a mutex?

The code used to use spin_lock_nest_lock, but that still has the
potential to overflow the preempt counter. How did that ever work right?

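To make the overflow concern concrete, here is a minimal userspace model, not kernel code. It assumes the preemption nesting count occupies the low 8 bits of the counter and the softirq count the next 8 bits (the PREEMPT_BITS/SOFTIRQ_BITS layout as I understand it for kernels of this era); model_count, model_spin_lock() and nesting_overflows() are hypothetical names used only for this illustration.

```c
#include <assert.h>

/* Assumed layout: bits 0-7 hold preempt-disable nesting, bits 8-15 softirq nesting. */
#define MODEL_PREEMPT_BITS  8
#define MODEL_SOFTIRQ_MASK  (0xffu << MODEL_PREEMPT_BITS)

static unsigned int model_count;	/* stand-in for preempt_count */

/* Each spin_lock() disables preemption, i.e. adds one to the counter. */
static void model_spin_lock(void)   { model_count += 1; }
static void model_spin_unlock(void) { model_count -= 1; }

/*
 * Take nsems nested locks, as a "lock every semaphore" operation would,
 * and report whether the nesting count spilled into the softirq field.
 */
static int nesting_overflows(unsigned int nsems)
{
	unsigned int i;
	int overflowed;

	model_count = 0;
	for (i = 0; i < nsems; i++)
		model_spin_lock();
	overflowed = (model_count & MODEL_SOFTIRQ_MASK) != 0;
	for (i = 0; i < nsems; i++)
		model_spin_unlock();
	return overflowed;
}
```

Under that assumed layout, 255 nested locks still fit in the preemption field, while the 256th carries into the neighbouring field.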
> Maybe something like:
>
> void sma_lock(struct sem_array *sma) /* global */
> {
> 	int i;
>
> 	sma->global_locked = 1;
> 	smp_wmb(); /* can we merge with the LOCK ? */
> 	spin_lock(&sma->global_lock);
>
> 	/* wait for all local locks to go away */
> 	for (i = 0; i < sma->sem_nsems; i++)
> 		spin_unlock_wait(&sma->sem_base[i]->lock);
> }
>
> void sma_lock_one(struct sem_array *sma, int nr) /* local */
> {
> 	smp_rmb(); /* pairs with wmb in sma_lock() */
> 	if (unlikely(sma->global_locked)) { /* wait for global lock */
> 		while (sma->global_locked)
> 			spin_unlock_wait(&sma->global_lock);
> 	}
> 	spin_lock(&sma->sem_base[nr]->lock);
> }

That is essentially a read-only version of the global rwlock that
I originally proposed, where the global (complex) operation takes
the lock for write, and the single-semaphore operation takes the
global lock for read and then one of the semaphore spinlocks.
I could certainly implement and test the above, unless Linus
thinks it's too ugly to live :)
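
A rough userspace model of the flag-based scheme, using C11 atomics in place of the kernel primitives, might look like the sketch below. It is not the code from the quoted mail: the spinlock is a simple test-and-set stand-in, all model_* names and MODEL_NSEMS are hypothetical, the default sequentially consistent atomics replace the explicit barriers, and the local path re-checks the flag after taking its lock. That re-check is an addition of this sketch, meant to cover a local locker that passes the first check just before the flag is set.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_NSEMS 4	/* hypothetical array size */

typedef struct { atomic_bool locked; } model_spinlock;

static void model_lock(model_spinlock *l)
{
	/* test-and-set: spin until the previous value was "unlocked" */
	while (atomic_exchange(&l->locked, true))
		;
}

static void model_unlock(model_spinlock *l)
{
	atomic_store(&l->locked, false);
}

static void model_unlock_wait(model_spinlock *l)
{
	/* wait for the current holder, if any, without taking the lock */
	while (atomic_load(&l->locked))
		;
}

struct model_sem_array {
	atomic_bool global_locked;
	model_spinlock global_lock;
	model_spinlock sem_lock[MODEL_NSEMS];
};

static void model_sma_lock(struct model_sem_array *sma)	/* global */
{
	int i;

	model_lock(&sma->global_lock);
	atomic_store(&sma->global_locked, true);
	/* wait for all local lock holders to go away */
	for (i = 0; i < MODEL_NSEMS; i++)
		model_unlock_wait(&sma->sem_lock[i]);
}

static void model_sma_unlock(struct model_sem_array *sma)
{
	atomic_store(&sma->global_locked, false);
	model_unlock(&sma->global_lock);
}

static void model_sma_lock_one(struct model_sem_array *sma, int nr)	/* local */
{
	for (;;) {
		/* read-only test: no write to shared state on the fast path */
		while (atomic_load(&sma->global_locked))
			;
		model_lock(&sma->sem_lock[nr]);
		if (!atomic_load(&sma->global_locked))
			return;
		/* a global locker arrived in between: back off and retry */
		model_unlock(&sma->sem_lock[nr]);
	}
}

static void model_sma_unlock_one(struct model_sem_array *sma, int nr)
{
	model_unlock(&sma->sem_lock[nr]);
}
```

In this model the global path still spins over all MODEL_NSEMS local locks, but it holds only one lock at a time, so nothing like the nesting depth of "take every semaphore lock" accumulates.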
> This still has the problem of a non-preemptible section of O(sem_nsems)
> (with the avg wait-time on the local lock). Could we make the global
> lock a sleeping lock?
Not without breaking your scheme above :)
I suppose making things into a sleeping lock should be possible,
but that is another major change in this code. I would rather do
things in smaller steps...
--
All rights reversed.