Message-ID: <5151E62E.9000703@surriel.com>
Date: Tue, 26 Mar 2013 14:17:18 -0400
From: Rik van Riel <riel@...riel.com>
To: Sasha Levin <sasha.levin@...cle.com>
CC: Davidlohr Bueso <davidlohr.bueso@...com>,
torvalds@...ux-foundation.org, linux-kernel@...r.kernel.org,
akpm@...ux-foundation.org, hhuang@...hat.com, jason.low2@...com,
walken@...gle.com, lwoodman@...hat.com, chegu_vinod@...com,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Subject: Re: ipc,sem: sysv semaphore scalability
On 03/26/2013 02:07 PM, Sasha Levin wrote:
> On 03/26/2013 01:51 PM, Davidlohr Bueso wrote:
>> On Tue, 2013-03-26 at 13:33 -0400, Sasha Levin wrote:
>>> On 03/20/2013 03:55 PM, Rik van Riel wrote:
>>>> This series makes the sysv semaphore code more scalable,
>>>> by reducing the time the semaphore lock is held, and making
>>>> the locking more scalable for semaphore arrays with multiple
>>>> semaphores.
>>>
>>> Hi Rik,
>>>
>>> Another issue that came up is:
>>>
>>> [ 96.347341] ================================================
>>> [ 96.348085] [ BUG: lock held when returning to user space! ]
>>> [ 96.348834] 3.9.0-rc4-next-20130326-sasha-00011-gbcb2313 #318 Tainted: G W
>>> [ 96.360300] ------------------------------------------------
>>> [ 96.361084] trinity-child9/7583 is leaving the kernel with locks still held!
>>> [ 96.362019] 1 lock held by trinity-child9/7583:
>>> [ 96.362610] #0: (rcu_read_lock){.+.+..}, at: [<ffffffff8192eafb>] SYSC_semtimedop+0x1fb/0xec0
>>>
>>> It seems that we can leave semtimedop without releasing the rcu read lock.
>>>
>>> I'm a bit confused by what's going on in semtimedop with regard to the RCU read lock;
>>> it seems that this behaviour is actually intentional?
>>>
>>> rcu_read_lock();
>>> sma = sem_obtain_object_check(ns, semid);
>>> if (IS_ERR(sma)) {
>>>         if (un)
>>>                 rcu_read_unlock();
>>>         error = PTR_ERR(sma);
>>>         goto out_free;
>>> }
>>>
>>> When I looked at it, it seemed that not releasing the read lock was (very)
>>> intentional.
>>
>> This logic was from the original code, which I also found to be quite
>> confusing.
>
> I wasn't getting this warning with the old code, so there was probably something
> else that triggers this now.
>
>>>
>>> After that, the only code path that would release the lock starts with:
>>>
>>> if (un) {
>>> ...
>>>
>>> So we won't release the lock at all if un is NULL?
>>>
>>
>> Not necessarily, we do release everything at the end of the function:
>>
>> out_unlock_free:
>>         sem_unlock(sma, locknum);
>
> Ow, there's a rcu_read_unlock() in sem_unlock()? This complicates things even
> more, I suspect. If un is non-NULL we'll be unlocking the RCU read lock twice?
It is uglier than you think...
On success, find_alloc_undo will call rcu_read_lock, so we have the
rcu_read_lock held twice :(
Some of the ipc code is quite ugly, but making too many large changes
at once is just asking for trouble. I suspect we're going to have to
untangle this one bit at a time...
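
To make the imbalance easier to see, here is a minimal userspace sketch
(NOT the real ipc/sem.c code; every helper below is a simplified stand-in
that only models where rcu_read_lock/rcu_read_unlock happen on the paths
discussed above):

/*
 * Toy userspace model, NOT kernel code.  It only counts rcu_read_lock()
 * nesting along the semtimedop() paths discussed in this thread; all of
 * the helpers below are simplified stand-ins.
 */
#include <stdio.h>
#include <stdbool.h>

static int rcu_depth;	/* models the RCU read-side nesting level */

static void rcu_read_lock(void)	  { rcu_depth++; }
static void rcu_read_unlock(void) { rcu_depth--; }

/* models find_alloc_undo(): on success it returns with rcu_read_lock held */
static void *find_alloc_undo_model(void)
{
	rcu_read_lock();
	return (void *)1;	/* pretend we found/allocated an undo struct */
}

/* models sem_unlock(): drops the sem lock and one RCU read-side level */
static void sem_unlock_model(void)
{
	rcu_read_unlock();
}

static void semtimedop_model(bool with_undo, bool lookup_fails)
{
	void *un = NULL;

	rcu_depth = 0;

	if (with_undo)
		un = find_alloc_undo_model();	/* one level held here */

	rcu_read_lock();			/* second (or only) level */

	if (lookup_fails) {
		/* the error path quoted above only unlocks when un != NULL */
		if (un)
			rcu_read_unlock();
		goto out_free;
	}

	sem_unlock_model();			/* normal exit: drops one level */
out_free:
	printf("with_undo=%d lookup_fails=%d -> RCU depth on return: %d\n",
	       with_undo, lookup_fails, rcu_depth);
}

int main(void)
{
	semtimedop_model(false, false);	/* balanced */
	semtimedop_model(true,  false);	/* one level still held */
	semtimedop_model(false, true);	/* one level still held */
	semtimedop_model(true,  true);	/* one level still held */
	return 0;
}

Every case except the plain successful one returns with one RCU read-side
level still held, which is consistent with the "1 lock held" lockdep report
quoted above.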
--
All rights reversed.