Message-ID: <5151E62E.9000703@surriel.com>
Date: Tue, 26 Mar 2013 14:17:18 -0400
From: Rik van Riel <riel@...riel.com>
To: Sasha Levin <sasha.levin@...cle.com>
CC: Davidlohr Bueso <davidlohr.bueso@...com>,
torvalds@...ux-foundation.org, linux-kernel@...r.kernel.org,
akpm@...ux-foundation.org, hhuang@...hat.com, jason.low2@...com,
walken@...gle.com, lwoodman@...hat.com, chegu_vinod@...com,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Subject: Re: ipc,sem: sysv semaphore scalability
On 03/26/2013 02:07 PM, Sasha Levin wrote:
> On 03/26/2013 01:51 PM, Davidlohr Bueso wrote:
>> On Tue, 2013-03-26 at 13:33 -0400, Sasha Levin wrote:
>>> On 03/20/2013 03:55 PM, Rik van Riel wrote:
>>>> This series makes the sysv semaphore code more scalable,
>>>> by reducing the time the semaphore lock is held, and making
>>>> the locking more scalable for semaphore arrays with multiple
>>>> semaphores.
>>>
>>> Hi Rik,
>>>
>>> Another issue that came up is:
>>>
>>> [ 96.347341] ================================================
>>> [ 96.348085] [ BUG: lock held when returning to user space! ]
>>> [ 96.348834] 3.9.0-rc4-next-20130326-sasha-00011-gbcb2313 #318 Tainted: G W
>>> [ 96.360300] ------------------------------------------------
>>> [ 96.361084] trinity-child9/7583 is leaving the kernel with locks still held!
>>> [ 96.362019] 1 lock held by trinity-child9/7583:
>>> [ 96.362610] #0: (rcu_read_lock){.+.+..}, at: [<ffffffff8192eafb>] SYSC_semtimedop+0x1fb/0xec0
>>>
>>> It seems that we can leave semtimedop without releasing the rcu read lock.
>>>
>>> I'm a bit confused by what's going on in semtimedop with regard to the RCU read lock;
>>> it seems that this behaviour is actually intentional?
>>>
>>> rcu_read_lock();
>>> sma = sem_obtain_object_check(ns, semid);
>>> if (IS_ERR(sma)) {
>>>         if (un)
>>>                 rcu_read_unlock();
>>>         error = PTR_ERR(sma);
>>>         goto out_free;
>>> }
>>>
>>> When I looked at it, it seemed that not releasing the read lock was (very)
>>> intentional.
>>
>> This logic was from the original code, which I also found to be quite
>> confusing.
>
> I wasn't getting this warning with the old code, so there was probably something
> else that triggers this now.
>
>>>
>>> After that, the only code path that would release the lock starts with:
>>>
>>> if (un) {
>>> ...
>>>
>>> So we won't release the lock at all if un is NULL?
>>>
>>
>> Not necessarily, we do release everything at the end of the function:
>>
>> out_unlock_free:
>>         sem_unlock(sma, locknum);
>
> Ow, there's a rcu_read_unlock() in sem_unlock()? This complicates things even
> more, I suspect. If un is non-NULL we'll be unlocking the RCU read lock twice?
It is uglier than you think...
On success, find_alloc_undo will call rcu_read_lock, so we have the
rcu_read_lock held twice :(
Some of the ipc code is quite ugly, but making too many large changes
at once is just asking for trouble. I suspect we're going to have to
untangle this one bit at a time...
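
To make the imbalance easier to see, here is a minimal userspace sketch
(NOT the real ipc/sem.c code; every helper below is a simplified stand-in
that only models where rcu_read_lock/rcu_read_unlock happen on the paths
discussed above):

/*
 * Toy userspace model, NOT kernel code.  It only counts rcu_read_lock()
 * nesting along the semtimedop() paths discussed in this thread; all of
 * the helpers below are simplified stand-ins.
 */
#include <stdio.h>
#include <stdbool.h>

static int rcu_depth;	/* models the RCU read-side nesting level */

static void rcu_read_lock(void)	  { rcu_depth++; }
static void rcu_read_unlock(void) { rcu_depth--; }

/* models find_alloc_undo(): on success it returns with rcu_read_lock held */
static void *find_alloc_undo_model(void)
{
	rcu_read_lock();
	return (void *)1;	/* pretend we found/allocated an undo struct */
}

/* models sem_unlock(): drops the sem lock and one RCU read-side level */
static void sem_unlock_model(void)
{
	rcu_read_unlock();
}

static void semtimedop_model(bool with_undo, bool lookup_fails)
{
	void *un = NULL;

	rcu_depth = 0;

	if (with_undo)
		un = find_alloc_undo_model();	/* one level held here */

	rcu_read_lock();			/* second (or only) level */

	if (lookup_fails) {
		/* the error path quoted above only unlocks when un != NULL */
		if (un)
			rcu_read_unlock();
		goto out_free;
	}

	sem_unlock_model();			/* normal exit: drops one level */
out_free:
	printf("with_undo=%d lookup_fails=%d -> RCU depth on return: %d\n",
	       with_undo, lookup_fails, rcu_depth);
}

int main(void)
{
	semtimedop_model(false, false);	/* balanced */
	semtimedop_model(true,  false);	/* one level still held */
	semtimedop_model(false, true);	/* one level still held */
	semtimedop_model(true,  true);	/* one level still held */
	return 0;
}

Every case except the plain successful one returns with one RCU read-side
level still held, which is consistent with the "1 lock held" lockdep report
quoted above.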
--
All rights reversed.