linux-kernel - Re: [PATCH 0/2] convert mmap_sem to a scalable rw

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <Pine.LNX.4.64.0705121520210.2101@frodo.shire>
Date:	Sat, 12 May 2007 15:44:19 +0200 (CEST)
From:	Esben Nielsen <nielsen.esben@...glemail.com>
To:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
cc:	Esben Nielsen <nielsen.esben@...glemail.com>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org, Oleg Nesterov <oleg@...sign.ru>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Ingo Molnar <mingo@...e.hu>,
	Thomas Gleixner <tglx@...utronix.de>,
	Nick Piggin <npiggin@...e.de>
Subject: Re: [PATCH 0/2] convert mmap_sem to a scalable rw_mutex

On Sat, 12 May 2007, Peter Zijlstra wrote:

> On Sat, 2007-05-12 at 11:27 +0200, Esben Nielsen wrote:
>>
>> On Fri, 11 May 2007, Peter Zijlstra wrote:
>>
>>>
>>> I was toying with a scalable rw_mutex and found that it gives ~10% reduction in
>>> system time on ebizzy runs (without the MADV_FREE patch).
>>>
>>
>> You break priority enheritance on user space futexes! :-(
>> The problems is that the futex waiter have to take the mmap_sem. And as
>> your rw_mutex isn't PI enabled you get priority inversions :-(
>
> Do note that rwsems have no PI either.
> PI is not a concern for mainline - yet, I do have ideas here though.
>
>
If PI wasn't a concern for mainline, why is PI futexes merged into the 
mainline?

I notice that the rwsems used now isn't priority inversion safe (thus 
destroyingthe perpose of having PI futexes). We thus already have a bug in 
the mainline.

I suggest making a rw_mutex which does read side PI: A reader boosts the 
writer, but a writer can't boost the readers, since there can be a large 
amount of those.

I don't have time to make such a rw_mutex but I have a simple idea for 
one, where the rt_mutex can be reused.

  struct pi_rw_mutex {
       int count; /*  0   -> unoccupied,
                      >0 -> the number of current readers,
                            Second highest bit: there are a waiting writer
                      -1 -> A writer have it. */
       struct rt_mutex mutex;
  }

Use atomic exchange on count.

When locking:

A writer checks if count <= 0. If so it sets the value to -1 and takes
the mutex. When it gets the mutex it rechecks the count and proceeds.
If count > 0 the writer sets the second highest bit and add itself to
the wait-list in the mutex and sleeps. (The mutex will now be in a state 
where owner==NULL but there are waiters. It must be cheched if the 
rt_mutex code can handle this.)

A reader checks if count >= 0. If so it does count++ and proceeds.
If count < 0 it takes the rtmutex. When it gets the mutex it sets the 
count to 1, unlocks the mutex and proceeds.

When unlocking:

The writer sets count to 0 or 0x8000000 (second highest bit) depending 
on how many waiters the mutex have and unlocks the mutex.

The reader checks if count is 0x80000001. If so it sets count to 0 and 
wakes up the first waiter on the mutex (if there are any). Otherwise it 
just do count--.

Esben
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/