linux-kernel - Re: [RFC] [PATCH] Pre-emption control for userspace

Message-ID: <53175D54.1020804@oracle.com>
Date:	Wed, 05 Mar 2014 10:22:28 -0700
From:	Khalid Aziz <khalid.aziz@...cle.com>
To:	Oleg Nesterov <oleg@...hat.com>, Andi Kleen <andi@...stfloor.org>
CC:	Thomas Gleixner <tglx@...utronix.de>,
	One Thousand Gnomes <gnomes@...rguk.ukuu.org.uk>,
	"H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...nel.org>,
	peterz@...radead.org, akpm@...ux-foundation.org,
	viro@...iv.linux.org.uk, linux-kernel@...r.kernel.org
Subject: Re: [RFC] [PATCH] Pre-emption control for userspace

On 03/05/2014 09:36 AM, Oleg Nesterov wrote:
> On 03/05, Andi Kleen wrote:
>>
>> On Wed, Mar 05, 2014 at 03:54:20PM +0100, Oleg Nesterov wrote:
>>> On 03/04, Andi Kleen wrote:
>>>>
>>>> Anything else?
>>>
>>> Well, we have yield_to(). Perhaps sys_yield_to(lock_owner) can help.
>>> Or perhaps sys_futex() can do this if it knows the owner. Don't ask
>>> me what exactly I mean though ;)
>>
>> You mean yield_to() would extend the time slice?
>>
>> That would be the same as the mmap page, just with a syscall right?
>
> Not the same. Very roughly I meant something like
>
> 	my_lock()
> 	{
> 		if (!TRY_LOCK()) {
> 			yield_to(owner);
> 			LOCK();
> 		}
>
> 		owner = gettid();
> 	}
>
> But once again, I am not sure if this makes any sense.
>
> Oleg.
>

The trouble with that approach is that by the time a thread finds out it
cannot acquire the lock because someone else holds it, we have already
paid the price of a context switch. That cost is what I am trying to
avoid. I looked into a few other approaches to solving this problem
without making kernel changes:

- Use the PTHREAD_PRIO_PROTECT protocol to boost the priority of the
thread that holds the lock, to minimize contention and the CPU cycles
other threads waste only to find out someone already has the lock. The
problem I ran into is that the implementation of PTHREAD_PRIO_PROTECT
requires another system call, sched_setscheduler(), inside the library
to boost the priority. Now I have added the overhead of a new system
call, which easily outweighs any performance gain from removing lock
contention. Besides, databases implement their own spinlocks to maximize
performance and thus cannot use PTHREAD_PRIO_PROTECT from the POSIX
threads library.

- I looked into the adaptive spinning futex work Darren Hart was doing.
It looked very promising, but I ran into the same problem again. It
reduces the cost of contention by delaying context switches in cases
where spinning is quicker, but it does nothing to reduce the cost of a
context switch that gives a thread the CPU only for it to find out it
cannot get the lock. This cost again outweighs the 3%-5% benefit we are
seeing from just not giving up the CPU in the middle of a critical
section.
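
To make the spin-then-give-up-the-CPU idea in the second item concrete,
here is a minimal userspace sketch. It is not Darren Hart's futex work
and not code from this patch: the names (spin_yield_lock, sy_*) and the
SPIN_LIMIT value are hypothetical, and sched_yield() merely stands in
for blocking in the kernel. Note what it does not address: if the lock
holder is preempted inside its critical section, waiters still burn
their spin budget and then get scheduled only to find the lock taken.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical tuning knob: polls attempted before giving up the CPU. */
#define SPIN_LIMIT 1000

typedef struct {
	atomic_bool held;	/* true while some thread owns the lock */
} spin_yield_lock;

#define SPIN_YIELD_LOCK_INIT { false }

/* Single attempt; returns true if the caller now owns the lock. */
static inline bool sy_trylock(spin_yield_lock *l)
{
	return !atomic_exchange_explicit(&l->held, true,
					 memory_order_acquire);
}

/* Bounded spin; returns true if the lock was taken within SPIN_LIMIT polls. */
static inline bool sy_spin(spin_yield_lock *l)
{
	for (int i = 0; i < SPIN_LIMIT; i++)
		if (sy_trylock(l))
			return true;
	return false;
}

/* Spin first; when the spin budget runs out, yield the CPU and retry. */
static inline void sy_lock(spin_yield_lock *l)
{
	while (!sy_spin(l))
		sched_yield();	/* stand-in for sleeping in the kernel */
}

static inline void sy_unlock(spin_yield_lock *l)
{
	/* Catch unlocking a lock that nobody holds (debug builds only). */
	assert(atomic_load_explicit(&l->held, memory_order_relaxed));
	atomic_store_explicit(&l->held, false, memory_order_release);
}
```

A caller would wrap its critical section in sy_lock()/sy_unlock(). The
spin phase only pays off when the holder is actually running on another
CPU, which is the case preemption in the middle of a critical section
breaks.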

Makes sense?

--
Khalid