Message-ID: <1435796198.25652.121.camel@stgolabs.net>
Date:	Wed, 01 Jul 2015 17:16:38 -0700
From:	Davidlohr Bueso <dave@...olabs.net>
To:	Thomas Gleixner <tglx@...utronix.de>
Cc:	Ingo Molnar <mingo@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Darren Hart <dvhart@...radead.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Mike Galbraith <umgwanakikbuti@...il.com>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH -tip v2 1/2] locking/rtmutex: Support spin on owner

On Thu, 2015-07-02 at 00:27 +0200, Thomas Gleixner wrote:
> On Wed, 1 Jul 2015, Davidlohr Bueso wrote:
> 
> > Similar to what we have in other locks, particularly regular mutexes, the
> > idea is that as long as the owner is running, there is a fair chance it'll
> > release the lock soon, and thus a task trying to acquire the rtmutex will
> > be better off spinning instead of blocking immediately after the fastpath.
> > Conditions to stop spinning and enter the slowpath are simple:
> > 
> > (1) Upon need_resched()
> > (2) Current lock owner blocks
> >  
> > Because rtmutexes track the lock owner atomically, we can extend the fastpath
> > to continue polling on the lock owner via cmpxchg(lock->owner, NULL, current).
> > 
> > However, this is a conservative approach: if there are any waiters
> > in-line, we stop spinning and immediately take the traditional slowpath. This
> > allows priority boosting to take precedence over spinning, as otherwise we
> > could starve a higher priority queued-up task (ie: top waiter) if spinners
> > constantly steal the lock.
> 
> I'm a bit wary about the whole approach. In the RT tree we spin AFTER
> we've enqueued the waiter and run priority boosting. While I can see
> the charm of your approach, i.e. avoiding the prio boost dance for the
> simple case, this can introduce larger latencies.
> 
> T1 (prio = 0)        T2 (prio = 50)
>  lock(RTM);
>                      lock(RTM);
>                       spin()
> -->preemption
> T3 (prio = 10)        leave spin, because owner is not on cpu
>
>                       enqueue();
>                       boost();
>                       schedule();
> -->preemption
> T1 (prio = 50)
> 
> So we trade two extra context switches in the worst case for an
> enhancement of performance in the normal case. I cannot quantify the
> impact of this, but we really need to evaluate that properly before
> going there.

This is a very good point.

My first thought is that for a general-purpose OS, the extra latency is
probably ok -- if you _really_ rely on rt characteristics enough to care
about this, you should be using the rt patchset to begin with, methinks.
Aside from the performance benefits of not blocking, avoiding the
wait_lock (and the pi_lock, which spinning AFTER the boosting would
require) can ease a lot of the lock contention.
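
For reference, a stripped-down sketch of what the spin side boils down
to (illustrative only, not the patch verbatim; the conservative variant
described above also bails as soon as waiters show up):

/*
 * Illustrative sketch: spin while the lock owner is running, and
 * fall back to the slowpath upon need_resched() or when the owner
 * blocks -- conditions (1) and (2) above.
 */
static bool rt_mutex_spin_on_owner(struct rt_mutex *lock,
				   struct task_struct *owner)
{
	bool ret = true;

	rcu_read_lock();
	while (rt_mutex_owner(lock) == owner) {
		/* compiler barrier: re-read owner state each iteration */
		barrier();
		if (!owner->on_cpu || need_resched()) {
			ret = false;
			break;
		}
		cpu_relax();
	}
	rcu_read_unlock();

	return ret;
}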

> Aside of that, if the lock is really contended, then you force all
> spinners off the cpu if one of the spinners starts blocking, simply
> because you have no idea which one is the top prio spinner.
> 
> T1 (prio = 0)        T2 (prio = 50)     T3 (prio = 10)
>  lock(RTM);
>                      lock(RTM);          lock(RTM);
>                       spin()              spin();
>                                          --> preemption
>                                           enqueue();
>                                           boost();
>                                           schedule();
>                       sees waiter bit
>                       enqueue();
>                       boost();
>                       schedule();
> 
> T2 could happily keep spinning despite T3 going to sleep. I'm not sure
> if that's what we want to achieve.

Yeah, this is one of the reasons why I was tempted to check the
top-waiter prio against current's prio to possibly keep spinning. Now,
doing so without holding the wait_lock is obviously racy, but safe
afaict. If the top waiter changes after the spinner checks it and we
get it wrong, at worst we falsely send that task to sleep; otherwise we
bogusly spin once more -- neither of which is the end of the world. Ie:

[top waiter prio 10]
T1 (prio 0)                                   T2                  T3 (prio 0)
lock(RTM)
 if (rt_mutex_has_waiters(lock) &&                                lock(RTM)
     rt_mutex_top_waiter(lock)->prio > current->prio) {            spin
           ...                                [release lock]
                                              [wakeup top waiter]

                                                                  [adds itself to the tree]
 }

But I could be overlooking something, which is why I chose to exclude it
in this patch.
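
In code, the check would be something along these lines (hypothetical,
hence excluded from the patch; recall that a lower ->prio value means a
higher priority):

/*
 * Hypothetical helper, not part of this patch: keep spinning only
 * while there are no waiters or we outrank the top waiter. Reading
 * the top waiter without the wait_lock is racy, but the worst cases
 * are one bogus extra spin or one false sleep.
 */
static inline bool rt_mutex_may_spin(struct rt_mutex *lock)
{
	if (rt_mutex_has_waiters(lock) &&
	    rt_mutex_top_waiter(lock)->prio <= current->prio)
		return false;	/* top waiter outranks (or ties) us: block */

	return true;
}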

Thanks,
Davidlohr


