Date:	Thu, 2 May 2013 12:35:08 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Alex Shi <alex.shi@...el.com>
Cc:	mingo@...hat.com, tglx@...utronix.de, akpm@...ux-foundation.org,
	arjan@...ux.intel.com, bp@...en8.de, pjt@...gle.com,
	namhyung@...nel.org, efault@....de, morten.rasmussen@....com,
	vincent.guittot@...aro.org, gregkh@...uxfoundation.org,
	preeti@...ux.vnet.ibm.com, viresh.kumar@...aro.org,
	linux-kernel@...r.kernel.org, len.brown@...el.com,
	rafael.j.wysocki@...el.com, jkosina@...e.cz,
	clark.williams@...il.com, tony.luck@...el.com,
	keescook@...omium.org, mgorman@...e.de, riel@...hat.com
Subject: Re: [PATCH v4 0/6] sched: use runnable load based balance

On Thu, May 02, 2013 at 08:38:17AM +0800, Alex Shi wrote:

> On the 3.8 kernel, the first problem commit is 5a505085f043 ("mm/rmap:
> Convert the struct anon_vma::mutex to an rwsem"). It causes a lot of
> imbalance among CPUs on the aim7 benchmark. The reason is here:
> https://lkml.org/lkml/2013/1/29/84.

Hehe, yeah, that one was going to be obvious.. rwsems were known to be
really bad performers: not only did they lack lock stealing, they also
lacked spinning.
 
> But after we use rwsem lock stealing in the 3.9 kernel (commit
> ce6711f3d196f09ca0e), aim7 wakeup no longer has a clear imbalance
> issue, and then aim7 won't need this extra burst wakeup detection.

OK, that seems like a nice fix for rwsems.. one nit:

+               raw_spin_lock_irq(&sem->wait_lock);
+               /* Try to get the writer sem, may steal from the head writer: */
+               if (flags == RWSEM_WAITING_FOR_WRITE)
+                       if (try_get_writer_sem(sem, &waiter)) {
+                               raw_spin_unlock_irq(&sem->wait_lock);
+                               return sem;
+                       }
+               raw_spin_unlock_irq(&sem->wait_lock);
                schedule();

That should probably look like:

	preempt_disable();
	raw_spin_unlock_irq();
	preempt_enable_no_resched();
	schedule();

Otherwise you might find a performance regression on PREEMPT=y kernels.
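
For concreteness, a rough sketch of how the hunk above might read with the
preempt_disable()/preempt_enable_no_resched() pair folded in; this just
combines the two fragments quoted in this mail, it is not a tested change:

	raw_spin_lock_irq(&sem->wait_lock);
	/* Try to get the writer sem, may steal from the head writer: */
	if (flags == RWSEM_WAITING_FOR_WRITE)
		if (try_get_writer_sem(sem, &waiter)) {
			raw_spin_unlock_irq(&sem->wait_lock);
			return sem;
		}
	/*
	 * Keep preemption off across the unlock so the just re-enabled
	 * IRQs cannot preempt us right before the schedule() below.
	 */
	preempt_disable();
	raw_spin_unlock_irq(&sem->wait_lock);
	preempt_enable_no_resched();
	schedule();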

> With PJT's patch, we know that in the first few seconds after wakeup a
> task's runnable load may be nearly zero. The nearly-zero runnable load
> increases the imbalance among CPUs. So there is about an extra 5%
> performance drop when using runnable load to do balancing on aim7 (in
> testing, aim7 forks 2000 threads, then waits for a trigger, then runs
> as fast as possible).
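
(As background: with per-entity load tracking, a task's load contribution
decays geometrically while it sleeps, with y^32 = 1/2, so it roughly halves
every 32 ms; a thread that has been parked waiting for the aim7 trigger for
a while therefore wakes up with a runnable load close to zero. A minimal
userspace illustration of that decay, assuming only the y^32 = 0.5 constant
from PJT's patches:)

	/* Illustration only, not kernel code: geometric decay of a
	 * per-entity runnable average while the task sleeps. */
	#include <math.h>
	#include <stdio.h>

	int main(void)
	{
		double y = pow(0.5, 1.0 / 32.0);  /* per-period (~1 ms) decay */
		double load = 1024.0;             /* full runnable contribution */

		for (int ms = 0; ms <= 256; ms += 32)
			printf("asleep %3d ms -> load ~ %6.1f\n",
			       ms, load * pow(y, ms));
		return 0;
	}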

OK, so what I was asking after is whether you changed the scheduler after
PJT's patches landed to deal with this bulk wakeup. Also, while aim7 might
no longer trigger the bad pattern, what is to say nothing ever will? In
particular, anything using pthread_cond_broadcast() is a known suspect for
bulk wakeups.
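
For reference, the aim7-style pattern in miniature; a hypothetical
userspace sketch (not taken from aim7 itself) where many threads park on
one condition variable and a single pthread_cond_broadcast() makes them
all runnable at the same instant:

	#include <pthread.h>

	#define NTHREADS 64	/* aim7 uses ~2000; kept small here */

	static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
	static pthread_cond_t trigger = PTHREAD_COND_INITIALIZER;
	static int go;

	static void *worker(void *arg)
	{
		pthread_mutex_lock(&lock);
		while (!go)			/* all workers block here... */
			pthread_cond_wait(&trigger, &lock);
		pthread_mutex_unlock(&lock);
		/* ...then run as fast as possible, all woken at once */
		return arg;
	}

	int main(void)
	{
		pthread_t tid[NTHREADS];

		for (int i = 0; i < NTHREADS; i++)
			pthread_create(&tid[i], NULL, worker, NULL);

		pthread_mutex_lock(&lock);
		go = 1;
		pthread_cond_broadcast(&trigger);	/* the bulk wakeup */
		pthread_mutex_unlock(&lock);

		for (int i = 0; i < NTHREADS; i++)
			pthread_join(tid[i], NULL);
		return 0;
	}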

Anyway, I'll go try and make sense of some of the actual patches.. :-)
