linux-kernel - Re: Severe performance regression w/ 4.4+ on Android due to cgroup locking changes

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:	Thu, 14 Jul 2016 14:20:49 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Tejun Heo <tj@...nel.org>
Cc:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	John Stultz <john.stultz@...aro.org>,
	Ingo Molnar <mingo@...hat.com>,
	lkml <linux-kernel@...r.kernel.org>,
	Dmitry Shmidt <dimitrysh@...gle.com>,
	Rom Lemarchand <romlem@...gle.com>,
	Colin Cross <ccross@...gle.com>, Todd Kjos <tkjos@...gle.com>,
	Oleg Nesterov <oleg@...hat.com>
Subject: Re: Severe performance regression w/ 4.4+ on Android due to cgroup
 locking changes

On Thu, Jul 14, 2016 at 08:08:45AM -0400, Tejun Heo wrote:
> On Thu, Jul 14, 2016 at 02:04:28PM +0200, Peter Zijlstra wrote:
> > > I think it probably makes sense to make this the default on !RT at
> > > least with a separate patch w/o stable cc'd.  While most use cases
> > > will be fine with the latency on write path, it also means that the
> > > reader side is blocked for the duration which can hurt.  rwsem implies
> > > a lot more readers and thus more read lock operations than writes.
> > > It's weird to trade off higher latency for lower cpu usage when it
> > > would also slow down all readers.
> > 
> > NAK, no expedited muck by default. There's more than just RT that
> > doesn't like IPI sprays.
> 
> Can you elaborate?  

HPC doesn't use RT but still wants to minimize jitter such that all CPUs
complete their work ASAP. They use barriers to wait on the slowest CPU
to complete work.

Sending random interrupts disturbs cache and other stuff and delays
things unnecessarily.

Same with RDMA (or other) userspace poll loops which want minimal
latency, they too don't use RT, but also very much want to avoid the
kernel poking at them.

Many of this could eventually use NOHZ_FULL, but I'm not sure all of
that is suitable.

In general its very bad form to spray interrupts just because and we've
spend a lot of effort to reduce that.

> If that's the case, we have the wrong implemention
> for percpu-rwsem where very long delays for writers induce the same
> level of delays to all readers.  If expedited by default isn't
> workable, we should move away from rcu_sync for percpu_rwsem.

Just because your usecase doesn't like it, doesn't mean its not good.
Its a perfectly fine implementation for uprobes for example. The
addition/removal of uprobes is extremely rare, as global writers should
be.

And no, the writer delay isn't observed by the readers, those will
continue 'undisturbed' for most of it.