linux-kernel - Re: Severe performance regression w/ 4.4+ on Android due to cgroup locking changes

Message-ID: <20160713203944.GC29670@mtj.duckdns.org>
Date:	Wed, 13 Jul 2016 16:39:44 -0400
From:	Tejun Heo <tj@...nel.org>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	John Stultz <john.stultz@...aro.org>,
	Ingo Molnar <mingo@...hat.com>,
	lkml <linux-kernel@...r.kernel.org>,
	Dmitry Shmidt <dimitrysh@...gle.com>,
	Rom Lemarchand <romlem@...gle.com>,
	Colin Cross <ccross@...gle.com>, Todd Kjos <tkjos@...gle.com>,
	Oleg Nesterov <oleg@...hat.com>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Subject: Re: Severe performance regression w/ 4.4+ on Android due to cgroup
 locking changes

Hello,

On Wed, Jul 13, 2016 at 10:26:57PM +0200, Peter Zijlstra wrote:
> > So, it's a percpu rwsem issue then.  I haven't really followed the
> > percpu rwsem changes closely.  Oleg, are multi-millisecond delays
> > on down write expected with the current implementation of
> > percpu_rwsem?
> 
> There is a synchronize_sched() in there, so sorta. That thing is heavily
> geared towards readers, as is the only 'sane' choice for global locks.

It used to use the expedited variant until 001dac627ff3
("locking/percpu-rwsem: Make use of the rcu_sync infrastructure"), so
it might have been okay before then.

Skewing towards readers is fine, but delays of tens of milliseconds
definitely don't fit some use cases.  There's a balance between CPU
overhead and latency here.  If down writes are infrequent enough, it
doesn't make sense to aggressively trade away latency for lower
processing overhead in those use cases.

The options that I can see are

1. Somehow make percpu_rwsem's write behavior more responsive in a way
   which is acceptable for all use cases.  This would be great but
   probably impossible.

2. Add a fast-writer option to percpu_rwsem so that users which care
   about write latency can opt in for higher processing overhead for
   lower latency.

3. Implement a custom per-cpu locking construct for the particular use
   case.

#3 would inherently be similar to #2 in its behavior.  If #1 isn't
possible, #2 looks like the best course of action.
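
To make the shape of #2 concrete, here is a rough userspace sketch, not
kernel code and not the percpu_rwsem implementation.  Everything in it
is hypothetical: the slot_rwlock type, the fast_writer flag, and the
fixed NSLOTS standing in for per-cpu data.  In the kernel the cost sits
in which grace-period primitive the writer waits on; this sketch only
mimics the trade-off by letting an opted-in writer poll aggressively
instead of sleeping while readers drain.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

#define NSLOTS 4	/* hypothetical stand-in for the number of CPUs */

struct slot_rwlock {
	/* One reader counter per slot, padded to avoid false sharing. */
	struct { _Alignas(64) atomic_int readers; } slot[NSLOTS];
	atomic_bool writer;	/* set while a writer holds or wants the lock */
	bool fast_writer;	/* hypothetical opt-in: burn CPU for lower latency */
};

static void slot_rwlock_init(struct slot_rwlock *l, bool fast_writer)
{
	for (int i = 0; i < NSLOTS; i++)
		atomic_init(&l->slot[i].readers, 0);
	atomic_init(&l->writer, false);
	l->fast_writer = fast_writer;
}

/* Reader path: bump only this slot's counter, then check for a writer. */
static bool slot_read_trylock(struct slot_rwlock *l, int slot)
{
	atomic_fetch_add(&l->slot[slot].readers, 1);
	if (!atomic_load(&l->writer))
		return true;
	/* A writer is active or pending: back off so it can make progress. */
	atomic_fetch_sub(&l->slot[slot].readers, 1);
	return false;
}

static void slot_read_unlock(struct slot_rwlock *l, int slot)
{
	int prev = atomic_fetch_sub(&l->slot[slot].readers, 1);

	assert(prev > 0);
	(void)prev;
}

/* Writer path: claim the flag, then wait for every slot to drain. */
static void slot_write_lock(struct slot_rwlock *l)
{
	bool expected = false;

	/* Only one writer at a time. */
	while (!atomic_compare_exchange_weak(&l->writer, &expected, true)) {
		expected = false;
		sched_yield();
	}

	for (int i = 0; i < NSLOTS; i++) {
		while (atomic_load(&l->slot[i].readers) != 0) {
			if (l->fast_writer) {
				/* Opted in: poll hard, higher CPU cost. */
				sched_yield();
			} else {
				/* Default: sleep 1ms between checks. */
				struct timespec ts = { .tv_sec = 0, .tv_nsec = 1000000 };

				nanosleep(&ts, NULL);
			}
		}
	}
}

static void slot_write_unlock(struct slot_rwlock *l)
{
	atomic_store(&l->writer, false);
}
```

A caller that cares about write latency would initialize the lock with
slot_rwlock_init(&l, true); everyone else passes false and keeps the
cheaper wait.  The reader path is identical either way, which is the
point of making it a per-lock opt-in.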

Thanks.

-- 
tejun