Message-ID: <20160713203944.GC29670@mtj.duckdns.org>
Date:	Wed, 13 Jul 2016 16:39:44 -0400
From:	Tejun Heo <tj@...nel.org>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	John Stultz <john.stultz@...aro.org>,
	Ingo Molnar <mingo@...hat.com>,
	lkml <linux-kernel@...r.kernel.org>,
	Dmitry Shmidt <dimitrysh@...gle.com>,
	Rom Lemarchand <romlem@...gle.com>,
	Colin Cross <ccross@...gle.com>, Todd Kjos <tkjos@...gle.com>,
	Oleg Nesterov <oleg@...hat.com>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Subject: Re: Severe performance regression w/ 4.4+ on Android due to cgroup
 locking changes

Hello,

On Wed, Jul 13, 2016 at 10:26:57PM +0200, Peter Zijlstra wrote:
> > So, it's a percpu rwsem issue then.  I haven't really followed the
> > percpu rwsem changes closely.  Oleg, are multi-millisecond delays
> > on down write expected with the current implementation of
> > percpu_rwsem?
> 
> There is a synchronize_sched() in there, so sorta. That thing is heavily
> geared towards readers, as is the only 'sane' choice for global locks.

It used to use the expedited variant until 001dac627ff3
("locking/percpu-rwsem: Make use of the rcu_sync infrastructure"), so
it might have been okay before then.

Skewing towards readers is fine, but delays of tens of milliseconds
definitely don't fit some use cases.  There's a balance between CPU
overhead and latency here.  For some use cases, if down writes are
infrequent enough, it doesn't make sense to aggressively trade
latency for lower processing overhead.

The options that I can see are

1. Somehow make percpu_rwsem's write behavior more responsive in a way
   which is acceptable for all use cases.  This would be great but
   probably impossible.

2. Add a fast-writer option to percpu_rwsem so that users which care
   about write latency can opt in, accepting higher processing
   overhead in exchange for lower latency.

3. Implement a custom per-cpu locking construct for the particular use
   case.

#3 would inherently be similar to #2 in its behavior.  If #1 isn't
possible, #2 looks like the best course of action.
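
To make the trade-off behind #2 and #3 concrete, here is a minimal
userspace sketch in C11.  It is not the kernel's percpu_rwsem and not
a proposed patch; every name in it (toy_rwsem, TOY_NSLOTS, the toy_*
functions) is hypothetical.  It assumes one counter slot per CPU or
thread, and it makes every reader pay for sequentially consistent
atomics so that the writer only has to set a flag and wait for the
counters to drain, with no grace-period wait on the write side.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_NSLOTS 4	/* hypothetical: one slot per CPU or thread */

struct toy_rwsem {
	atomic_int  readers[TOY_NSLOTS];	/* per-slot reader counts */
	atomic_bool writer;			/* true while a writer holds or waits */
};

static void toy_init(struct toy_rwsem *s)
{
	for (int i = 0; i < TOY_NSLOTS; i++)
		atomic_init(&s->readers[i], 0);
	atomic_init(&s->writer, false);
}

static void toy_read_lock(struct toy_rwsem *s, int slot)
{
	for (;;) {
		/* Announce the reader first, then check for a writer. */
		atomic_fetch_add(&s->readers[slot], 1);
		if (!atomic_load(&s->writer))
			return;
		/* A writer is active: back off and wait for it to finish. */
		atomic_fetch_sub(&s->readers[slot], 1);
		while (atomic_load(&s->writer))
			sched_yield();
	}
}

static void toy_read_unlock(struct toy_rwsem *s, int slot)
{
	int prev = atomic_fetch_sub(&s->readers[slot], 1);

	assert(prev > 0);	/* unlock without a matching lock is a bug */
	(void)prev;
}

static void toy_write_lock(struct toy_rwsem *s)
{
	/* Exclude other writers and stop new readers from entering. */
	while (atomic_exchange(&s->writer, true))
		sched_yield();
	/* Wait for readers that got in before the flag was set. */
	for (int i = 0; i < TOY_NSLOTS; i++)
		while (atomic_load(&s->readers[i]) != 0)
			sched_yield();
}

static void toy_write_unlock(struct toy_rwsem *s)
{
	atomic_store(&s->writer, false);
}
```

The reader increments its slot before checking the writer flag, and
the writer sets the flag before scanning the slots, so with
sequentially consistent operations at least one side always notices
the other.  The cost is a full-strength atomic on every read lock,
which is the "higher processing overhead" a fast-writer mode would
have to accept.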

Thanks.

-- 
tejun
