Date:   Wed, 24 Aug 2016 14:16:52 -0700
From:   John Stultz <john.stultz@...aro.org>
To:     Peter Zijlstra <peterz@...radead.org>, Tejun Heo <tj@...nel.org>,
        Oleg Nesterov <oleg@...hat.com>
Cc:     Om Dhyade <odhyade@...eaurora.org>,
        "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
        Ingo Molnar <mingo@...nel.org>,
        lkml <linux-kernel@...r.kernel.org>,
        Dmitry Shmidt <dimitrysh@...gle.com>,
        Rom Lemarchand <romlem@...gle.com>,
        Colin Cross <ccross@...gle.com>, Todd Kjos <tkjos@...gle.com>
Subject: Re: [PATCH v2] locking/percpu-rwsem: Optimize readers and reduce
 global impact

On Fri, Aug 12, 2016 at 6:44 PM, Om Dhyade <odhyade@...eaurora.org> wrote:
> Update from my tests:
> Use-case: Android application launches.
>
> I tested the patches on an Android N build; I see a max latency of ~7ms.
> In my tests, the wait is due to: copy_process(fork.c) blocks all threads in
> __cgroup_procs_write including threads which are not part of the forking
> process's thread-group.
>
> Dmitry had provided a hack patch which reverts to a per-process rw-sem, which
> had a max latency of ~2ms.
>
> Android's user-space binder library does 2 cgroup write operations per
> transaction; apart from the copy_process(fork.c) wait, I see preemption in
> __cgroup_procs_write causing waits.


Hey Peter, Tejun, Oleg,
  So while your tweaks for the percpu-rwsem have greatly helped the
regression folks were seeing (many thanks, by the way), as noted
above, performance with the global lock is still ~3x slower than on
earlier kernels (though again, much better than the 80x slowdown
that was seen earlier).

So I was wondering if patches to go back to the per signal_struct
locking would still be considered? Or is the global lock approach the
only way forward?

At a higher level, I'm worried that Android's use of cgroups as a
priority enforcement mechanism is at odds with developers focusing on
it as a container enforcement mechanism: in the latter it's not
common for tasks to change between cgroups, but in the former
priority adjustments are quite common.

thanks
-john
