linux-kernel - Re: [PATCH v2] locking/percpu-rwsem: Optimize readers and reduce global impact

Message-ID: <CALAqxLVq53+Sb1t_aGH9ngW5m4FEDeaoidUwhqxzUU=qc_bNBg@mail.gmail.com>
Date:   Wed, 24 Aug 2016 14:16:52 -0700
From:   John Stultz <john.stultz@...aro.org>
To:     Peter Zijlstra <peterz@...radead.org>, Tejun Heo <tj@...nel.org>,
        Oleg Nesterov <oleg@...hat.com>
Cc:     Om Dhyade <odhyade@...eaurora.org>,
        "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
        Ingo Molnar <mingo@...nel.org>,
        lkml <linux-kernel@...r.kernel.org>,
        Dmitry Shmidt <dimitrysh@...gle.com>,
        Rom Lemarchand <romlem@...gle.com>,
        Colin Cross <ccross@...gle.com>, Todd Kjos <tkjos@...gle.com>
Subject: Re: [PATCH v2] locking/percpu-rwsem: Optimize readers and reduce
 global impact

On Fri, Aug 12, 2016 at 6:44 PM, Om Dhyade <odhyade@...eaurora.org> wrote:
> Update from my tests:
> Use-case: Android application launches.
>
> I tested the patches on an Android N build; I see a max latency of ~7ms.
> In my tests, the wait occurs because copy_process (fork.c) blocks all threads
> in __cgroup_procs_write, including threads which are not part of the forking
> process's thread-group.
>
> Dmitry had provided a hack patch which reverts to the per-process rw-sem,
> which had a max latency of ~2ms.
>
> The Android user-space binder library does 2 cgroup write operations per
> transaction. Apart from the copy_process (fork.c) wait, I see preemption in
> __cgroup_procs_write causing waits.

Hey Peter, Tejun, Oleg,
  So while your tweaks for the percpu-rwsem have greatly helped with the
regression folks were seeing (many thanks, by the way), as noted
above, performance with the global lock is still ~3x slower than on
earlier kernels (though again, much better than the 80x slowdown that
was seen earlier).

So I was wondering if patches to go back to the per signal_struct
locking would still be considered? Or is the global lock approach the
only way forward?

At a higher level, I'm worried that Android's use of cgroups as a
priority enforcement mechanism is at odds with developers focusing on
them as a container enforcement mechanism. In the latter case it's not
common for tasks to move between cgroups, but in the former, priority
adjustments (and thus cgroup changes) are quite common.

thanks
-john