Message-ID: <CAMbhsRSi4ZxBVokhZJfPotJhTjDD_SFK8BYPr7sb5J-4UE400g@mail.gmail.com>
Date:	Wed, 13 Jul 2016 13:44:50 -0700
From:	Colin Cross <ccross@...gle.com>
To:	Tejun Heo <tj@...nel.org>
Cc:	John Stultz <john.stultz@...aro.org>,
	Ingo Molnar <mingo@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	lkml <linux-kernel@...r.kernel.org>,
	Dmitry Shmidt <dimitrysh@...gle.com>,
	Rom Lemarchand <romlem@...gle.com>,
	Todd Kjos <tkjos@...gle.com>, Oleg Nesterov <oleg@...hat.com>
Subject: Re: Severe performance regression w/ 4.4+ on Android due to cgroup
 locking changes

On Wed, Jul 13, 2016 at 11:21 AM, Tejun Heo <tj@...nel.org> wrote:
> (cc'ing Oleg)
>
> Hello,
>
> On Tue, Jul 12, 2016 at 05:00:04PM -0700, John Stultz wrote:
>>   So Dmitry Shmidt recently noticed that with 4.4-based systems we're
>> seeing quite a bit of performance overhead from
>> __cgroup_procs_write().
>>
>> With the 4.4 tree as it stands, we're seeing __cgroup_procs_write()
>> quite often take tens of milliseconds to execute (with max times up
>> in the 80ms range).
>
> Yikes, that's pretty high.  Does this happen only while the system is
> generally busy or regardless of overall load?
>
>> While with 4.1 it was quite often in the single-usec range, with max
>> time values still in the sub-millisecond range.
>>
>> The majority of these performance regressions seem to come from the
>> locking changes in:
>>
>> 3014dde762f6 ("cgroup: simplify threadgroup locking")
>> and
>> 1ed1328792ff  ("sched, cgroup: replace signal_struct->group_rwsem with
>> a global percpu_rwsem")
>>
>> Dmitry has found that by reverting these two changes (which don't
>> revert easily), we can get back down to the 10-100 usec range for
>> most calls, with max values occasionally spiking to ~18ms.
>>
>> Those two commits do talk about performance regressions that were
>> supposedly alleviated by percpu_rwsem changes, but I'm not sure we
>> are actually seeing that benefit.
>>
>> In 1ed1328792ff, the commit message describes the write path as a
>> fairly cold path, but with Android I worry this may not actually be
>> the case: Android uses cpuset cgroups to group tasks into foreground
>> and background groups, which means that when switching applications,
>> tasks are migrated between cgroups. Putting an additional 80
>> millisecond delay on this adds potentially visible latency when
>> switching applications.
>
> Switching between foreground and background isn't a hot path.  These
> are human-initiated operations, after all.  Taking 80 msecs certainly
> is problematic, but I'm skeptical that this is from actual contention,
> given that the only reader-side holders are the fork and exit paths.

Slight correction here: we move tasks into the foreground cgroup and
back around binder IPC calls from foreground processes into background
processes, so it is significantly hotter than just human-initiated
operations.
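
For context on why that path is hit so often: each such move is just a
write of the thread id into the destination cpuset's tasks file, which
lands in __cgroup_procs_write() and takes the writer side of the global
percpu_rwsem.  A rough userspace sketch of what happens around each such
IPC, assuming the usual Android /dev/cpuset mount with foreground and
background directories (the paths and the helper name below are
illustrative, not taken from the actual framework code):

	/* Illustrative only: roughly what the framework does around a
	 * binder call from a foreground process into a background
	 * process.  The /dev/cpuset layout and the helper name are
	 * assumptions about the typical Android setup. */
	#include <fcntl.h>
	#include <stdio.h>
	#include <sys/types.h>
	#include <unistd.h>

	static int move_tid_to_cpuset(const char *group, pid_t tid)
	{
		char path[64], buf[16];
		int fd, len, ret;

		snprintf(path, sizeof(path), "/dev/cpuset/%s/tasks", group);
		fd = open(path, O_WRONLY);
		if (fd < 0)
			return -1;

		len = snprintf(buf, sizeof(buf), "%d", (int)tid);
		/* This write is what ends up in __cgroup_procs_write()
		 * and takes the writer side of cgroup_threadgroup_rwsem
		 * on 4.4. */
		ret = (write(fd, buf, len) == len) ? 0 : -1;
		close(fd);
		return ret;
	}

	/*
	 * Around each binder transaction into a background process,
	 * roughly:
	 *
	 *	move_tid_to_cpuset("foreground", worker_tid);
	 *	... handle the transaction ...
	 *	move_tid_to_cpuset("background", worker_tid);
	 *
	 * so the writer side is taken per IPC, not once per app switch.
	 */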

>> Reverting those two changes in the Android common.git tree doesn't
>> feel like a good long-term solution, so I was wondering if you had
>> any thoughts on how to further reduce the performance regression
>> here?
>
> One interesting thing to try would be replacing it with a regular
> non-percpu rwsem and seeing how it behaves.  That should easily tell
> us whether this is from actual contention or an artifact of the
> percpu_rwsem implementation.
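
For concreteness, here is an untested sketch of what I understand you
are suggesting: keep the cgroup_threadgroup_change_begin()/end()
wrappers as they are in 4.4, but back them with a plain rw_semaphore.
Names and exact file locations below are approximate; the point is only
to separate real reader/writer contention from the percpu_rwsem
writer-side cost of pushing readers off the fast path.

	/* Untested sketch of the suggested experiment, not a proposed
	 * fix: the 4.4 wrapper API backed by a plain rwsem instead of
	 * the global percpu_rwsem. */
	#include <linux/rwsem.h>
	#include <linux/sched.h>

	static DECLARE_RWSEM(cgroup_threadgroup_plain_rwsem);

	static inline void cgroup_threadgroup_change_begin(struct task_struct *tsk)
	{
		down_read(&cgroup_threadgroup_plain_rwsem);
	}

	static inline void cgroup_threadgroup_change_end(struct task_struct *tsk)
	{
		up_read(&cgroup_threadgroup_plain_rwsem);
	}

	/*
	 * ...and in __cgroup_procs_write(), take down_write()/up_write()
	 * on the same rwsem where it currently does percpu_down_write()/
	 * percpu_up_write() on cgroup_threadgroup_rwsem.
	 */

If the latency largely disappears with this, that would point at the
percpu_rwsem writer-side overhead (waiting out a grace period to switch
readers to the slow path) rather than genuine contention with the fork
and exit paths.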
>
> Thanks.
>
> --
> tejun
