Message-ID: <CAD=GYpbHjcsuQSnNPL4SCv-r=Me6oiH7dJ98a64udbakWLaUjQ@mail.gmail.com>
Date:	Wed, 16 Mar 2016 16:22:17 -0700
From:	Joel Fernandes <agnel.joel@...il.com>
To:	linux-rt-users@...r.kernel.org,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	kernelnewbies <kernelnewbies@...linux.org>
Cc:	Steven Rostedt <rostedt@...dmis.org>,
	Ingo Molnar <mingo@...hat.com>,
	Greg Kroah-Hartman <greg@...ah.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Thomas Gleixner <tglx@...utronix.de>
Subject: RFC on fixing mutex spinning on owner

Hi,

On a fairly recent kernel with an Android userspace, I am seeing that
the i915 driver is in a spin loop waiting for a mutex owner to release
the mutex (mutex_spin_on_owner). I believe this happens because the
owner of the mutex is running on another CPU, and the expectation is
that the owner will release the mutex or go to sleep soon. So if we
fail to acquire the mutex, we avoid sleeping and instead keep spinning
and retrying, much like a spinlock (with preemption disabled
throughout the spinning).

My question is: what if the owner cannot or does not want to sleep,
and keeps running for a while with the mutex held? (Let's also assume
that all other tasks on the mutex owner's CPU are sleeping, so it's
not preempted.)

In this case, does it make sense to time out the spinning after a
while? Preemption is disabled during the spinning, so a long spin
seems like a very bad thing.

Should the code holding the mutex and running (the owner) be fixed to
not hold the mutex for so long? Or would a patch introducing a timeout
of a certain threshold on the spinning be welcomed?
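
To make the timeout idea concrete, here is a minimal userspace sketch
in C11. It is not the kernel's mutex code and not a proposed patch:
struct demo_mutex, demo_trylock(), demo_spin_lock_bounded() and
demo_unlock() are hypothetical names, and the budget is counted in
spin iterations rather than in time, purely to keep the example
simple. Userspace also cannot disable preemption, so that part is not
modelled. When the budget runs out, the caller would be expected to
fall back to a sleeping wait.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace lock; NOT the kernel's struct mutex. */
struct demo_mutex {
	atomic_int owner;	/* 0 = unlocked, otherwise the owner's id */
};

/* One acquisition attempt: succeeds only if the lock is currently free. */
static bool demo_trylock(struct demo_mutex *m, int self)
{
	int expected = 0;

	return atomic_compare_exchange_strong(&m->owner, &expected, self);
}

/*
 * Try to take the lock up to max_spins times.
 * Returns true if the lock was acquired, false if the spin budget ran
 * out, in which case the caller should stop spinning and sleep instead.
 * (Assumption: an iteration count stands in for a time threshold.)
 */
static bool demo_spin_lock_bounded(struct demo_mutex *m, int self,
				   unsigned long max_spins)
{
	unsigned long i;

	for (i = 0; i < max_spins; i++) {
		if (demo_trylock(m, self))
			return true;
	}
	return false;
}

/* Release the lock; only the current owner may call this. */
static void demo_unlock(struct demo_mutex *m, int self)
{
	assert(atomic_load(&m->owner) == self);
	atomic_store(&m->owner, 0);
}
```

With such a bound, a waiter gives up after a fixed amount of spinning
no matter how long the owner keeps running, at the cost of a sleep and
wakeup in cases where the owner would have released the lock shortly
afterwards.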

To give numbers, I am seeing spinning for as long as 20 ms in the
worst case, while the mutex owner holds the mutex for 22 ms. This
sets off the ftrace preemptoff tracer.

Thanks for any advice on what the right fix for the problem should be.

Best,
Joel
