lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.LFD.2.00.1102190028400.2701@localhost6.localdomain6>
Date:	Sat, 19 Feb 2011 01:27:04 +0100 (CET)
From:	Thomas Gleixner <tglx@...utronix.de>
To:	Uwe Kleine-König 
	<u.kleine-koenig@...gutronix.de>
cc:	David Brownell <dbrownell@...rs.sourceforge.net>,
	Grant Likely <grant.likely@...retlab.ca>,
	LKML <linux-kernel@...r.kernel.org>,
	Wolfram Sang <w.sang@...gutronix.de>,
	Peter Zijlstra <peterz@...radead.org>
Subject: Re: Deadlock in gpiolib

On Fri, 18 Feb 2011, Uwe Kleine-König wrote:

Brilliant move to not CC the ones who explained you in detail why this
lockdep splat triggers and why it can lead to a real deadlock.

<SNIP>
 
> I tried to wrap my head around all that sysfs stuff and the implied
> locking, but I failed.

You did not even think about providing the information about the
already decoded problem and instead you post your findings as
something completely new and unexplainable.

Dude, that sucks and I'm seriously grumpy about that. As you are too
tired, I sat down and retrieved from the IRC logs what avoids people
to twist their brain around that clusterf*ck over and over.

<peterz>  
	 well, if you, like mentioned, assume sysfs_get/put_active() is a lock,
	 then what it did is: sysfs_get_active_lock(); mutex_lock(sysfs_lock);
	 vs mutex_lock(sysfs_lock); sysfs_deactivate();
	 where, sysfs_deactivate() can be considered to also want a ref
	 which gets you AB-BA

<tglx>
	export does
	lock(sysfs_lock) -> create sysfs entries 
	read does -> "read-lock(sysfs-entry)" -> lock(sysfs_lock)
	unexport does
	lock(sysfs_lock) -> remove sysfs entries, which needs to read_lock them 
	and looking at the implementation it can actually deadlock
	assume one reader vs one unexporter
	cat gpio74/value is in progress while on the other side you do echo 74 >unexport
	then the unexport will hang in wait for completion
	while the reader is waiting for sysfs lock in gpio

There is more info in those logs, but i'm too tired to extract it, so
here is my tentative conclusion in clear text:

sysfs_lock in gpio is only useful to serialize export
vs. unexport. The comment above the lock declaration 

/* lock protects against unexport_gpio() being called while
 * sysfs files are active.
 */

is stupid.

The only real concern is a concurrent instantiation of that sysfs
export vs. a teardown. That might be eventually be handled by the
EXPORT bit in the flags of that gpio chip, but I have not had time to
look at it closely.

Taking the gpio:sysfs_lock inside of the particular read/write
functions is pointless. If sysfs is not preventing the teardown of
that entry while access in progress then sysfs needs fixing.

AFAICT the refcounting in sysfs (which leads to that lockdep splat)
does the right thing, so taking gpiolib:sysfs_lock inside those
read/write functions which are associated with an exported gpio is
simply wrong.

I leave the deeper analysis to the gpio crowd and sysfs experts, but I
would be surprised if I'm wrong.

Thanks,

	tglx

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ