lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.DEB.2.10.1406270027300.5170@nanos>
Date:	Fri, 27 Jun 2014 00:35:09 +0200 (CEST)
From:	Thomas Gleixner <tglx@...utronix.de>
To:	Austin Schuh <austin@...oton-tech.com>
cc:	Richard Weinberger <richard.weinberger@...il.com>,
	Mike Galbraith <umgwanakikbuti@...il.com>,
	LKML <linux-kernel@...r.kernel.org>,
	rt-users <linux-rt-users@...r.kernel.org>,
	Steven Rostedt <rostedt@...dmis.org>
Subject: Re: Filesystem lockup with CONFIG_PREEMPT_RT

On Thu, 26 Jun 2014, Austin Schuh wrote:
> On Wed, May 21, 2014 at 12:33 AM, Richard Weinberger
> <richard.weinberger@...il.com> wrote:
> > CC'ing RT folks
> >
> > On Wed, May 21, 2014 at 8:23 AM, Austin Schuh <austin@...oton-tech.com> wrote:
> >> On Tue, May 13, 2014 at 7:29 PM, Austin Schuh <austin@...oton-tech.com> wrote:
> >>> Hi,
> >>>
> >>> I am observing a filesystem lockup with XFS on a CONFIG_PREEMPT_RT
> >>> patched kernel.  I have currently only triggered it using dpkg.  Dave
> >>> Chinner on the XFS mailing list suggested that it was a rt-kernel
> >>> workqueue issue as opposed to a XFS problem after looking at the
> >>> kernel messages.
> 
> I've got a 100% reproducible test case that doesn't involve a
> filesystem.  I wrote a module that triggers the bug when the device is
> written to, making it easy to enable tracing during the event and
> capture everything.
> 
> It looks like rw_semaphores don't trigger wq_worker_sleeping to run
> when work goes to sleep on a rw_semaphore.  This only happens with the
> RT patches, not with the mainline kernel.  I'm foreseeing a second
> deadlock/bug coming into play shortly.  If a task holding the work
> pool spinlock gets preempted, and we need to schedule more work from
> another worker thread which was just blocked by a mutex, we'll then
> end up trying to go to sleep on 2 locks at once.

I remember vaguely, that I've seen and analyzed that quite some time
ago. I can't page in all the gory details right now, but I have a look
how the related code changed in the last couple of years tomorrow
morning with an awake brain.

Steven, you did some analysis on that IIRC, or was that just related
to rw_locks?

Thanks,

	tglx




--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ