Message-Id: <1248697004.7279.31.camel@fnki-nb00130>
Date:	Mon, 27 Jul 2009 14:16:44 +0200
From:	Jens Rosenboom <jens@...one.net>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Sonny Rao <sonnyrao@...ibm.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Ingo Molnar <mingo@...e.hu>,
	Thomas Gleixner <tglx@...utronix.de>
Subject: Re: futexes: Still infinite loop in get_futex_key() in 2.6.31-rc4

On Mon, 2009-07-27 at 13:31 +0200, Peter Zijlstra wrote:
> On Mon, 2009-07-27 at 10:00 +0200, Jens Rosenboom wrote:
> > We have a problem with infinitely running processes on kernels at least
> > since 2.6.29.4. It happens on a loaded machine after running for a
> > couple of days,
> 
> What kinds of machine, i386? Could you please enable
> CONFIG_FRAME_POINTER, these backtraces are quite mangled.

i686, or an AMD dual-core Opteron to be exact. CONFIG_FRAME_POINTER is
enabled; the complete kernel config is attached. Maybe some other
debugging options are needed? I copied just the part pertaining to the
stuck process, so maybe the complete log has the parts you are missing.

> >  that a "ps ax" seems to get stuck in get_futex_key while
> > exiting. Sadly your patch 
> 
> Whose patch, and which patch? 7c8fa4f04ab956076605422d5ed37410893a8a73?
> That was only regarding huge pages.

Yes, that is the one I was talking about and the commit message seemed
to match what I was seeing here.

> The only loop in get_futex_key() appears to be the one around
> get_user_pages_fast(), and I'm not quite sure how that could get stuck
> like this.
> 
> Could it be glibc loops on futex_wake() returning -EFAULT?

How would I be able to check that?
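
For reference, here is a minimal user-space sketch of what a futex wake
failing with EFAULT looks like. It is not taken from glibc or from the
kernel source; the helper names futex_wake_result() and unmapped_address()
are made up for illustration, and it assumes a Linux system whose headers
provide SYS_futex and FUTEX_WAKE. It only shows the error code a caller
would see, not whether glibc actually loops on it.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: issue a shared (non-private) FUTEX_WAKE on addr.
 * Returns the number of waiters woken (0 or more) on success,
 * or the negated errno value on failure.
 */
static long futex_wake_result(uint32_t *addr)
{
    long rc = syscall(SYS_futex, addr, FUTEX_WAKE, 1, NULL, NULL, 0);
    return rc < 0 ? -errno : rc;
}

/*
 * Hypothetical helper: map one anonymous page and unmap it again, so the
 * returned page-aligned address is (very likely) no longer backed by
 * anything. Returns NULL if the mapping could not be created.
 */
static uint32_t *unmapped_address(void)
{
    long page = sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    munmap(p, (size_t)page);
    return p;
}
```

Calling futex_wake_result() on an ordinary variable with no waiters should
give 0, while calling it on the pointer from unmapped_address() should give
-EFAULT, since the kernel cannot resolve a page for the futex key. A caller
that blindly retries on that error would spin.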

View attachment "config" of type "text/x-mpsub" (71843 bytes)

Download attachment "pstrace1.txt.gz" of type "application/x-gzip" (23657 bytes)
