Message-ID: <4E1DFB4E.7050107@xsdg.org>
Date:	Wed, 13 Jul 2011 20:08:46 +0000
From:	Omari Stephens <xsdg@...g.org>
To:	linux-kernel@...r.kernel.org
Subject: In-kernel deadlock of some sort with 2.6.39.2

Please CC me on responses, since I'm not on lkml.

### Short version:
Under 2.6.39.2, one of my machines regularly gets into a state where 
processes end up in uninterruptible waits that never end.  One peculiar 
thing that happens is that attempts to stat(1) or read certain files 
from procfs never return.

I am pretty familiar with compiling and running my own kernels, but not 
so familiar with troubleshooting when non-obvious things go wrong.  Any 
suggestions would be appreciated, even if it's "we might've fixed 
something related in version XYZ, try that one".

I've uploaded my config here:
http://web.mit.edu/~xsdg/Public/stuff/kernel/broken_2.6.39.2_config.txt


### Detailed version:
On one of my machines, I recently compiled and installed 2.6.39.2 
alongside a switch from the nv driver to nouveau.  This was specifically 
to solve an issue where FF7 nightly would cause high CPU usage in X just 
by virtue of painting the screen.

The upgrade did fix my X issues: FF7 is as smooth as could be hoped on 
this machine.  But now FF periodically (but repeatably, after a reboot) 
stops responding.  According to top, the system is about 94% IO-wait:
Cpu0  :  3.7%us,  2.4%sy,  0.0%ni,  0.0%id, 93.9%wa,  0.0%hi,  0.0%si,  0.0%st
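
A high IO-wait figure usually goes along with tasks sitting in 
uninterruptible sleep (state `D`).  The following is a minimal sketch, 
not part of the original report, for listing such tasks without going 
through `ps`.  It reads only `/proc/<pid>/stat`; the assumption here is 
that this file stays readable while `cmdline` and `environ` block, and 
if that does not hold on the affected machine the script would hang as 
well.  The function names are hypothetical.

```python
import os


def parse_stat(text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    open_paren = text.index("(")
    close_paren = text.rindex(")")
    pid = int(text[:open_paren])
    comm = text[open_paren + 1:close_paren]
    state = text[close_paren + 1:].split()[0]
    return pid, comm, state


def uninterruptible_tasks(proc_root="/proc"):
    """Yield (pid, comm) for every process currently in state 'D'."""
    for name in sorted(os.listdir(proc_root)):
        if not name.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, name, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError):
            continue  # process exited meanwhile, or unexpected format
        if state == "D":
            yield pid, comm


if __name__ == "__main__":
    for pid, comm in uninterruptible_tasks():
        print(pid, comm)
```

Running it a few times in a row helps separate tasks that are briefly 
waiting on disk from ones that never leave state `D`.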

Oddly, I noticed that running `ps` would hang uninterruptibly.  After 
some further debugging, I discovered that attempting to stat (not even 
read) certain files in procfs will never return.  For instance:

19:36:38> [xsdg{perl}@...oc/4950]
$find | sort | xargs stat
[...]
   File: `./environ'
   Size: 0         	Blocks: 0          IO Block: 1024   regular empty file
Device: 3h/3d	Inode: 6413606     Links: 1
Access: (0400/-r--------)  Uid: ( 1000/    xsdg)   Gid: ( 1000/    xsdg)
Access: 2011-07-13 19:26:15.829482661 +0000
Modify: 2011-07-13 19:26:15.829482661 +0000
Change: 2011-07-13 19:26:15.829482661 +0000
[sits here indefinitely]

By the magical powers of deduction:
19:36:50> [xsdg{perl}@...oc/4950]
$l exe
[sits here indefinitely]

Oddly, I can stat cmdline with no issues, but if I try to _read_ it, 
then it blocks.  As you might imagine, I have no idea what process 4950 is.
19:56:16> [xsdg{perl}@...oc/4950]
$stat cmdline
   File: `cmdline'
   Size: 0         	Blocks: 0          IO Block: 1024   regular empty file
Device: 3h/3d	Inode: 3553148     Links: 1
Access: (0444/-r--r--r--)  Uid: ( 1000/    xsdg)   Gid: ( 1000/    xsdg)
Access: 2011-07-12 18:13:35.481767937 +0000
Modify: 2011-07-12 18:13:35.481767937 +0000
Change: 2011-07-12 18:13:35.481767937 +0000

19:56:18> [xsdg{perl}@...oc/4950]
$cat cmdline
[sits here indefinitely]
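
Probing procfs by hand like this costs one stuck shell per hung file.  
A minimal sketch of a safer probe, not part of the original report: do 
the stat or read in a child process and give up after a timeout.  The 
names `probe` and `_CHILD` are hypothetical, and the two-second default 
is an arbitrary choice.  `Popen` is used directly because 
`subprocess.run()` waits for the child after killing it, and a child 
stuck in uninterruptible sleep would never be reaped.

```python
import subprocess
import sys

# Code run by the child: lstat the path, or read it fully, depending on argv.
_CHILD = (
    "import os, sys\n"
    "op, path = sys.argv[1], sys.argv[2]\n"
    "if op == 'stat':\n"
    "    os.lstat(path)\n"
    "else:\n"
    "    with open(path, 'rb') as f:\n"
    "        f.read()\n"
)


def probe(op, path, timeout=2.0):
    """Return 'ok', 'error', or 'timeout' for a 'stat' or 'read' of path."""
    child = subprocess.Popen(
        [sys.executable, "-c", _CHILD, op, path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        rc = child.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # A child in uninterruptible sleep ignores SIGKILL until the kernel
        # call returns, so send the signal but do not wait for the child.
        child.kill()
        return "timeout"
    return "ok" if rc == 0 else "error"


if __name__ == "__main__":
    # Usage: python probe.py /proc/4950/cmdline /proc/4950/exe
    for path in sys.argv[1:]:
        print(path, "stat:", probe("stat", path), "read:", probe("read", path))
```

Each timed-out probe may leave one more stuck process behind, so it is 
worth keeping the list of probed paths short.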

--xsdg
   http://blog.doppler-photo.net/