Message-ID: <4B13FBE7.9040304@krogh.cc>
Date:	Mon, 30 Nov 2009 18:07:51 +0100
From:	Jesper Krogh <jesper@...gh.cc>
To:	linux-kernel@...r.kernel.org, linux-nfs@...r.kernel.org
Subject: 2.6.31.6, unresponsiveness and something with nfs

Hi.

I have a system running 2.6.31.6 that becomes "unresponsive" while a
particular process is running. I cannot really tell what the cause is,
but the effect is that logins as ordinary users hang when the user's
home directory is on a remote NFS server.

So from root, "su - localuser" works fine, but
"su - user-with-home-on-nfs" doesn't.

It is not as if NIS/NFS doesn't work, since I can get a directory listing
from the NFS share as root without problems.
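One way to check whether a given path on the NFS mount answers at all
during the stall is to stat it with a time limit. The following is only
a minimal sketch: the function name probe_stat, the example path and the
5 second timeout are assumptions for illustration, not part of the
original report. Note that a worker thread stuck in the kernel cannot be
cancelled; the sketch simply stops waiting for it.

```python
import os
import threading


def probe_stat(path, timeout=5.0):
    """Return 'ok', 'error: <reason>' or 'timeout' for os.stat(path)."""
    result = {}

    def worker():
        try:
            os.stat(path)
            result["status"] = "ok"
        except OSError as exc:
            result["status"] = f"error: {exc.strerror}"

    # Daemon thread: a stat() that never returns must not keep the
    # interpreter waiting for it at exit.
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout)
    # If the worker has not finished, nothing was stored in result.
    return result.get("status", "timeout")


if __name__ == "__main__":
    # Hypothetical path; substitute a home directory on the NFS mount.
    print(probe_stat("/home/user-with-home-on-nfs"))
```

Running it once as root and once as the affected user, while the stall
is in progress, would show whether the two cases really differ.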

Here are the last 10 lines from "strace -f su -
user-with-home-on-nfs"; it gets into an uninterruptible hang:

[pid 24599] close(3)                    = 0
[pid 24599] open("/etc/localtime", O_RDONLY) = 3
[pid 24599] fstat(3, {st_mode=S_IFREG|0644, st_size=2134, ...}) = 0
[pid 24599] fstat(3, {st_mode=S_IFREG|0644, st_size=2134, ...}) = 0
[pid 24599] mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f6b0c5b2000
[pid 24599] read(3, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\6\0\0\0\6\0\0"..., 4096) = 2134
[pid 24599] lseek(3, -1368, SEEK_CUR)   = 766
[pid 24599] read(3, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\10\0\0\0\10\0"..., 4096) = 1368
[pid 24599] close(3)                    = 0
[pid 24599] munmap(0x7f6b0c5b2000, 4096) = 0
[pid 24599] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2134, ...}) = 0
[pid 24599] fstat(1,

^C^C^C^C


Or at least not truly uninterruptible: I have a process merging 20
presorted files of 1.5GB each using "sort -m" from GNU coreutils on an
ext4 volume. A few seconds after I kill -9 the sorting process, all
hanging logins continue, and so does the process above (the system
returns to its "normal state"):

{st_mode=S_IFREG|0664, st_size=246138, ...}) = 0
[pid 24599] --- SIGINT (Interrupt) @ 0 (0) ---
Process 24542 resumed
Process 24599 detached
[pid 24542] <... wait4 resumed> 0x7fffa656c5a4, 0, NULL) = ? ERESTARTSYS (To be restarted)
[pid 24542] --- SIGINT (Interrupt) @ 0 (0) ---
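To see which processes are blocked while logins hang, one can scan
/proc for tasks in state "D" (uninterruptible sleep). The following is
a minimal sketch, assuming a Linux /proc; the helper names
parse_stat_line and tasks_in_state are made up for illustration, and it
lists processes only, not their individual threads.

```python
import os


def parse_stat_line(line):
    """Split one /proc/<pid>/stat line into (pid, comm, state).

    comm is wrapped in parentheses and may itself contain spaces or
    parentheses, so split on the last ')' rather than on whitespace.
    """
    head, _, tail = line.rpartition(")")
    pid_str, _, comm = head.partition("(")
    return int(pid_str), comm, tail.split()[0]


def tasks_in_state(wanted="D", proc_root="/proc"):
    """Return [(pid, comm)] for processes whose state equals wanted."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        stat_path = os.path.join(proc_root, entry, "stat")
        try:
            with open(stat_path, errors="replace") as fh:
                pid, comm, state = parse_stat_line(fh.read())
        except OSError:
            continue  # process exited while we were scanning
        if state == wanted:
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in tasks_in_state("D"):
        print(pid, comm)
```

Run during the stall, this would show whether the hanging su is in
state "D" and which other processes are blocked alongside it.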

The merge runs on an 8TB ext4 volume. An strace of the sorting process
shows that it progresses nicely.
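For anyone trying to reproduce a similar load at a smaller scale, the
merge step can be approximated as below. This is only a sketch of a
k-way merge of presorted inputs, not GNU sort's implementation: the
names merge_sorted and merge_files are hypothetical, lines are compared
as Python strings (which need not match the collation sort uses in a
given locale), and every input line is assumed to end in a newline.

```python
import heapq
from contextlib import ExitStack


def merge_sorted(inputs, write):
    """Merge already-sorted iterables of lines, passing each to write().

    Returns the number of lines written. Only one pending line per
    input is held in memory at a time.
    """
    written = 0
    for line in heapq.merge(*inputs):
        write(line)
        written += 1
    return written


def merge_files(input_paths, output_path):
    """Merge presorted text files into output_path, like a small 'sort -m'."""
    with ExitStack() as stack:
        inputs = [stack.enter_context(open(p)) for p in input_paths]
        out = stack.enter_context(open(output_path, "w"))
        return merge_sorted(inputs, out.write)
```

The access pattern is the relevant part: many files are read
sequentially in parallel while one large output file is written.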

The system is running 2.6.31.6 with
59a252ff8c0f2fa32c896f69d56ae33e641ce7ad reverted, as suggested by J.
Bruce Fields, though to me it seems unrelated.

Jesper