lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 20 Dec 2006 14:55:06 -0600
From:	"Mr. Berkley Shands" <bshands@...gy.com>
To:	linux-kernel@...r.kernel.org
CC:	Dave Lloyd <dlloyd@...gy.com>
Subject: 2.6.19 xfs crash spawns tcp reclen errors off nfs

We have had several really strange NFS issues, reported out of
net/sunrpc/svcsock.c in

static int
svc_tcp_recvfrom(struct svc_rqst *rqstp)

        svsk->sk_reclen &= 0x7fffffff;
        dprintk("svc: TCP record, %d bytes\n", svsk->sk_reclen);
        if (svsk->sk_reclen > serv->sv_bufsz) {
            printk(KERN_NOTICE "RPC: bad TCP reclen 0x%08lx (large)\n",
                   (unsigned long) svsk->sk_reclen);
            printk(KERN_NOTICE "sockaddr_in 0x%04x port 0x%04x max 
0x%08lx\n",
               rqstp->rq_addr.sin_addr.s_addr,
               rqstp->rq_addr.sin_port, (long) serv->sv_bufsz);
            goto err_delete;
        }

The size reported grows, and grows, and...

the server runs 2.6.19 when XFS has its issues (as follows)
2TB partition, NFS exported. All machines are x86_64 AMD 275's with 16GB
and running redhat AS 4.4 (Centos 4.4)

a typical crash under 2.6.19 is

Dec 14 09:13:48 sanandreas kernel: Filesystem "sdb1": XFS internal error 
xfs_trans_cancel at line 1138 of file fs/xfs/xfs_trans.c.  Caller 
0xffffffff881ceaa8
Dec 14 09:13:48 sanandreas kernel:
Dec 14 09:13:48 sanandreas kernel: Call Trace:
Dec 14 09:13:48 sanandreas kernel:  [<ffffffff881c79e9>] 
:xfs:xfs_trans_cancel+0x5f/0xfa
Dec 14 09:13:48 sanandreas kernel:  [<ffffffff881ceaa8>] 
:xfs:xfs_create+0x62e/0x686
Dec 14 09:13:48 sanandreas kernel:  [<ffffffff802f62be>] __up_read+0x10/0x8a
Dec 14 09:13:48 sanandreas kernel:  [<ffffffff881d8730>] 
:xfs:xfs_vn_mknod+0x1a2/0x3c5
Dec 14 09:13:48 sanandreas kernel:  [<ffffffff804440a6>] 
_spin_lock_irqsave+0x9/0xe
Dec 14 09:13:48 sanandreas kernel:  [<ffffffff802f62be>] __up_read+0x10/0x8a
Dec 14 09:13:48 sanandreas kernel:  [<ffffffff881c8413>] 
:xfs:xfs_trans_unlocked_item+0x22/0x3c
Dec 14 09:13:48 sanandreas kernel:  [<ffffffff881cd4c3>] 
:xfs:xfs_access+0x3d/0x46
Dec 14 09:13:48 sanandreas kernel:  [<ffffffff881d8dce>] 
:xfs:xfs_vn_permission+0x11/0x15
Dec 14 09:13:48 sanandreas kernel:  [<ffffffff8027fe9f>] 
permission+0xcd/0xd6
Dec 14 09:13:48 sanandreas kernel:  [<ffffffff80280615>] 
__link_path_walk+0x16c/0xdb7
Dec 14 09:13:48 sanandreas kernel:  [<ffffffff8028d8f1>] 
mntput_no_expire+0x17/0x8f
Dec 14 09:13:48 sanandreas kernel:  [<ffffffff80281322>] 
link_path_walk+0xc2/0xd0
Dec 14 09:13:48 sanandreas kernel:  [<ffffffff80281bde>] 
vfs_create+0xdf/0x14b
Dec 14 09:13:48 sanandreas kernel:  [<ffffffff8028200c>] 
open_namei+0x18e/0x662
Dec 14 09:13:48 sanandreas kernel:  [<ffffffff802789a1>] 
do_filp_open+0x1c/0x3d
Dec 14 09:13:48 sanandreas kernel:  [<ffffffff80278b92>] 
get_unused_fd+0xea/0xf9
Dec 14 09:13:48 sanandreas kernel:  [<ffffffff80278cba>] 
do_sys_open+0x44/0xbe
Dec 14 09:13:48 sanandreas kernel:  [<ffffffff8020979e>] 
system_call+0x7e/0x83
Dec 14 09:13:48 sanandreas kernel:
Dec 14 09:13:48 sanandreas kernel: xfs_force_shutdown(sdb1,0x8) called 
from line 1139 of file fs/xfs/xfs_trans.c.  Return address = 
0xffffffff881c7a07
Dec 14 09:13:48 sanandreas kernel: Filesystem "sdb1": Corruption of 
in-memory data detected.  Shutting down filesystem: sdb1
Dec 14 09:13:48 sanandreas kernel: Please umount the filesystem, and 
rectify the problem(s)


this then causes the heavy write clients to write garbage NFS request that
grow...

Dec 20 14:34:15 newmadrid kernel: sockaddr_in 0x7004130a port 0x5c03 max 
0x00008400
Dec 20 14:34:18 newmadrid kernel: RPC: bad TCP reclen 0x0000fda8 (large)
Dec 20 14:34:18 newmadrid kernel: sockaddr_in 0x7004130a port 0x5c03 max 
0x00008400
Dec 20 14:34:21 newmadrid kernel: RPC: bad TCP reclen 0x0000fda8 (large)
Dec 20 14:34:21 newmadrid kernel: sockaddr_in 0x7004130a port 0x5c03 max 
0x00008400
Dec 20 14:34:24 newmadrid kernel: RPC: bad TCP reclen 0x0000fda8 (large)
Dec 20 14:34:24 newmadrid kernel: sockaddr_in 0x7004130a port 0x5c03 max 
0x00008400
Dec 20 14:34:27 newmadrid kernel: RPC: bad TCP reclen 0x0000fda8 (large)
Dec 20 14:34:27 newmadrid kernel: sockaddr_in 0x7004130a port 0x5c03 max 
0x00008400
Dec 20 14:34:30 newmadrid kernel: RPC: bad TCP reclen 0x0000fda8 (large)
Dec 20 14:34:30 newmadrid kernel: sockaddr_in 0x7004130a port 0x5c03 max 
0x00008400
Dec 20 14:34:33 newmadrid kernel: RPC: bad TCP reclen 0x0000fda8 (large)
Dec 20 14:34:33 newmadrid kernel: sockaddr_in 0x7004130a port 0x5c03 max 
0x00008400

The clients run 2.6.19. the servers crash once a day when the heavy 
simulations start up.
eth0 and eth1 are bonded into BOND0, tyan 2895 or SuperMicro H8DC8 
motherboards.

Attached is a tarball, bzip'ed of the tcpdump traces and the 
/var/log/messages file
bshands@...sh.eng.exegy.net home/bshands/Documents> ls -l reclen.tar.bz2
-rw-r--r--  1 bshands dssi 24379 Dec 20 14:47 reclen.tar.bz2
bshands@...sh.eng.exegy.net home/bshands/Documents> md5sum reclen.tar.bz2
f449a3c83ca79cad04d8a63e6e253604  reclen.tar.bz2
bshands@...sh.eng.exegy.net /bulk/tmp> ls -l messages reclen.tcp*
-rw-r--r--  1 bshands dssi 37005 Dec 20 14:35 messages
-rw-r--r--  1 root    dssi 57946 Dec 20 14:35 reclen.tcp0
-rw-r--r--  1 root    dssi 40678 Dec 20 14:35 reclen.tcp1


Neil Brown indicated that this NFS packet growth should have been fixed in
2.6.18-rc5, but it still happens, easily triggered by the XFS shutdown.
the packet sizes reported can grow to > 1GB.
Dropping back to 2.6.18.1 stops the XFS crashing, and hence stops the 
NFS mess.

berkley






-- 

//E. F. Berkley Shands, MSc//

**Exegy Inc.**

3668 S. Geyer Road, Suite 300

St. Louis, MO  63127

Direct:  (314) 450-5348

Cell:  (314) 303-2546

Office:  (314) 450-5353

Fax:  (314) 450-5354

 

This e-mail and any documents accompanying it may contain legally 
privileged and/or confidential information belonging to Exegy, Inc.  
Such information may be protected from disclosure by law.  The 
information is intended for use by only the addressee.  If you are not 
the intended recipient, you are hereby notified that any disclosure or 
use of the information is strictly prohibited.  If you have received 
this e-mail in error, please immediately contact the sender by e-mail or 
phone regarding instructions for return or destruction and do not use or 
disclose the content to others.

 


Download attachment "reclen.tar.bz2" of type "application/x-bzip2" (24379 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ