Message-Id: <50C21406020000A10000DCE2@gwsmtp1.uni-regensburg.de>
Date:	Fri, 07 Dec 2012 16:06:30 +0100
From:	"Ulrich Windl" <Ulrich.Windl@...uni-regensburg.de>
To:	<linux-kernel@...r.kernel.org>
Subject: ext3 corruption in 3.0 kernel (SLES11 SP2 x86_64 (AMD
 Opteron))

Hi!

I thought I'd let you know of two ext3 corruptions found on an AMD Opteron server running SLES11 SP2 (kernel-xen-3.0.42-0.7.3). The corruptions occurred at different times, in different files, on different machines: too much to be ignored.

The older one looked like this:
[75548.267404] EXT3-fs error (device dm-0): htree_dirblock_to_tree: bad entry in directory #205978: rec_len % 4 != 0 - offset=4096, inode=2531699, rec_len=41331, name_len=38

And a more recent one looks like this:
kernel: [261958.359401] EXT3-fs error (device dm-0): ext3_add_entry: bad entry in directory #85582: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
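
For reference, both messages come from the sanity checks the kernel applies to each directory entry before using it. The following is only a minimal Python sketch of those checks as I understand them, not the kernel code itself: the function names `dir_rec_len` and `check_dir_entry` are made up for illustration, the 8-byte entry header (inode, rec_len, name_len, file_type) follows the ext2/ext3 on-disk format, and the 4096-byte block size is an assumption about this filesystem.

```python
# Illustrative sketch of ext3-style directory entry sanity checks.
# Helper names are hypothetical; block_size=4096 is an assumption.

DIRENT_HEADER = 8  # inode (4) + rec_len (2) + name_len (1) + file_type (1)


def dir_rec_len(name_len):
    """Smallest 4-byte-aligned record that can hold a name of name_len bytes."""
    return (name_len + DIRENT_HEADER + 3) & ~3


def check_dir_entry(rec_len, name_len, offset_in_block, block_size=4096):
    """Return the first failed check as a string, or None if the entry looks sane."""
    if rec_len < dir_rec_len(1):
        return "rec_len is smaller than minimal"
    if rec_len % 4 != 0:
        return "rec_len % 4 != 0"
    if rec_len < dir_rec_len(name_len):
        return "rec_len is too small for name_len"
    if offset_in_block + rec_len > block_size:
        return "directory entry across blocks"
    return None


# Values taken from the two log lines above; the logged offset is reduced
# modulo the assumed block size to get the offset within the block.
print(check_dir_entry(rec_len=41331, name_len=38, offset_in_block=4096 % 4096))
print(check_dir_entry(rec_len=0, name_len=0, offset_in_block=0))
```

With the values from the logs, the first entry fails the alignment check (41331 is not a multiple of 4) and the second fails the minimum-length check (an all-zero entry), which matches the two messages reported.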

As the nodes are running the Xen VMM in a cluster, it's possible that a node is reset at any time (fencing), but I thought a journaling filesystem would either not allow corruption or fix it.

In both cases I found the problem when a file could not be created, as in this RPM error message:
Error: RPM failed: error: unpacking of archive failed on file /lib/modules/3.0.42-0.7-default/kernel/drivers/media/video/cpia2/cpia2.ko;50c1fafd: cpio: open failed - Input/output error

After a reset I had to repair the filesystem manually, with these types of errors:
Inode 248552 was part of the orphaned inode list.  FIXED.
Block bitmap differences:
Free blocks count wrong for group

After repair and reboot I still saw:
kernel: [  698.061916] EXT3-fs error (device dm-0): ext3_lookup: deleted inode referenced: 68710
kernel: [  698.061916] EXT3-fs error (device dm-0): ext3_lookup: deleted inode referenced: 68711

(dm-0 is the root Logical Volume)

CPU-Details (Sun X4100 Server) are:
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 33
model name      : Dual Core AMD Opteron(tm) Processor 285
stepping        : 2

(I know this CPU has some bugs with virtualization; is filesystem corruption one of them?)

Regards,
Ulrich

