Date:	Sun, 14 Nov 2010 13:46:43 +0100
From:	Markus Schulz <msc@...zsystem.de>
To:	linux-kernel@...r.kernel.org
Subject: data corruption with stex (Promise HW-Raid) driver and device-mapper

hello lkml,

i've found a bug in the stex driver (v4.6.0000.4 from 
kernel.org 2.6.36 and v4.06.1000.07 from the promise download page).

to reproduce the error i use a kernel build as the test:

# make-kpkg clean
# make-kpkg --revision 20101112 --append-to-version .aesni --initrd \
    --jobs 8 kernel_image 2>&1 | tee ../build-result.txt

on success it builds a kernel image (.deb).
on failure it breaks during linking at a random module, for example:

LD [M]  drivers/char/hangcheck-timer.ko
ld: drivers/bluetooth/hci_uart.o: bad reloc symbol index (0xf5365025 >= 0xb6) for offset 0x5308dc146c5e9728 in section `.debug_info'
drivers/bluetooth/hci_uart.o: could not read symbols: Bad value
make[2]: *** [drivers/bluetooth/hci_uart.ko] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [modules] Error 2
make[1]: Leaving directory `/mnt/raid5cryptlvm/linux-2.6.36'
make: *** [debian/stamp/build/kernel] Error 2

the running kernel is 2.6.36, the same version i'm rebuilding as the 
regression test.

summary:
data corruption (visible as an aborted kernel compile) occurs on the stex 
device when a device-mapper block layer (lvm/dm-crypt) is combined with 
ext4 or xfs. the same setups work with ext3.

in detail:
every FAILED setup was tried many times and failed every time.
every SUCCESS setup was tried many times (>3, sometimes >10; endless 
build at night with "loop until error") and succeeded every time.
there are no memory errors (memtest86+ runs fine) and no thermal problems.
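
the "loop until error" runs mentioned above could look roughly like the sketch below. the function name loop_until_error and its arguments are made up for illustration, this is not the exact script that was used. it reruns a command until one run fails or a maximum number of runs is reached.

```shell
#!/bin/sh
# Hypothetical helper (not from the original mail): rerun a command until
# it fails, or until MAX_RUNS successful runs have completed.
# Usage: loop_until_error MAX_RUNS COMMAND [ARGS...]
# Returns 1 as soon as one run fails, 0 if every run succeeds.
loop_until_error() {
    max_runs=$1
    shift
    run=1
    while [ "$run" -le "$max_runs" ]; do
        if ! "$@"; then
            echo "run $run: FAILED"
            return 1
        fi
        echo "run $run: ok"
        run=$((run + 1))
    done
    return 0
}
```

it would be called with the build wrapped in a small script, e.g. "loop_until_error 20 ./build-once.sh", where build-once.sh is a hypothetical wrapper around the two make-kpkg commands. note that piping the build into tee hides the exit status of make-kpkg unless the wrapper takes care to preserve it.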

test 1: hw-raid5 + lvm + ext4 -> FAILED 
test 2: hw-raid5 + luks-dm-crypt + ext4 -> FAILED
test 3: hw-raid5 + luks-dm-crypt + lvm + ext4 -> FAILED
test 4: hw-raid5 + lvm + xfs -> FAILED

test 5: hw-raid5 + ext4 -> SUCCESS (no device-mapper)
test 6: hw-raid5 + lvm + ext3 -> SUCCESS 
test 7: hw-raid5 + luks-dm-crypt + ext3 -> SUCCESS
test 8: ahci-sata-disk + lvm + ext4 -> SUCCESS (no stex driver involved)
test 9: ahci-sata-disk + luks-dm-crypt + lvm + ext4 -> SUCCESS (no stex 
driver involved)

the problem only occurs on stex disks when they are used with a 
device-mapper layer (lvm/dm-crypt) and specific filesystems (ext4/xfs 
but NOT ext3).
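
a kernel build is a slow way to detect corruption. a more direct check would be a checksum round trip on the affected filesystem, sketched below. this is only an illustration: the name verify_copy is made up, and the check was not part of the tests listed above.

```shell
#!/bin/sh
# Hypothetical check (not one of the original tests): copy a file into a
# directory on the filesystem under test and compare SHA-256 checksums.
# Usage: verify_copy SOURCE_FILE TARGET_DIR
# Returns 0 if the copy matches, 1 on mismatch, 2 if the copy itself fails.
# Caveat: unless the page cache is dropped first (as root:
# "echo 3 > /proc/sys/vm/drop_caches"), the read-back may be served from
# memory instead of from the disk.
verify_copy() {
    src=$1
    dest_dir=$2
    dest="$dest_dir/$(basename "$src").verify"
    cp "$src" "$dest" || return 2
    sync
    want=$(sha256sum < "$src")
    got=$(sha256sum < "$dest")
    rm -f "$dest"
    [ "$want" = "$got" ]
}
```

for example "verify_copy linux-2.6.36.tar.bz2 /mnt/raid5cryptlvm" with a large source file kept on a known-good disk. the tarball name here is only a placeholder.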

with the kernel.org version of the stex driver i also tried runs with 
"options msi=1", to use MSI interrupts instead of the io-apic and avoid 
sharing interrupt 16 with usbcore. the results were the same.
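
whether the driver really ended up on its own MSI vector can be checked in /proc/interrupts. a minimal sketch, assuming the driver shows up there under the name "stex" (the function name irq_lines_for is made up for illustration):

```shell
#!/bin/sh
# Hypothetical helper: print the interrupt lines that mention a driver.
# Usage: irq_lines_for DRIVER [FILE]
# FILE defaults to /proc/interrupts. A shared line lists several names
# after the interrupt type, an MSI line should list only the one driver.
# Returns non-zero if no line mentions the driver.
irq_lines_for() {
    driver=$1
    file=${2:-/proc/interrupts}
    grep -w -- "$driver" "$file"
}
```

running "irq_lines_for stex" before and after setting the option shows whether the interrupt is still shared.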

there is a similar bug report for ubuntu: 
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/586897

please CC me on answers/comments.

regards 
Markus

View attachment ".config" of type "text/x-mpsub" (84231 bytes)
