Message-Id: <201011141346.44373@Mail-Followup-To>
Date: Sun, 14 Nov 2010 13:46:43 +0100
From: Markus Schulz <msc@...zsystem.de>
To: linux-kernel@...r.kernel.org
Subject: data corruption with stex (Promise HW-Raid) driver and device-mapper
hello lkml,
i've found a bug with the stex driver (v4.6.0000.4 from
kernel.org-2.6.36 and v4.06.1000.07 from promise download page).
for my tests i've used a kernel build to reproduce the error:
# make-kpkg clean
# make-kpkg --revision 20101112 --append-to-version .aesni --initrd --jobs 8 kernel_image 2>&1 | tee ../build-result.txt
on success it produces a kernel image (.deb)
on failure it breaks during linking at a random module like:
LD [M] drivers/char/hangcheck-timer.ko
ld: drivers/bluetooth/hci_uart.o: bad reloc symbol index (0xf5365025 >= 0xb6) for offset 0x5308dc146c5e9728 in section `.debug_info'
drivers/bluetooth/hci_uart.o: could not read symbols: Bad value
make[2]: *** [drivers/bluetooth/hci_uart.ko] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [modules] Error 2
make[1]: Leaving directory `/mnt/raid5cryptlvm/linux-2.6.36'
make: *** [debian/stamp/build/kernel] Error 2
the running kernel is 2.6.36, the same version i am rebuilding as the
regression test.
summary:
data corruption (visible as an aborted kernel compile) appears when the
stex array is used through a device-mapper layer (lvm/dm-crypt) with ext4
or xfs. the same setups with ext3 work.
in detail:
every FAILED setup was tried many times and failed every time.
every SUCCESS setup was tried many times without error (>3 runs, sometimes
>10, using an endless overnight build with "loop until error").
there are no memory errors (memtest86+ runs fine) or thermal problems.
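the "loop until error" runs mentioned above could be driven by a small wrapper like the one below. this is only a sketch, not the script used for the tests: the function name loop_until_error and its MAX argument are made up for illustration.

```shell
#!/bin/sh
# loop_until_error MAX CMD [ARGS...]   (hypothetical helper, not from the original tests)
# Runs CMD repeatedly until it fails, or until MAX runs have passed.
# MAX=0 means "no limit". Prints how many runs succeeded.
# Returns 1 if a run failed, 0 if the limit was reached without failure.
loop_until_error() {
    max=$1
    shift
    runs=0
    while [ "$max" -eq 0 ] || [ "$runs" -lt "$max" ]; do
        if ! "$@"; then
            echo "failed after $runs successful runs"
            return 1
        fi
        runs=$((runs + 1))
    done
    echo "no failure in $runs runs"
    return 0
}
```

for an overnight run the build commands shown earlier would be passed as CMD, for example wrapped in a single sh -c '...' string, with MAX set to 0.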
test 1: hw-raid5 + lvm + ext4 -> FAILED
test 2: hw-raid5 + luks-dm-crypt + ext4 -> FAILED
test 3: hw-raid5 + luks-dm-crypt + lvm + ext4 -> FAILED
test 4: hw-raid5 + lvm + xfs -> FAILED
test 5: hw-raid5 + ext4 -> SUCCESS (no device-mapper)
test 6: hw-raid5 + lvm + ext3 -> SUCCESS
test 7: hw-raid5 + luks-dm-crypt + ext3 -> SUCCESS
test 8: ahci-sata-disk + lvm + ext4 -> SUCCESS (no stex driver involved)
test 9: ahci-sata-disk + luks-dm-crypt + lvm + ext4 -> SUCCESS (no stex
driver involved)
the problem only exists on stex-disks when used with a device mapper
(lvm/dm-crypt) and specific filesystems (ext4/xfs but NOT ext3).
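a kernel build only shows corruption indirectly. a more direct check, which was not part of the tests above, would be to write random data to the filesystem under test and compare checksums after reading it back. the sketch below assumes GNU coreutils (sha256sum); the function name verify_roundtrip and the DROP_CACHES variable are invented for this example.

```shell
#!/bin/sh
# verify_roundtrip DIR [SIZE_MB]   (hypothetical helper)
# Writes SIZE_MB (default 64) of random data into DIR, hashing it on the way in,
# then reads the file back and compares the two SHA-256 digests.
# Prints OK or MISMATCH; returns 1 on mismatch.
verify_roundtrip() {
    dir=$1
    size_mb=${2:-64}
    file="$dir/roundtrip.$$"

    # Hash the stream while it is written, so the reference digest never touches the disk.
    written=$(head -c "$((size_mb * 1048576))" /dev/urandom \
        | tee "$file" | sha256sum | cut -d' ' -f1)
    sync

    # Without dropping the page cache the read-back may be served from RAM
    # and never reach the disk. Needs root, so it is opt-in via DROP_CACHES=1.
    if [ "${DROP_CACHES:-0}" = "1" ]; then
        { echo 3 > /proc/sys/vm/drop_caches; } 2>/dev/null || true
    fi

    readback=$(sha256sum < "$file" | cut -d' ' -f1)
    rm -f "$file"

    if [ "$written" = "$readback" ]; then
        echo "OK"
    else
        echo "MISMATCH"
        return 1
    fi
}
```

run as root with DROP_CACHES=1 and a size larger than RAM caching would hide, once per setup from the list above (DIR being the mount point of the filesystem under test).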
with the kernel.org version of the stex driver i also ran the tests with
"options msi=1", to use MSI interrupts instead of the io-apic and avoid
sharing interrupt 16 with usbcore. the results were the same.
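whether the driver really got its own MSI interrupt, or still shares a line, can be read from /proc/interrupts. a minimal sketch follows; the helper name irq_lines_for is made up, and the optional second argument exists only so the function can be pointed at a saved copy of the file.

```shell
#!/bin/sh
# irq_lines_for DRIVER [FILE]   (hypothetical helper)
# Prints every line of FILE (default: /proc/interrupts) that lists DRIVER
# as a whole word. Each line shows the IRQ number, the interrupt type and
# any other drivers sharing the same line.
irq_lines_for() {
    grep -w -- "$1" "${2:-/proc/interrupts}"
}
```

calling irq_lines_for stex before and after setting the msi option would show whether the interrupt type changed and whether other drivers are still listed on the same line.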
there is a similar bug report for ubuntu:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/586897
please CC me on answers/comments.
regards
Markus
View attachment ".config" of type "text/x-mpsub" (84231 bytes)