Message-ID: <BANLkTikhoX4hESUzCYjjh6V7_YkwLX7sVQ@mail.gmail.com>
Date:	Wed, 11 May 2011 14:10:56 +0200
From:	Pasti Klarino <pklarino@...il.com>
To:	linux-kernel@...r.kernel.org
Subject: mvsas issues - 2.6.38 - local I/O seems fine, NFS makes it have
 serious issues

Hi there,

For a while I've been following the issues with the mvsas controllers,
especially in combination with SATA drives. Since version 2.6.36 I
thought the driver was quite stable: local processes at least didn't
throw off the 8-disk RAID-6 array we have, and I didn't really see any
errors at all.

Even on the 2.6.38 kernel we run now (Ubuntu 11.04 64-bit server) it
was stable. Emphasis on "was": all was fine until I started exporting
the volume through Samba and, above all, NFS. There are some VMDKs on
the volume now, which get put there by a backup application over
Samba. This works reasonably (could go faster... :)), but then VMware
ESX attaches to it over NFS and that's when the real misery starts.
It's completely unusable over NFS: errors are no more than a couple of
minutes apart when ESX generates traffic, and because of the bus
resets etc. hardly any data comes from it at all.

The errors we see are below. This box doesn't really run production
yet (well, not with the disks on the Marvell controller anyway), but
it will have to in a couple of weeks. In the meantime, however, I can
test or gather whatever anyone might need.

Kind regards,

================ mdadm details on /dev/md4 ========================

The other RAID sets are not posted, as they're attached to local SATA
controllers. The data they house must be stable, so we moved them off
the Marvell(ous misery) controller.

root@...avault:/var/log# mdadm --detail /dev/md4
/dev/md4:
        Version : 0.90
  Creation Time : Wed Dec  1 13:43:11 2010
     Raid Level : raid6
     Array Size : 11721086976 (11178.10 GiB 12002.39 GB)
  Used Dev Size : 1953514496 (1863.02 GiB 2000.40 GB)
   Raid Devices : 8
  Total Devices : 8
Preferred Minor : 4
    Persistence : Superblock is persistent

    Update Time : Wed May 11 11:22:25 2011
          State : active
 Active Devices : 8
Working Devices : 8
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : 2cdc8fe9:e24fb941:c4e8ec2d:2a80eff0 (local to host datavault)
         Events : 0.45

    Number   Major   Minor   RaidDevice State
       0       8       96        0      active sync   /dev/sdg
       1       8      160        1      active sync   /dev/sdk
       2       8      128        2      active sync   /dev/sdi
       3       8      176        3      active sync   /dev/sdl
       4       8      112        4      active sync   /dev/sdh
       5       8      144        5      active sync   /dev/sdj
       6       8       80        6      active sync   /dev/sdf
       7       8       64        7      active sync   /dev/sde
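
As a quick sanity check on the figures above, the reported array size can
be reproduced from the per-device size. This is only a sketch: it assumes
that mdadm reports both sizes in KiB and that RAID-6 keeps two devices'
worth of parity, and the function name is made up for illustration.

```python
def raid6_array_size_kib(raid_devices, used_dev_size_kib):
    """Usable RAID-6 capacity: all member devices minus two for parity."""
    if raid_devices < 4:
        raise ValueError("RAID-6 needs at least 4 devices")
    return (raid_devices - 2) * used_dev_size_kib


# Values copied from the mdadm --detail output above.
size_kib = raid6_array_size_kib(8, 1953514496)

print(size_kib)                              # compare with "Array Size"
print(round(size_kib / 1024**2, 2))          # GiB
print(round(size_kib * 1024 / 1000**3, 2))   # GB
```

The result matches the Array Size line, so the array geometry itself looks
consistent with eight active members.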


================= lspci =================
root@...avault:/var/log# lspci
00:00.0 Host bridge: Intel Corporation Core Processor DRAM Controller (rev 12)
00:01.0 PCI bridge: Intel Corporation Core Processor PCI Express x16 Root Port (rev 12)
00:06.0 PCI bridge: Intel Corporation Core Processor Secondary PCI Express Root Port (rev 12)
00:1a.0 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB2 Enhanced Host Controller (rev 05)
00:1c.0 PCI bridge: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 1 (rev 05)
00:1c.4 PCI bridge: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 5 (rev 05)
00:1c.5 PCI bridge: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 6 (rev 05)
00:1d.0 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB2 Enhanced Host Controller (rev 05)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev a5)
00:1f.0 ISA bridge: Intel Corporation 3400 Series Chipset LPC Interface Controller (rev 05)
00:1f.2 SATA controller: Intel Corporation 5 Series/3400 Series Chipset 6 port SATA AHCI Controller (rev 05)
00:1f.3 SMBus: Intel Corporation 5 Series/3400 Series Chipset SMBus Controller (rev 05)
01:00.0 SCSI storage controller: Marvell Technology Group Ltd. MV64460/64461/64462 System Controller, Revision B (rev 01)
02:00.0 SCSI storage controller: Marvell Technology Group Ltd. MV64460/64461/64462 System Controller, Revision B (rev 01)
04:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
05:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
06:03.0 VGA compatible controller: Matrox Graphics, Inc. MGA G200eW WPCM450 (rev 0a)


================ uname -a ===================
root@...avault:/var/log# uname -a
Linux datavault 2.6.38-8-server #42-Ubuntu SMP Mon Apr 11 03:49:04 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux


================ piece of log from /var/log/syslog ==========================

May 11 10:57:19 datavault kernel: [1107478.903123] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 1703:<7>mv_abort_task() mvi=ffff880132dc0000 task=ffff88008af32d80 slot=ffff880132de4538 slot_idx=x1
May 11 10:57:19 datavault kernel: [1107478.903136] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 1632:mvs_query_task:rc= 5
May 11 10:57:19 datavault kernel: [1107478.903207] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 2083:port 7 ctrl sts=0x89800.
May 11 10:57:19 datavault kernel: [1107478.903212] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 2085:Port 7 irq sts = 0x1001
May 11 10:57:19 datavault kernel: [1107478.903220] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 2111:phy7 Unplug Notice
May 11 10:57:19 datavault kernel: [1107478.913277] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 2083:port 7 ctrl sts=0x199800.
May 11 10:57:19 datavault kernel: [1107478.913279] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 2085:Port 7 irq sts = 0x1081
May 11 10:57:19 datavault kernel: [1107478.917506] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 2083:port 7 ctrl sts=0x199800.
May 11 10:57:19 datavault kernel: [1107478.917511] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 2085:Port 7 irq sts = 0x10000
May 11 10:57:19 datavault kernel: [1107478.917516] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 2138:notify plug in on phy[7]
May 11 10:57:19 datavault kernel: [1107479.027492] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 1224:port 7 attach dev info is 1
May 11 10:57:19 datavault kernel: [1107479.027495] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 1226:port 7 attach sas addr is 7
May 11 10:57:19 datavault kernel: [1107479.027500] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 378:phy 7 byte dmaded.
May 11 10:57:21 datavault kernel: [1107481.122546] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 1586:mvs_I_T_nexus_reset for device[3]:rc= 0
May 11 10:57:21 datavault kernel: [1107481.122564] ata14: translated ATA stat/err 0x01/04 to SCSI SK/ASC/ASCQ 0xb/00/00
May 11 10:57:21 datavault kernel: [1107481.207659] ata14: status=0x01 { Error }
May 11 10:57:21 datavault kernel: [1107481.207664] ata14: error=0x04 { DriveStatusError }
May 11 10:59:47 datavault kernel: [1107626.944438] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 1703:<7>mv_abort_task() mvi=ffff88012f100000 task=ffff880097f95c00 slot=ffff88012f124590 slot_idx=x2
May 11 10:59:47 datavault kernel: [1107626.944450] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 1632:mvs_query_task:rc= 5
May 11 10:59:47 datavault kernel: [1107626.944483] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 2083:port 0 ctrl sts=0x89800.
May 11 10:59:47 datavault kernel: [1107626.944486] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 2085:Port 0 irq sts = 0x1001
May 11 10:59:47 datavault kernel: [1107626.944494] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 2111:phy0 Unplug Notice
May 11 10:59:47 datavault kernel: [1107626.954513] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 2083:port 0 ctrl sts=0x199800.
May 11 10:59:47 datavault kernel: [1107626.954516] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 2085:Port 0 irq sts = 0x1081
May 11 10:59:47 datavault kernel: [1107626.958066] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 2083:port 0 ctrl sts=0x199800.
May 11 10:59:47 datavault kernel: [1107626.958070] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 2085:Port 0 irq sts = 0x10000
May 11 10:59:47 datavault kernel: [1107626.958074] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 2138:notify plug in on phy[0]
May 11 10:59:47 datavault kernel: [1107627.068075] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 1224:port 0 attach dev info is 0
May 11 10:59:47 datavault kernel: [1107627.068080] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 1226:port 0 attach sas addr is 0
May 11 10:59:47 datavault kernel: [1107627.068093] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 378:phy 0 byte dmaded.
May 11 10:59:49 datavault kernel: [1107629.163848] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 1586:mvs_I_T_nexus_reset for device[0]:rc= 0
May 11 10:59:49 datavault kernel: [1107629.163858] ata7: translated ATA stat/err 0x01/04 to SCSI SK/ASC/ASCQ 0xb/00/00
May 11 10:59:49 datavault kernel: [1107629.163860] ata7.00: device reported invalid CHS sector 0
May 11 10:59:49 datavault kernel: [1107629.163861] ata7: status=0x01 { Error }
May 11 10:59:49 datavault kernel: [1107629.163863] ata7: error=0x04 { DriveStatusError }
May 11 11:02:39 datavault kernel: [1107798.819642] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 1703:<7>mv_abort_task() mvi=ffff880132dc0000 task=ffff88000c2c8540 slot=ffff880132de4590 slot_idx=x2
May 11 11:02:39 datavault kernel: [1107798.819652] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 1632:mvs_query_task:rc= 5
May 11 11:02:39 datavault kernel: [1107798.819674] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 2083:port 7 ctrl sts=0x89800.
May 11 11:02:39 datavault kernel: [1107798.819678] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 2085:Port 7 irq sts = 0x1001
May 11 11:02:39 datavault kernel: [1107798.819684] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 2111:phy7 Unplug Notice
May 11 11:02:39 datavault kernel: [1107798.829759] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 2083:port 7 ctrl sts=0x199800.
May 11 11:02:39 datavault kernel: [1107798.829765] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 2085:Port 7 irq sts = 0x1081
May 11 11:02:39 datavault kernel: [1107798.834168] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 2083:port 7 ctrl sts=0x199800.
May 11 11:02:39 datavault kernel: [1107798.834172] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 2085:Port 7 irq sts = 0x10000
May 11 11:02:39 datavault kernel: [1107798.834177] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 2138:notify plug in on phy[7]
May 11 11:02:39 datavault kernel: [1107798.944173] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 1224:port 7 attach dev info is 1
May 11 11:02:39 datavault kernel: [1107798.944176] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 1226:port 7 attach sas addr is 7
May 11 11:02:39 datavault kernel: [1107798.944181] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 378:phy 7 byte dmaded.
May 11 11:02:41 datavault kernel: [1107801.039070] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 1586:mvs_I_T_nexus_reset for device[3]:rc= 0
May 11 11:02:41 datavault kernel: [1107801.039083] ata14: translated ATA stat/err 0x01/04 to SCSI SK/ASC/ASCQ 0xb/00/00
May 11 11:02:41 datavault kernel: [1107801.123629] ata14.00: device reported invalid CHS sector 0
May 11 11:02:41 datavault kernel: [1107801.123632] ata14: status=0x01 { Error }
May 11 11:02:41 datavault kernel: [1107801.123636] ata14: error=0x04 { DriveStatusError }
May 11 11:05:07 datavault kernel: [1107946.780999] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 1703:<7>mv_abort_task() mvi=ffff88012f100000 task=ffff88012299a4c0 slot=ffff88012f124538 slot_idx=x1
May 11 11:05:07 datavault kernel: [1107946.781014] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 1632:mvs_query_task:rc= 5
May 11 11:05:07 datavault kernel: [1107946.781067] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 2083:port 3 ctrl sts=0x89800.
May 11 11:05:07 datavault kernel: [1107946.781073] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 2085:Port 3 irq sts = 0x1001
May 11 11:05:07 datavault kernel: [1107946.781082] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 2111:phy3 Unplug Notice
May 11 11:05:07 datavault kernel: [1107946.791102] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 2083:port 3 ctrl sts=0x199800.
May 11 11:05:07 datavault kernel: [1107946.791108] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 2085:Port 3 irq sts = 0x1081
May 11 11:05:07 datavault kernel: [1107946.794890] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 2083:port 3 ctrl sts=0x199800.
May 11 11:05:07 datavault kernel: [1107946.794895] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 2085:Port 3 irq sts = 0x10000
May 11 11:05:07 datavault kernel: [1107946.794899] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 2138:notify plug in on phy[3]
May 11 11:05:07 datavault kernel: [1107946.904897] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 1224:port 3 attach dev info is 8000000
May 11 11:05:07 datavault kernel: [1107946.904902] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 1226:port 3 attach sas addr is 3
May 11 11:05:07 datavault kernel: [1107946.904911] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 378:phy 3 byte dmaded.
May 11 11:05:09 datavault kernel: [1107949.000397] /build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c 1586:mvs_I_T_nexus_reset for device[3]:rc= 0
May 11 11:05:09 datavault kernel: [1107949.000410] ata10: translated ATA stat/err 0x01/04 to SCSI SK/ASC/ASCQ 0xb/00/00
May 11 11:05:09 datavault kernel: [1107949.081915] ata10.00: device reported invalid CHS sector 0
May 11 11:05:09 datavault kernel: [1107949.081919] ata10: status=0x01 { Error }
May 11 11:05:09 datavault kernel: [1107949.081922] ata10: error=0x04 { DriveStatusError }
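
For anyone who wants to measure how far apart the aborts are in a longer
excerpt, here is a rough sketch. It assumes each incident begins with an
mv_abort_task() line, as in the log above, and that any line not starting
with a syslog timestamp is a wrapped continuation of the previous record.
The function names are made up for illustration and are not part of any
existing tool.

```python
import re

# A syslog record starts with "Mon DD HH:MM:SS "; anything else is treated
# as a continuation of the previous record (long lines wrapped in the mail).
RECORD_START = re.compile(r"^[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2} ")
# Kernel timestamp in seconds since boot, e.g. "[1107478.903123]".
KERNEL_TS = re.compile(r"\[\s*(\d+\.\d+)\]")


def join_wrapped(lines):
    """Rebuild one string per syslog record from possibly wrapped lines."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if RECORD_START.match(line) or not records:
            records.append(line)
        else:
            records[-1] += " " + line
    return records


def abort_gaps(lines):
    """Seconds between consecutive mv_abort_task() records."""
    stamps = []
    for record in join_wrapped(lines):
        if "mv_abort_task()" in record:
            match = KERNEL_TS.search(record)
            if match:
                stamps.append(float(match.group(1)))
    return [round(later - earlier, 1)
            for earlier, later in zip(stamps, stamps[1:])]


# The first two abort records from the log above, wrapped as they were mailed.
SAMPLE = """\
May 11 10:57:19 datavault kernel: [1107478.903123]
/build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c
1703:<7>mv_abort_task() mvi=ffff880132dc0000 task=ffff88008af32d80
slot=ffff880132de4538 slot_idx=x1
May 11 10:59:47 datavault kernel: [1107626.944438]
/build/buildd/linux-2.6.38/drivers/scsi/mvsas/mv_sas.c
1703:<7>mv_abort_task() mvi=ffff88012f100000 task=ffff880097f95c00
slot=ffff88012f124590 slot_idx=x2
"""

print(abort_gaps(SAMPLE.splitlines()))
```

Across the four incidents in the excerpt, the aborts are roughly 148, 172
and 148 seconds apart, and they hit phys 7, 0, 7 and 3 in turn.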


============= misc of possible interest ==========

root@...avault:/var/log# cat /etc/exports
# /etc/exports: the access control list for filesystems which may be exported
#		to NFS clients.  See exports(5).
#
# Example for NFSv2 and NFSv3:
# /srv/homes       hostname1(rw,sync,no_subtree_check) hostname2(ro,sync,no_subtree_check)
#
# Example for NFSv4:
# /srv/nfs4        gss/krb5i(rw,sync,fsid=0,crossmnt,no_subtree_check)
# /srv/nfs4/homes  gss/krb5i(rw,sync,no_subtree_check)
#
/mnt/export		*(anonuid=65534,anongid=65534,sync,no_root_squash,rw)



root@...avault:/var/log# mount
/dev/md1 on / type ext4 (rw,errors=remount-ro)
proc on /proc type proc (rw,noexec,nosuid,nodev)
none on /sys type sysfs (rw,noexec,nosuid,nodev)
fusectl on /sys/fs/fuse/connections type fusectl (rw)
none on /sys/kernel/debug type debugfs (rw)
none on /sys/kernel/security type securityfs (rw)
none on /dev type devtmpfs (rw,mode=0755)
none on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620)
none on /dev/shm type tmpfs (rw,nosuid,nodev)
none on /var/run type tmpfs (rw,nosuid,mode=0755)
none on /var/lock type tmpfs (rw,noexec,nosuid,nodev)
/dev/md0 on /boot type ext2 (rw)
/dev/md3 on /mnt/bhome2 type ext4 (rw)
/dev/md2 on /mnt/bhome1 type ext4 (rw)
/dev/md4 on /mnt/export type ext4 (rw)
nfsd on /proc/fs/nfsd type nfsd (rw)



root@...avault:/var/log# nfsstat
Server rpc stats:
calls      badcalls   badauth    badclnt    xdrcall
8845295    0          0          0          0

Server nfs v2:
null         getattr      setattr      root         lookup       readlink
14      100% 0         0% 0         0% 0         0% 0         0% 0         0%
read         wrcache      write        create       remove       rename
0         0% 0         0% 0         0% 0         0% 0         0% 0         0%
link         symlink      mkdir        rmdir        readdir      fsstat
0         0% 0         0% 0         0% 0         0% 0         0% 0         0%

Server nfs v3:
null         getattr      setattr      lookup       access       readlink
4         0% 11650     0% 536       0% 472178    5% 37691     0% 0         0%
read         write        create       mkdir        symlink      mknod
2603621  29% 5713221  64% 147       0% 1         0% 0         0% 0         0%
remove       rmdir        rename       link         readdir      readdirplus
123       0% 2         0% 32        0% 0         0% 13        0% 2346      0%
fsstat       fsinfo       pathconf     commit
2718      0% 2         0% 1         0% 1028      0%
