Date:	Sun, 25 Sep 2011 04:31:25 -0700
From:	John Sullivan <sullivan@...o.com>
To:	linux-kernel@...r.kernel.org
CC:	linux-scsi@...r.kernel.org
Subject: mvsas, errors with 88SE9485 in 4x PCI slot.


Hi,

I have a system with a Supermicro X9SCM (Intel C204 chipset) motherboard
and three Supermicro AOC-SAS2LP-MV8 controllers.

The AOC-SAS2LP-MV8 is based on the newer Marvell 88SE9485 controller,
which supports 8 channels of 6Gb/s SAS.  The `9485 is natively an x8
PCI-E device.  This card is supported only in the latest 3.0/3.1.0-rc
kernels.  I am running 3.1.0-rc7.

The motherboard has 2 x8 PCI-E slots and 2 x4 PCI-E slots.  The `9485
in the x4 PCI-E slot consistently gets storms of errors from the mvsas
driver under heavy I/O, usually resulting in a drive getting kicked.

The cards in the x8 PCI-E slots work fine.  Under light I/O (moving or
copying small files) the x4 card is OK, but an md raid check or rebuild,
or a sustained large file copy, fails within 2-3 minutes.

I have tried every combination I can think of, removing all but one card 
and then trying each of the 3 cards in both x8 and x4 slots.

There may be some flaw in my experiment, but as far as I can tell,
the only time I get errors is in the x4 slot.
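
One way to confirm what each slot actually negotiates is to compare the
LnkCap (capability) and LnkSta (negotiated status) widths that
`lspci -vv` reports for the controller.  The following is a minimal
sketch, not something taken from this system: the SAMPLE text is an
illustrative excerpt in the usual lspci layout, and the helper name
link_widths is made up for the example.

```python
import re

# Matches "LnkCap: ... Width x8" and "LnkSta: ... Width x4".
# The colon directly after Cap/Sta keeps LnkCap2/LnkSta2 lines out.
LNK_RE = re.compile(r"Lnk(Cap|Sta):.*?Width x(\d+)")


def link_widths(lspci_text):
    """Return the first LnkCap/LnkSta widths found, e.g. {'Cap': 8, 'Sta': 4}."""
    widths = {}
    for line in lspci_text.splitlines():
        match = LNK_RE.search(line)
        if match and match.group(1) not in widths:
            widths[match.group(1)] = int(match.group(2))
    return widths


# Illustrative excerpt only (assumption): NOT captured from the machine
# described in this message.
SAMPLE = """\
        LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s L1
        LnkSta: Speed 5GT/s, Width x4, TrErr- Train- SlotClk+
"""

widths = link_widths(SAMPLE)
if widths.get("Sta", 0) < widths.get("Cap", 0):
    print("link trained narrower than the device maximum:", widths)
```

To use it on a real card, feed it the output of `lspci -vv -s <bus:dev.fn>`
for one device (usually run as root so the capability lines are shown).
A narrower LnkSta than LnkCap is expected for an x8 card in an x4 slot;
it only confirms the link state and does not by itself explain the errors.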

These errors usually take the form of:

> [  360.342793] drivers/scsi/mvsas/mv_sas.c 1904:port 6 slot 24 rx_desc 30018 has error info8000000080000000.
> [  360.342801] drivers/scsi/mvsas/mv_94xx.c 595:command active EEFFFFEF,  slot [18].
> [  360.351415] drivers/scsi/mvsas/mv_sas.c 1904:port 4 slot 10 rx_desc 3000A has error info0000000001000000.
> [  360.351418] drivers/scsi/mvsas/mv_94xx.c 595:command active FFFFFBEF,  slot [a].
> [  360.352397] drivers/scsi/mvsas/mv_sas.c 1904:port 4 slot 27 rx_desc 3001B has error info0000000001000000.
> [  360.352399] drivers/scsi/mvsas/mv_94xx.c 595:command active F7FFDFEF,  slot [1b].
> ...
> [  366.357261] sas: command 0xe745e480, task 0xe0876500, timed out: BLK_EH_NOT_HANDLED
> [  366.357264] sas: command 0xe6f3b180, task 0xe0877a40, timed out: BLK_EH_NOT_HANDLED
> [  366.357267] sas: command 0xe1234c00, task 0xe08768c0, timed out: BLK_EH_NOT_HANDLED
> ...
> [  366.357295] sas: Enter sas_scsi_recover_host
> [  366.357297] sas: trying to find task 0xe0876500
> [  366.357298] sas: sas_scsi_find_task: aborting task 0xe0876500
> [  366.357301] drivers/scsi/mvsas/mv_sas.c 1678:mvs_abort_task() mvi=e9ac0000 task=e0876500 slot=e9ad78cc slot_idx=x7
> [  366.357303] sas: sas_scsi_find_task: task 0xe0876500 is aborted
> [  366.357305] sas: sas_eh_handle_sas_errors: task 0xe0876500 is aborted
> ...
> [  366.357395] ata15: sas eh calling libata cmd error handler
> [  366.357399] ata1: sas eh calling libata port error handler
> [  366.357405] ata2: sas eh calling libata port error handler
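
To see whether the errors cluster on particular ports or error codes,
the mv_sas.c lines above can be tallied from a saved dmesg capture.  A
minimal sketch follows; the regular expression is based only on the line
format quoted in this message, the function name tally is made up for
the example, and no meaning is assigned to the "error info" values.

```python
import re
from collections import Counter

# Line format as quoted above:
#   ... mv_sas.c 1904:port 6 slot 24 rx_desc 30018 has error info8000000080000000.
ERR_RE = re.compile(
    r"mv_sas\.c \d+:port (\d+) slot (\d+) rx_desc ([0-9A-Fa-f]+) "
    r"has error info([0-9A-Fa-f]{16})"
)


def tally(log_text):
    """Count mvsas rx_desc errors per port and per raw error-info value."""
    per_port = Counter()
    per_info = Counter()
    for match in ERR_RE.finditer(log_text):
        per_port[int(match.group(1))] += 1
        per_info[match.group(4)] += 1
    return per_port, per_info


# The three error lines quoted in this message.
SAMPLE = """\
[  360.342793] drivers/scsi/mvsas/mv_sas.c 1904:port 6 slot 24 rx_desc 30018 has error info8000000080000000.
[  360.351415] drivers/scsi/mvsas/mv_sas.c 1904:port 4 slot 10 rx_desc 3000A has error info0000000001000000.
[  360.352397] drivers/scsi/mvsas/mv_sas.c 1904:port 4 slot 27 rx_desc 3001B has error info0000000001000000.
"""

per_port, per_info = tally(SAMPLE)
for port, count in sorted(per_port.items()):
    print(f"port {port}: {count} error(s)")
for info, count in per_info.most_common():
    print(f"error info {info}: {count}")
```

Run against a full log, the per-port counts show whether one drive or
cable dominates or whether the errors are spread across the whole card,
which would point more toward the slot than toward a single device.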

This seems like a fairly specific, configuration-dependent problem.

I assume this is supported (x8 controller in an x4 PCI-E slot) and should
"just work", but I don't have any confirmation one way or the other.

Can anyone tell me if this should or shouldn't work?  Any suggestions
for a fix?  I am willing to test patches or step through code to debug
this if someone can give me a pointer to get started.

Thanks,
John.


