Date:	Fri, 25 Jan 2008 12:37:05 -0500
From:	"Brett Dikeman" <brett.dikeman@...il.com>
To:	linux-poweredge@...ts.us.dell.com, linux-kernel@...r.kernel.org
Subject: Re: 2850 drive bays not hot-swap?

On Jan 24, 2008 3:35 PM, William Warren
<hescominsoon@...anuelcomputerconsulting.com> wrote:

> http://lists.us.dell.com/pipermail/linux-poweredge/2003-July/008898.html

An update: I don't know whether it is a kernel problem or a hardware
problem, but the instructions for removing and adding SCSI devices
linked above don't work on a PE2850 (which uses an "LSI Logic /
Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 08)"
controller, aka Dell's PERC 4e/Di, driven by the mpt* drivers).

  It *appears* to work, but the bus seems to grind to a near halt.
That may be an inaccurate conclusion, but I'm leaning toward a kernel
issue, since iowait didn't skyrocket.
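
For reference, here is the generic way to double-check the controller
and the Host/Channel/Id/Lun tuples before removing anything (standard
tools, nothing PE2850-specific):

  lspci | grep -i scsi     # should show the 53c1030 Fusion-MPT controller
  cat /proc/scsi/scsi      # lists Host/Channel/Id/Lun of each attached device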

IDs 2-5 on a U320 bus off the PERC controller are in a RAID10 array;
IDs 0 and 1 are in a separate mirror.
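
The array membership can be confirmed with mdadm before touching
anything (md0 is the RAID10 here; whether the system mirror is md1 is
my guess):

  mdadm --detail /dev/md0     # members, state, and layout of the RAID10
  cat /proc/mdstat            # quick overview of all md arrays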

1) Issued echo "scsi remove-single-device 0 0 5 0" > /proc/scsi/scsi

2) Pulled the drive and issued mdadm /dev/md0 --remove detached

3) Re-inserted the same drive and issued echo "scsi add-single-device 0
0 5 0" > /proc/scsi/scsi
(as a side note, the kernel gave it /dev/sdg instead of /dev/sdf; see
the note after this list)

4) Issued mdadm /dev/md0 --re-add /dev/sdg1
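
A note on step 3: since the kernel handed back a different device
name, it's worth confirming the name before the re-add instead of
assuming it. Standard tools suffice, nothing controller-specific:

  dmesg | tail -20        # shows which sdX the newly added disk was given
  cat /proc/partitions    # confirms the device and its partitions exist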

There were no messages indicating anything amiss.

I then fired up watch -n 5 cat /proc/mdstat and noticed that the
rebuild rate was a fraction of what it should have been, hovering at a
very constant number (I believe 2339KB/sec).  It should be around
74,000KB/sec (dropping to as low as 45,000KB/sec at the tail end of
the drives).  Load average jumped to 2+, yet nothing showed up in top
as taking any CPU time, even with a 10-second average, and no iowait
was indicated.  I increased sync_speed_min to 20000 with no change.
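
For reference, the sync speed can be watched and tuned through the
per-array sysfs knobs (the global equivalents live under
/proc/sys/dev/raid/); this is the interface I mean, assuming md0:

  cat /sys/block/md0/md/sync_speed                 # current rebuild rate
  echo 20000 > /sys/block/md0/md/sync_speed_min    # raise the per-array floor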

Then I noticed that the system-drive mirror (the first two drives)
appeared to have quite a bit of disk activity, but I couldn't find
what was causing it.  In hindsight it wasn't really "quite a bit" of
activity; I think it was disk I/O grinding to a near halt.
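
For anyone who wants to chase the stalled-I/O angle, a couple of
standard tools (I haven't verified them against this exact failure;
they're just the usual suspects):

  iostat -x 5                         # per-device utilization and queue sizes
  echo 1 > /proc/sys/vm/block_dump    # log every block I/O to dmesg (noisy)
  echo 0 > /proc/sys/vm/block_dump    # turn the logging back off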

I shut down the system, and after power-up and booting, the RAID10
array is rebuilding at full speed...

I'm happy to provide additional info and try things for interested
parties to troubleshoot or fix this problem.  Being able to hot-swap
drives is pretty critical here.

Brett
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
