[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <0631C836DBF79F42B5A60C8C8D4E82296A722A@NAMAIL2.ad.lsil.com>
Date: Wed, 6 Sep 2006 11:14:14 -0600
From: "Patro, Sumant" <Sumant.Patro@...l.com>
To: "Brett G. Durrett" <brett@...u.com>,
"Dave Lloyd" <dlloyd@...gy.com>
Cc: "Bagalkote, Sreenivas" <Sreenivas.Bagalkote@...enio.com>,
"lkml" <linux-kernel@...r.kernel.org>,
"Berkley Shands" <bshands@...gy.com>,
"Kolli, Neela" <Neela.Kolli@...enio.com>,
"Yang, Bo" <Bo.Yang@...enio.com>
Subject: RE: megaraid_sas waiting for command and then offline
Hello Brett,
A DMA related bug was fixed in FW ver *.0095 that was causing
the FW to stop responding.
Please upgrade the FW version to >= *.0095 and let me know if
you still see the issue.
Regards,
Sumant
-----Original Message-----
From: Brett G. Durrett [mailto:brett@...u.com]
Sent: Wednesday, September 06, 2006 9:04 AM
To: Dave Lloyd
Cc: Patro, Sumant; Bagalkote, Sreenivas; lkml; Berkley Shands
Subject: Re: megaraid_sas waiting for command and then offline
The machines are Dell 2900s, so the mobo is custom. From a Dell SE,
"Dell uses a custom mobo that is Dell branded with the Intel chipset
Greencreek.".
B-
Dave Lloyd wrote:
> Brett G. Durrett wrote:
> >
> > I have the same or a similar issue running 2.6.17 SMP x86_64 - the
> > megaraid_sas driver hangs waiting for commands and then the
filesystem
> > unmounts, leaving the machine in an unusable state until there is a
> hard
> > reboot (the machine is responsive but any access, shell or
> otherwise, is
> > impossible without the filesystem). While I do not have much
debugging
> > information available, this happens to me about once every 6-7 days
in
> > my pool of seven machines, so I can probably get debugging info.
Since
> > the disk is offline and I can't get remote console, I don't have any
> > details except something similar to Dave Lloyd's post, below.
> >
> > The only thing that the machines with these failures seem to have in
> > common is the fact that they are almost exclusively writes - they
are
> > slave database machines with large memory and pretty much just
> > replicate. The read/write machines seem to have less failures.
> >
> > I am happy to help provide debugging information in any reasonable
way.
> > In the mean time, if there are any known suggestions or workarounds
for
> > the problem, I would be grateful for the guidance.
> >
> > Here are what details on the controller. If you want additional
info,
> > let me know exactly what you need and I will do what I can to get it
to
> > you.:
> >
> > Product Name : PERC 5/i Integrated
> > Serial No : 12345
> > FW Package Build: 5.0.1-0030
> > FW Version : 1.00.01-0088
> > BIOS Version : MT23
> > Ctrl-R Version :1.02-007
> >
> > B-
>
> Which motherboard are you using? We believe that this may be a
> motherboard specific issue. It appears to happen on a SuperMicro
> motherboard but not a Tyan motherboard.
>
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists