Message-ID: <20100127224654.76db3693@lxorguk.ukuu.org.uk>
Date: Wed, 27 Jan 2010 22:46:54 +0000
From: Alan Cox <alan@...rguk.ukuu.org.uk>
To: James Bottomley <James.Bottomley@...e.de>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
linux-scsi <linux-scsi@...r.kernel.org>,
linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [GIT PATCH] SCSI bug fixes for 2.6.33-rc5
On Wed, 27 Jan 2010 16:33:29 -0600
James Bottomley <James.Bottomley@...e.de> wrote:
> On Wed, 2010-01-27 at 22:24 +0000, Alan Cox wrote:
> > > Penchala Narasimha Reddy Chilakala, ERS-HCLTech (1):
> > > aacraid: fix File System going into read-only mode
> >
> > If aacraid is actually getting patches then see
> > also http://bugzilla.kernel.org/show_bug.cgi?id=11120 which I found
> > bugzilla tidying.
> >
> > Contains a patch and test confirmations
>
> So the patch it contains is almost certainly wrong in general; Mark was
> just suggesting it as a trial ... it might work for specific adapter
> versions but reducing the queue depth by half globally will impact
> performance noticeably. The bug report does rather sound like cabling
> issues are leading to a firmware related problem.
Odd then that they worked reliably until the numbers were increased.
Sorry, but having worked on the aacraid for a long time in the past, I
don't buy that explanation. Cabling issues would get logged by the driver
and the controller. Secondly, I don't buy it because the reporter was
Matthias Ulrichs, who, to borrow a Hitchhiker's term, "really knows where
his towel is".
The patch isn't halving the queue size - it's returning to the known
working state from an (unfixed) regression.
The story is pretty simple:
- Worked until the kernel changed
- Didn't work with the kernel change
- Worked after the kernel changed back
Kernels don't go in and fix your cables (much as I wish they did) and
there are two folks who've actually found the bug report specifically
confirming it.
When you have a cable fault on the aacraid you can get hangs on crappier
firmware sets (normally in the BIOS boot though) but it's not dependent
on queue size - it either works or it doesn't. On good firmware you get
nice logged errors and it recovers if possible (or multipaths if you've
got the right bits).
Alan