Date:	Wed, 27 Jan 2010 17:09:16 -0600
From:	James Bottomley <James.Bottomley@...e.de>
To:	Alan Cox <alan@...rguk.ukuu.org.uk>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	linux-scsi <linux-scsi@...r.kernel.org>,
	linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [GIT PATCH] SCSI bug fixes for 2.6.33-rc5

On Wed, 2010-01-27 at 22:46 +0000, Alan Cox wrote:
> On Wed, 27 Jan 2010 16:33:29 -0600
> James Bottomley <James.Bottomley@...e.de> wrote:
> 
> > On Wed, 2010-01-27 at 22:24 +0000, Alan Cox wrote:
> > > > Penchala Narasimha Reddy Chilakala, ERS-HCLTech (1):
> > > >       aacraid: fix File System going into read-only mode
> > > 
> > > If aacraid is actually getting patches then see
> > > also http://bugzilla.kernel.org/show_bug.cgi?id=11120 which I found
> > > while tidying bugzilla.
> > > 
> > > Contains a patch and test confirmations
> > 
> > So the patch it contains is almost certainly wrong in general; Mark was
> > just suggesting it as a trial ... it might work for specific adapter
> > versions but reducing the queue depth by half globally will impact
> > performance noticeably.  The bug report does rather sound like cabling
> > issues are leading to a firmware related problem.
> 
> Odd then that they worked reliably until the numbers were increased.
> Sorry but having worked on the aacraid for a long time in the past I
> don't buy that explanation. Cabling issues would get logged by the driver
> and the controller. Secondly I don't buy it because the reporter was
> Matthias Ulrichs, who to borrow a Hitchhiker's term "really knows where his
> towel is". 
> 
> The patch isn't halving the queue size - it's a return to the known
> working state from an (unfixed) regression.

What regression?  The 32-bit queue depth has always been 256 since 2005
(when it was reduced from 512) ... it's never been 127.

> The story is pretty simple
> 
> Worked until the kernel changed
> Didn't work with kernel change
> Worked after the kernel changed back.
> 
> Kernels don't go in and fix your cables (much as I wish they did) and
> there are two folks who've actually found the bug report specifically
> confirming it.

But we have two bug reports for all of the aacraids over the last five
years ... the patch would reduce the maximum transfer length from 128k
to 63.5k.
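
The 128k and 63.5k figures follow from the 256 and 127 values if those are read as counts of 512-byte sectors. That unit is an assumption here; the thread does not state it. A minimal sketch of the arithmetic, with a hypothetical helper name:

```python
# Assumption: the 256 and 127 limits are counts of 512-byte sectors.
SECTOR_BYTES = 512


def transfer_kib(sectors, sector_bytes=SECTOR_BYTES):
    """Return the maximum transfer length in KiB for a sector limit."""
    return sectors * sector_bytes / 1024


print(transfer_kib(256))  # 128.0, the current limit
print(transfer_kib(127))  # 63.5, the limit the bugzilla patch would impose
```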

Linux tends to send down the largest transfer size it can, suggesting
that most of the aacraids in the field are happy with 128k.

The maximum transfer length critically impacts I/O throughput and
performance ... I can't just penalise everyone for the sake of two bug
reports.

This value can already be altered on the fly by writing to

/sys/block/<dev>/queue/max_sectors_kb

Setting that should work for the two reporters without impacting anyone
else.
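
As a rough sketch of that per-device workaround, assuming root privileges and that the file accepts a whole number of KiB (so 63 rather than 63.5), something like the following could be used. The function name and the `sysfs_root` parameter are hypothetical and exist only to make the sketch easy to try against a scratch directory:

```python
from pathlib import Path


def set_max_sectors_kb(dev, kib, sysfs_root="/sys"):
    """Write a new max_sectors_kb for one block device; return the old value.

    dev is a block device name such as "sda" (illustrative only).
    sysfs_root is a hypothetical parameter so the function can be pointed
    at a test directory instead of the real /sys.
    """
    knob = Path(sysfs_root) / "block" / dev / "queue" / "max_sectors_kb"
    previous = int(knob.read_text().strip())
    knob.write_text(f"{kib}\n")
    return previous


# Example (needs root, affects only the named device, not persistent
# across reboots):
#   old = set_max_sectors_kb("sda", 63)
```

The setting applies to a single device, so only the affected machines need to carry it.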

> When you have a cable fault on the aacraid you can get hangs on crappier
> firmware sets (normally in the BIOS boot though) but it's not dependent
> on queue size - it either works or it doesn't. On good firmware you get
> nice logged errors and it recovers if possible (or multipaths if you've
> got the right bits).

James

