Date:	Tue, 10 Sep 2013 18:04:01 +0000
From:	"Rich, Jason" <jason.rich@...comms.com>
To:	Willy Tarreau <w@....eu>
CC:	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: Panic at _blk_run_queue on 2.6.32

> -----Original Message-----
> From: linux-kernel-owner@...r.kernel.org [mailto:linux-kernel-
> owner@...r.kernel.org] On Behalf Of Willy Tarreau
> Sent: Wednesday, July 10, 2013 3:27 PM
> To: Rich, Jason
> Cc: linux-kernel@...r.kernel.org
> Subject: Re: Panic at _blk_run_queue on 2.6.32
> 
> Hi Jason,
> 
> On Tue, Jul 09, 2013 at 05:42:29PM +0000, Rich, Jason wrote:
> > Greetings,
> > I've recently encountered an issue where multiple hosts are failing to
> > boot up about 1/5 of the time.  So far I have confirmed this issue on three
> > separate host machines.  The issue presents itself after updating 2.6.32.39 to
> > patch 50 and patch 61.
> > Both patch levels result in the failure described below.  Since this occurs on
> > multiple hosts, I feel I can safely rule out hardware.
> 
> First, thank you for your very detailed report. Do you think you could narrow
> this down to a specific kernel version ? Given that there are exactly 10
> versions between .39 and .50, I think that a version-level bisect would take
> 3 or 4 builds (so probably around 20 reboots).
> 
> It would help us spot the faulty patch. Right now, there are 546 patches
> between .39 and .50 so it's quite hard to find the culprit, even with your full
> trace. That does not mean we'll immediately spot it, maybe a deeper bisect
> will be needed, but it should be easier.
> 
> > It is also of note that I have not seen this behavior on the 3.4.26 kernel, or
> > on any of my 32bit hosts.
> 
> This is good news, because we're probably missing one fix from a more
> recent version that addressed a similar regression and that we might
> backport into 2.6.32.62.
> 
> > That said, I have to support this software release (which runs on the 2.6
> > kernel) for at least another two years.
> 
> Be careful on this point, 2.6.32 is planned for EOL next year :
> 
>    https://www.kernel.org/category/releases.html
> 
> You might want to consider migrating to a supported distro kernel or to 3.2
> instead. That said, if you follow carefully the updates from later kernels, you
> might prefer to maintain your own backports of the patches that are relevant
> to your usage.
> 
> Best regards,
> Willy
> 
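
As a rough illustration of the version-level bisect suggested in the quoted message, here is a minimal Python sketch. It assumes 2.6.32.39 is known good, 2.6.32.50 is known bad, and that once the regression appears it stays in every later version. The names `first_bad_version` and `is_bad` are hypothetical; in practice `is_bad` would mean building that version and rebooting enough times to catch a failure that shows up on only about 1 boot in 5.

```python
def first_bad_version(good, bad, is_bad):
    """Binary-search the patch levels between a known-good and a known-bad one.

    good, bad: integer patch levels (e.g. 39 and 50 for 2.6.32.39 / 2.6.32.50).
    is_bad:    hypothetical callback; returns True if that patch level panics.
    Returns (first bad patch level, number of builds that were tested).
    """
    builds = 0
    while bad - good > 1:
        mid = (good + bad) // 2   # pick the patch level halfway between
        builds += 1               # one kernel build (plus several reboots)
        if is_bad(mid):
            bad = mid             # regression is at or before mid
        else:
            good = mid            # regression is after mid
    return bad, builds


if __name__ == "__main__":
    # Pretend, purely for illustration, that the regression arrived in .46.
    culprit, builds = first_bad_version(39, 50, lambda level: level >= 46)
    print(f"first bad: 2.6.32.{culprit}, builds tested: {builds}")
```

Under those assumptions the search needs 3 or 4 builds whichever patch level introduced the regression, which agrees with the estimate in the quoted message.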

Greetings Willy,
You helped me out with this particular issue about 2 months ago.  What we found is that my particular panic appears to be addressed by a specific commit you referred me to:
b485462 [SCSI] Stop accepting SCSI requests before removing a device

Without going into too much detail, I'm not able to jump directly to that hash because I have about 7 different drivers failing to compile due to other changes between 2.6.32.61 and that hash.  In particular, some header files were renamed, and others were deleted and replaced by newer features.  Going through and updating my proprietary drivers is as big a headache as just getting this scsi panic fixed on top of patch 61.

I've spent the last couple of weeks trying to get the scsi fix applied on top of patch 61 and am having a very difficult time.  The fix depends on so many prior commits to the scsi code that it is quite difficult to determine exactly what I need.

I'm hoping you might be able to help me out with some advice, or perhaps you are familiar enough with the scsi code to help me apply the concept of the fix on top of patch 61.  I have attached the patch I've come up with so far, but it is obviously missing other dependencies, as I keep ending up with panics.  It goes without saying that this code is very foreign to me and I don't completely understand what it is doing.

I know your time is valuable, so I've attached the patch I've been working on so far.  However, this code causes its own kernel panic and should not be run on a live system.  That said, perhaps it will give you a baseline as to what I'm trying to do.  Again, this patch is based on the official 2.6.32.61 tag.

Thanks for any help,
Jason Rich

Download attachment "0001-scsi_panic.patch" of type "application/octet-stream" (9576 bytes)