lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Sat, 24 Nov 2007 23:59:19 +0100
From:	Laurent Riffard <laurent.riffard@...e.fr>
To:	James Bottomley <James.Bottomley@...elEye.com>
CC:	Hannes Reinecke <hare@...e.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel@...r.kernel.org, linux-ide@...r.kernel.org,
	linux-scsi@...r.kernel.org
Subject: Re: 2.6.24-rc3-mm1: I/O error, system hangs

Le 24.11.2007 14:26, James Bottomley a écrit :
> On Sat, 2007-11-24 at 13:57 +0100, Laurent Riffard wrote:
>> Le 24.11.2007 07:42, James Bottomley a écrit :
>>> On Fri, 2007-11-23 at 18:52 +0100, Laurent Riffard wrote:
>>>> Le 23.11.2007 12:38, Hannes Reinecke a écrit :
[snip]
>>>> I can confirm : reverting commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 
>>>> does fix the problem.
>>>>
>>>>>> Hmm. Weird. I'll have a look into it. Apparently I'll be returning an error where
>>>>>> I shouldn't. Checking ...
>>>>>>
>>>>> Ok, found it. We are blocking even special commands (ie requests with PREEMPT not set)
>>>>> when FAILFAST is set. Which is clearly wrong. The attached patch fixes this.
>>>> Sorry, it's not enough. 2.6.24-rc3-mm1 + your patch still hangs with I/O errors.
>>> I think the problem is the way we treat BLOCKED and QUIESCED (the latter
>>> is the state that the domain validation uses and which we cannot kill
>>> fastfail on).  It's definitely wrong to kill fastfail requests when the
>>> state is QUIESCE.
>>>
>>> This patch (which is applied on top of Hannes original) separates the
>>> BLOCK and QUIESCE states correctly ... does this fix the problem?
>>
>> No, it doesn't help... (2.6.24-rc3-mm1 + your patch still has problems)
> 
> OK, could you post dmesgs again, please.  I actually tested this with an
> aic79xx card, and for me it does cause Domain Validation to succeed
> again.

James, 

Here is a dmesg produced by 2.6.24-rc3-mm1 + your patch "separates the 
BLOCK and QUIESCE states correctly" (http://lkml.org/lkml/2007/11/24/8).

How to reproduce :
- boot
- switch to a text console
- capture dmesg in a file, sync, etc. There are 3 I/O errors, but the 
  system does work.
- switch to X console, log in the Gnome Desktop, the system partially 
  hangs.
- switch back to a text console: dmesg(1) still works, it shows some 
  additonal I/O errors. At this point, any disk access makes the system 
  completely hung.

Additionnal data:
- the I/O errors always happen on the same blocks.

-- 
laurent

View attachment "dmesg-2.6.24-rc3-mm1-patched" of type "text/plain" (30536 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ