linux-kernel - Re: [bug] ata subsystem related crash with latest -git

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <47170B83.7090802@garzik.org>
Date:	Thu, 18 Oct 2007 03:30:11 -0400
From:	Jeff Garzik <jeff@...zik.org>
To:	Jens Axboe <jens.axboe@...cle.com>
CC:	Mark Lord <lkml@....ca>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	David Miller <davem@...emloft.net>,
	fujita.tomonori@....ntt.co.jp, mingo@...e.hu,
	linux-kernel@...r.kernel.org, alan@...rguk.ukuu.org.uk,
	tomof@....org
Subject: Re: [bug] ata subsystem related crash with latest -git

Jens Axboe wrote:
> On Thu, Oct 18 2007, Jeff Garzik wrote:
>> Mark Lord wrote:
>>> Mark Lord wrote:
>>>> Linus Torvalds wrote:
>>>>> On Wed, 17 Oct 2007, Mark Lord wrote:
>>>>>> It would be good to have something soon-ish.
>>>>>> This "dead at boot time" issue is impacting the general ability to test
>>>>>> patches against latest -git in time for the current merge window.
>>>>> In the meantime, does the patch I sent out help people?
>>>> Your patch from this posting http://lkml.org/lkml/2007/10/17/285
>>>> does not seem to make much difference here.
>>>>
>>>> It still crashes at exactly the same place.
>>> However, Jens's patch from that same thread:
>>>     http://lkml.org/lkml/2007/10/17/269
>>> ..allowed me to boot and post this followup message from -git12
>>> Jeff: try that one.
>> That's already in my upstream kernel, here.  commits
>> ba951841ceb7fa5b06ad48caa5270cc2ae17941e and 
>> a3bec5c5aea0da263111c4d8f8eabc1f8560d7bf.
>>
>> sata_mv and sata_nv still reliably poop themselves here, whereas its rock 
>> solid with 2.6.23.1.  Sounds like different issues from yours, as I see a 
>> stream of SATA errors on the bad kernels, errors which are often a symptom 
>> of something whacked in the DMA engine (misprogramming causes the silicon 
>> to generate bogus FIS's, which the device then chokes on)
> 
> Do you know if this poop involves the segment padding that sometimes
> goes on in libata?

Definitely not, in this case -- it's all ATA, nothing ATAPI.

It throws SError { Handshk } which then triggers the EH to reset the 
link, and so it goes, over and over :)  The same thing happens when I 
intentionally screw up the PRD tables.  Not much more data points than 
that, so far.

I'll try the SCSI_MAX_SG_SEGMENTS patch too, to see if that fixes things.

	Jeff



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/