Message-ID: <47170B83.7090802@garzik.org>
Date: Thu, 18 Oct 2007 03:30:11 -0400
From: Jeff Garzik <jeff@...zik.org>
To: Jens Axboe <jens.axboe@...cle.com>
CC: Mark Lord <lkml@....ca>,
Linus Torvalds <torvalds@...ux-foundation.org>,
David Miller <davem@...emloft.net>,
fujita.tomonori@....ntt.co.jp, mingo@...e.hu,
linux-kernel@...r.kernel.org, alan@...rguk.ukuu.org.uk,
tomof@....org
Subject: Re: [bug] ata subsystem related crash with latest -git
Jens Axboe wrote:
> On Thu, Oct 18 2007, Jeff Garzik wrote:
>> Mark Lord wrote:
>>> Mark Lord wrote:
>>>> Linus Torvalds wrote:
>>>>> On Wed, 17 Oct 2007, Mark Lord wrote:
>>>>>> It would be good to have something soon-ish.
>>>>>> This "dead at boot time" issue is impacting the general ability to test
>>>>>> patches against latest -git in time for the current merge window.
>>>>> In the meantime, does the patch I sent out help people?
>>>> Your patch from this posting http://lkml.org/lkml/2007/10/17/285
>>>> does not seem to make much difference here.
>>>>
>>>> It still crashes at exactly the same place.
>>> However, Jens's patch from that same thread:
>>> http://lkml.org/lkml/2007/10/17/269
>>> ..allowed me to boot and post this followup message from -git12
>>> Jeff: try that one.
>> That's already in my upstream kernel, here. commits
>> ba951841ceb7fa5b06ad48caa5270cc2ae17941e and
>> a3bec5c5aea0da263111c4d8f8eabc1f8560d7bf.
>>
>> sata_mv and sata_nv still reliably poop themselves here, whereas it's rock
>> solid with 2.6.23.1. Sounds like different issues from yours, as I see a
>> stream of SATA errors on the bad kernels, errors which are often a symptom
>> of something whacked in the DMA engine (misprogramming causes the silicon
>> to generate bogus FIS's, which the device then chokes on)
>
> Do you know if this poop involves the segment padding that sometimes
> goes on in libata?
Definitely not, in this case -- it's all ATA, nothing ATAPI.

It throws SError { Handshk } which then triggers the EH to reset the
link, and so it goes, over and over :) The same thing happens when I
intentionally screw up the PRD tables. Not many more data points than
that, so far.

I'll try the SCSI_MAX_SG_SEGMENTS patch too, to see if that fixes things.

Jeff