Message-ID: <20080421195650.GA4761@martell.zuzino.mipt.ru>
Date: Mon, 21 Apr 2008 23:56:50 +0400
From: Alexey Dobriyan <adobriyan@...il.com>
To: Chris Snook <csnook@...hat.com>
Cc: Jay Cliburn <jacliburn@...lsouth.net>,
Luca Tettamanti <kronos.it@...il.com>,
Jeff Garzik <jeff@...zik.org>,
Pekka Enberg <penberg@...helsinki.fi>,
Andrew Morton <akpm@...ux-foundation.org>,
linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
Christoph Lameter <clameter@....com>, torvalds@...l.org
Subject: Re: atl1 64-bit => 32-bit DMA borkage (reproducible, bisected)
On Mon, Apr 21, 2008 at 02:42:42PM -0400, Chris Snook wrote:
> Alexey Dobriyan wrote:
>> On Sun, Apr 20, 2008 at 01:37:04PM -0500, Jay Cliburn wrote:
>>> On Sun, 20 Apr 2008 16:26:31 +0400
>>> Alexey Dobriyan <adobriyan@...il.com> wrote:
>>>
>>>> On Sun, Apr 20, 2008 at 06:06:07AM -0500, Jay Cliburn wrote:
>>>>> On Sun, 20 Apr 2008 15:14:53 +0400
>>>>> Alexey Dobriyan <adobriyan@...il.com> wrote:
>>>>>
>>>>>> On Sat, Apr 19, 2008 at 09:54:44PM -0500, Jay Cliburn wrote:
>>>>>>> On Sat, 19 Apr 2008 18:45:35 +0400
>>>>>>> Alexey Dobriyan <adobriyan@...il.com> wrote:
>>> [...]
>>>>>>>> So, it's enough to scp a 200 MB git archive and immediately
>>>>>>>> start the reboot sequence for the horrors described above to
>>>>>>>> appear. It's not 100% reproducible, more like 90%.
>>>>>>> Do I understand correctly that these failures occur only while
>>>>>>> the network interface is going down?
>>>>>> Yep. During up or running there were no problems with this card.
>>>>>>
>>>>> One more question: Does it happen whether or not you're using atl1
>>>>> as a netconsole?
>>>> The bugs happen without netconsole too.
>>>>
>>> I can't duplicate this error, but it's probably because my machine
>>> doesn't have 4GB of memory.
>>>
>>> I have one report in February 2008 of another user encountering strange
>>> oopses in 2.6.23.12 and 2.6.24 whenever he downed the interface. I
>>> suspect your experience is a repeat of that.
>>>
>>> Just to be clear, you transfer about 200MB to the NIC (Rx direction),
>>> then immediately reboot, right?
>> Yup!
>>> Can you duplicate the problem if you
>>> simply ifconfig down instead of rebooting after the transfer?
>> Aha, ifconfig down is enough. Here is what the reproducer looks like now:
>> ./sync-linux-linus && ssh core2 "sudo /sbin/ifconfig eth0 down"
>> where the first script is basically scp(1).
>> Also, booting with 1G or 2G of RAM (mem=1024m) makes the issue go away.
>> printk at dev_close() time shows that NETIF_F_HIGHDMA was not somehow
>> enabled.
>
> Does the problem go away with iommu=nomerge? If so, I suspect we're not
> properly flushing an iowrite somewhere.
nomerge doesn't help.
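
For anyone who wants to try this on their own hardware, here is a rough
sketch of the reproducer quoted above, wrapped so it can be dry-run first.
Only ./sync-linux-linus, core2, eth0 and the ifconfig command come from
this thread. The variable names (SYNC_CMD, TARGET_HOST, IFACE, DRY_RUN)
and the run/reproduce helpers are hypothetical, made up for illustration.
It assumes the sync script pushes roughly 200 MB to the target over the
atl1 interface, as described earlier.

```shell
#!/bin/sh
# Sketch only: transfer data to the target, then immediately down its NIC.
# All variable and function names below are illustrative assumptions.
SYNC_CMD="${SYNC_CMD:-./sync-linux-linus}"   # basically scp(1), per the thread
TARGET_HOST="${TARGET_HOST:-core2}"          # machine with the atl1 card
IFACE="${IFACE:-eth0}"                       # interface to bring down
DRY_RUN="${DRY_RUN:-1}"                      # 1 = only print the commands

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Bring the interface down right after the transfer finishes successfully.
reproduce() {
    run "$SYNC_CMD" && run ssh "$TARGET_HOST" "sudo /sbin/ifconfig $IFACE down"
}

reproduce
```

With the default DRY_RUN=1 this only prints the two commands. Set
DRY_RUN=0 in the environment to actually run them.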