Date:	Mon, 21 Apr 2008 14:42:42 -0400
From:	Chris Snook <csnook@...hat.com>
To:	Alexey Dobriyan <adobriyan@...il.com>
CC:	Jay Cliburn <jacliburn@...lsouth.net>,
	Luca Tettamanti <kronos.it@...il.com>,
	Jeff Garzik <jeff@...zik.org>,
	Pekka Enberg <penberg@...helsinki.fi>,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
	Christoph Lameter <clameter@....com>, torvalds@...l.org
Subject: Re: atl1 64-bit => 32-bit DMA borkage (reproducible, bisected)

Alexey Dobriyan wrote:
> On Sun, Apr 20, 2008 at 01:37:04PM -0500, Jay Cliburn wrote:
>> On Sun, 20 Apr 2008 16:26:31 +0400
>> Alexey Dobriyan <adobriyan@...il.com> wrote:
>>
>>> On Sun, Apr 20, 2008 at 06:06:07AM -0500, Jay Cliburn wrote:
>>>> On Sun, 20 Apr 2008 15:14:53 +0400
>>>> Alexey Dobriyan <adobriyan@...il.com> wrote:
>>>>
>>>>> On Sat, Apr 19, 2008 at 09:54:44PM -0500, Jay Cliburn wrote:
>>>>>> On Sat, 19 Apr 2008 18:45:35 +0400
>>>>>> Alexey Dobriyan <adobriyan@...il.com> wrote:
>> [...]
>>>>>>> So, it's enough to scp 200 MB git archive and immediately
>>>>>>> start rebooting sequence for horrors described above to
>>>>>>> appear. It's not 100% reproducible but more like 90%.
>>>>>> Do I understand correctly that these failures occur only while
>>>>>> the network interface is going down?
>>>>> Yep. During up or running there were no problems with this card.
>>>>>
>>>> One more question:  Does it happen whether or not you're using atl1
>>>> as a netconsole?
>>> The bugs happen without netconsole too.
>>>
>> I can't duplicate this error, but it's probably because my machine
>> doesn't have 4GB of memory.
>>
>> I have one report in February 2008 of another user encountering strange
>> oopses in 2.6.23.12 and 2.6.24 whenever he downed the interface.  I
>> suspect your experience is a repeat of that.
>>
>> Just to be clear, you transfer about 200MB to the NIC (Rx direction),
>> then immediately reboot, right?
> 
> Yup!
> 
>> Can you duplicate the problem if you
>> simply ifconfig down instead of rebooting after the transfer?  
> 
> Aha, ifconfig down is enough. Here is what the reproducer looks like now:
> 
> 	./sync-linux-linus && ssh core2 "sudo /sbin/ifconfig eth0 down"
> 
> where the first script is basically scp(1).
> 
> Also, booting with 1G or 2G of RAM (mem=1024m) makes the issue go away.
> 
> A printk at dev_close() time shows that NETIF_F_HIGHDMA had not somehow
> been enabled.
> 

Does the problem go away with iommu=nomerge?  If so, I suspect we're not 
properly flushing an iowrite somewhere.
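
For context on why the mem=1024m result matters: with 1G or 2G of RAM every
physical address fits in 32 bits, so a buffer can never fall outside a
32-bit DMA mask. A rough user-space sketch of that range check follows. It
is only an illustration, not code from the atl1 driver; BIT_MASK_N and
fits_dma_mask are hypothetical names, with BIT_MASK_N modelled on the
kernel's DMA_BIT_MASK macro.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: mask covering the low n address bits,
 * modelled on the kernel's DMA_BIT_MASK(n). */
#define BIT_MASK_N(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical helper: true if every byte of the buffer
 * [addr, addr + len) is addressable under the given mask.
 * Written to avoid overflow when addr + len would wrap. */
static bool fits_dma_mask(uint64_t addr, uint64_t len, uint64_t mask)
{
    if (addr > mask)
        return false;           /* buffer starts above the mask */
    if (len == 0)
        return true;            /* empty buffer, start is in range */
    return len - 1 <= mask - addr;  /* last byte must also be in range */
}
```

With BIT_MASK_N(32), a page at 0x7FFFF000 (just under 2G) passes the check,
while a page at 0x100000000 (the first address above 4G) does not. That is
consistent with the failure only showing up on a machine with 4GB of memory,
though it does not by itself say where the bad address comes from.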

-- Chris