Message-ID: <21807.35967.660396.209954@mariner.uk.xensource.com>
Date: Thu, 16 Apr 2015 11:18:39 +0100
From: Ian Jackson <Ian.Jackson@...citrix.com>
To: Prashant <prashant@...adcom.com>
CC: Michael Chan <mchan@...adcom.com>,
Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
Boris Ostrovsky <boris.ostrovsky@...cle.com>,
"David Vrabel" <david.vrabel@...rix.com>,
Thadeu Lima de Souza Cascardo <cascardo@...ux.vnet.ibm.com>,
Vlad Yasevich <vyasevich@...il.com>,
<xen-devel@...ts.xensource.com>, <netdev@...r.kernel.org>,
"Siva Reddy (Siva) Kallam" <siva.kallam@...adcom.com>,
Sanjeev Bansal <sanjeevb@...adcom.com>
Subject: Re: tg3 NIC driver bug in 3.14.x under Xen [and 3 more messages]
Prashant writes ("Re: tg3 NIC driver bug in 3.14.x under Xen [and 3 more messages]"):
> Ian, using your config we are able to recreate the problem that you are
> seeing. The driver finds the RX data buffer to be all zero, with a
> analyzer trace we are seeing the chip is DMA'ing valid RX data buffer
> contents to the host but once the driver tries to read this DMA area, it
> is seeing all zero's which is the reason of the corruption. This is only
> for the RX data buffer, the RX descriptor and status block update DMA
> regions are having valid contents.
I am no expert in this area, but this suggests that the driver is
misusing the Linux DMA management API. This is what I think
Konrad suspected when he suggested the `iommu=soft swiotlb=force'
command line options.
Note in kernel-parameters.txt:
    swiotlb=    [ARM,IA-64,PPC,MIPS,X86]
                Format: { <int> | force }
                <int> -- Number of I/O TLB slabs
                force -- force using of bounce buffers even if they
                         wouldn't be automatically used by the kernel
So with `swiotlb=force' the DMA is _expected_ to go to a bounce buffer
managed by the kernel DMA API.
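
To illustrate why a bounce buffer could produce exactly these symptoms,
here is a toy userspace model in C. It is only a sketch under stated
assumptions: every name in it (toy_mapping, toy_map, toy_device_dma,
toy_sync_for_cpu, toy_all_zero) is hypothetical, it is not kernel code,
and it is not a claim about what tg3 actually does. toy_sync_for_cpu
stands in for the step that the real DMA API performs in calls such as
dma_sync_single_for_cpu() or dma_unmap_single().

```c
#include <stddef.h>
#include <string.h>

#define TOY_BUF_LEN 64

/* Hypothetical model of one bounce-buffered DMA mapping (not kernel code). */
struct toy_mapping {
    unsigned char *driver_buf;          /* memory the driver will read      */
    unsigned char bounce[TOY_BUF_LEN];  /* memory the device really writes  */
    size_t len;                         /* mapped length, capped at buffer  */
};

/* Model of mapping an RX buffer: the device is pointed at the bounce
 * buffer, not at the driver's own buffer. */
static void toy_map(struct toy_mapping *m, unsigned char *driver_buf, size_t len)
{
    m->driver_buf = driver_buf;
    m->len = len < TOY_BUF_LEN ? len : TOY_BUF_LEN;
    memset(m->bounce, 0, sizeof(m->bounce));
}

/* Model of the NIC DMAing a received frame: the data lands only in the
 * bounce buffer. */
static void toy_device_dma(struct toy_mapping *m,
                           const unsigned char *frame, size_t len)
{
    if (len > m->len)
        len = m->len;
    memcpy(m->bounce, frame, len);
}

/* Model of the sync-for-CPU (or unmap) step: copy the bounce buffer back
 * into the driver's buffer. Skipping this leaves the driver's view stale. */
static void toy_sync_for_cpu(struct toy_mapping *m)
{
    memcpy(m->driver_buf, m->bounce, m->len);
}

/* Returns 1 if the first len bytes of p are all zero, 0 otherwise. */
static int toy_all_zero(const unsigned char *p, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (p[i] != 0)
            return 0;
    return 1;
}
```

In this model, a driver that reads driver_buf after toy_device_dma but
before toy_sync_for_cpu sees all zeros, even though the device wrote
valid data; after the sync it sees the frame. Whether the real problem
has this shape is for someone who knows the driver to confirm.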
> This is unlikely to be a chip or driver issue, as the chip is doing the
> correct DMA but the corruption occurs before driver reads it. Would
> request iommu experts to take a look and suggest what can be done next.
As I say above, I think this is probably a driver bug.
I have seen identical symptoms on a >5yo desktop box under my desk and
on two brand new rackmount servers; I therefore doubt that it's a
hardware problem.
Ian.