Message-ID: <20150416122420.GA4051@oc0812247204.ltc.br.ibm.com>
Date: Thu, 16 Apr 2015 09:24:21 -0300
From: cascardo@...ux.vnet.ibm.com
To: Ian Jackson <Ian.Jackson@...citrix.com>
Cc: Prashant <prashant@...adcom.com>,
Michael Chan <mchan@...adcom.com>,
Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
Boris Ostrovsky <boris.ostrovsky@...cle.com>,
David Vrabel <david.vrabel@...rix.com>,
Vlad Yasevich <vyasevich@...il.com>,
xen-devel@...ts.xensource.com, netdev@...r.kernel.org,
"Siva Reddy (Siva) Kallam" <siva.kallam@...adcom.com>,
Sanjeev Bansal <sanjeevb@...adcom.com>
Subject: Re: tg3 NIC driver bug in 3.14.x under Xen [and 3 more messages]
On Thu, Apr 16, 2015 at 11:18:39AM +0100, Ian Jackson wrote:
> Prashant writes ("Re: tg3 NIC driver bug in 3.14.x under Xen [and 3 more messages]"):
> > Ian, using your config we are able to recreate the problem that you are
> > seeing. The driver finds the RX data buffer to be all zeroes. With an
> > analyzer trace we see the chip DMA'ing valid RX data buffer contents
> > to the host, but once the driver tries to read this DMA area it sees
> > all zeroes, which is the cause of the corruption. This affects only
> > the RX data buffer; the RX descriptor and status block update DMA
> > regions have valid contents.
>
> I am no expert on this area, but this suggests that the driver is
> misoperating the Linux DMA management API. This is what I think
> Konrad suspected when he suggested the `iommu=soft swiotlb=force'
> command line option.
>
> Note in kernel-parameters.txt:
>
>     swiotlb=    [ARM,IA-64,PPC,MIPS,X86]
>                 Format: { <int> | force }
>                 <int> -- Number of I/O TLB slabs
>                 force -- force using of bounce buffers even if they
>                          wouldn't be automatically used by the kernel
>
> So with `swiotlb=force' the DMA is _expected_ to go to a bounce buffer
> managed by the kernel DMA API.
>
> > This is unlikely to be a chip or driver issue, as the chip is doing the
> > correct DMA but the corruption occurs before the driver reads it. I would
> > ask the IOMMU experts to take a look and suggest what can be done next.
>
> As I say above I think this is probably a driver bug.
>
Yes, this looks like the driver is not syncing the DMA buffers. Unmap is
supposed to synchronize as well.

Prashant, can you point to where in the code you see all zeroes after
checking the data?

Cascardo.

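
To illustrate the suspicion above, here is a toy userspace model of a
bounced RX buffer. It is only a sketch: it is not tg3 code and it does not
use the kernel DMA API. Every name in it (toy_mapping, toy_sync_for_cpu,
and so on) is made up for illustration. The assumption is that with
`swiotlb=force' the device writes into a bounce buffer, and the data only
reaches the buffer the driver reads once a sync (or unmap) copies it
over, which is roughly the role dma_sync_single_for_cpu() and
dma_unmap_single() play in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RX_BUF_LEN 64

/* Hypothetical stand-in for one bounced, device-to-CPU streaming mapping. */
struct toy_mapping {
	unsigned char *cpu_buf;           /* buffer the driver reads from */
	unsigned char bounce[RX_BUF_LEN]; /* buffer the device writes into */
	size_t len;
};

/* Model of mapping an RX buffer: the device is pointed at the bounce copy. */
static void toy_map_from_device(struct toy_mapping *m,
				unsigned char *cpu_buf, size_t len)
{
	assert(len <= RX_BUF_LEN);
	m->cpu_buf = cpu_buf;
	m->len = len;
	memset(m->bounce, 0, sizeof(m->bounce));
}

/* Model of the NIC doing DMA: the data lands in the bounce buffer only. */
static void toy_device_dma_write(struct toy_mapping *m,
				 const void *data, size_t n)
{
	if (n > m->len)
		n = m->len;
	memcpy(m->bounce, data, n);
}

/* Model of a sync-for-CPU (or unmap): copy bounce buffer to driver buffer. */
static void toy_sync_for_cpu(struct toy_mapping *m)
{
	memcpy(m->cpu_buf, m->bounce, m->len);
}

/* Return 1 if the first n bytes of p are all zero, 0 otherwise. */
static int toy_all_zero(const unsigned char *p, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (p[i] != 0)
			return 0;
	return 1;
}

/*
 * Walk through one receive: returns 0 if the driver buffer is all zeroes
 * before the sync and holds the frame after it, a negative value otherwise.
 */
static int toy_rx_demo(void)
{
	static const unsigned char frame[] = "valid RX frame";
	unsigned char rx[RX_BUF_LEN] = { 0 };
	struct toy_mapping m;

	toy_map_from_device(&m, rx, sizeof(rx));
	toy_device_dma_write(&m, frame, sizeof(frame));

	/* Reading before the sync: the driver sees only zeroes. */
	if (!toy_all_zero(rx, sizeof(rx)))
		return -1;

	/* After the sync the frame is visible to the driver. */
	toy_sync_for_cpu(&m);
	if (memcmp(rx, frame, sizeof(frame)) != 0)
		return -2;

	return 0;
}
```

In this model the "analyzer" view (the bounce buffer) has valid data while
the driver's view stays zeroed until toy_sync_for_cpu() runs, which matches
the reported symptom if the sync hypothesis is right.
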
> I have seen identical symptoms on a >5yo desktop box under my desk and
> on two brand new rackmount servers; I therefore doubt that it's a
> hardware problem.
>
> Ian.
>