Message-ID: <20120620094844.GL2624@amd.com>
Date:	Wed, 20 Jun 2012 11:48:44 +0200
From:	Joerg Roedel <joerg.roedel@....com>
To:	Alexander Duyck <alexander.h.duyck@...el.com>
CC:	Jeff Kirsher <jeffrey.t.kirsher@...el.com>,
	Jesse Brandeburg <jesse.brandeburg@...el.com>,
	Bruce Allan <bruce.w.allan@...el.com>,
	Carolyn Wyborny <carolyn.wyborny@...el.com>,
	Don Skidmore <donald.c.skidmore@...el.com>,
	Greg Rose <gregory.v.rose@...el.com>,
	Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@...el.com>,
	John Ronciak <john.ronciak@...el.com>,
	<e1000-devel@...ts.sourceforge.net>, <linux-kernel@...r.kernel.org>
Subject: Re: IO_PAGE_FAULTS with igb or igbvf on AMD IOMMU system

Hi Alexander,

On Tue, Jun 19, 2012 at 11:19:20AM -0700, Alexander Duyck wrote:
> Based on the faults it would look like accessing the descriptor rings is
> probably triggering the errors.  We allocate the descriptor rings using
> dma_alloc_coherent so the rings should be mapped correctly.

Can this happen before the driver has actually allocated the
descriptors? As I said, the faults appear before any DMA-API call has
been made for that device (hence domain=0x0000, because the domain is
assigned on the first call to the DMA-API for a device).
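
To illustrate why a fault logged with domain=0x0000 points at DMA issued
before the first DMA-API call, here is a toy userspace model of lazy
domain assignment. It is only a sketch and not the AMD IOMMU driver code:
the struct, the functions, and the ID counter are all made-up names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model, not the real driver's per-device data. */
struct toy_device {
    uint16_t domain_id;   /* 0 means "no domain assigned yet" */
};

static uint16_t next_domain_id = 1;

/* Would run at the top of every (toy) DMA-API entry point:
 * the first call assigns a domain, later calls reuse it. */
static uint16_t toy_get_domain(struct toy_device *dev)
{
    assert(dev != NULL);
    if (dev->domain_id == 0)
        dev->domain_id = next_domain_id++;
    return dev->domain_id;
}

/* Domain a fault log entry would show for DMA issued by dev right now. */
static uint16_t toy_fault_domain(const struct toy_device *dev)
{
    return dev->domain_id;
}
```

In this model, a device that faults before anyone has called
toy_get_domain() for it reports domain 0, which matches the pattern in
the fault log.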

Also, I don't see the faults every time. Roughly one time out of ten
there are no faults. Is it possible that this is a race condition, e.g.
that the card tries to access its descriptor rings before the driver has
allocated them (or something like that)?

> The PF and VF will end up being locked out since they are hung on an
> uncompleted DMA transaction.  Normally we recommend that PCIe Advanced
> Error Reporting be enabled if an IOMMU is enabled so the device can be
> reset after triggering a page fault event.
> 
> The first thing that pops into my head for possible issues would be that
> maybe the VF pci_dev structure or the device structure isn't being
> correctly initialized when SR-IOV is enabled on the igb interface.  Do
> you know if there are any AMD IOMMU specific values on those structures,
> such as the domain, that are supposed to be initialized prior to calling
> the DMA API calls?  If so, have you tried adding debug output to verify
> if those values are initialized on a VF prior to bringing up a VF interface?

Well, when the device appears in the system the IOMMU driver gets
notified about it using the device_change notifiers. It will then
allocate all necessary data structures. I also verified that this works
correctly while debugging this issue. So I am pretty sure the problem
isn't there :)
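
As a rough picture of that flow, here is a toy userspace sketch of a
device-change callback that allocates per-device data when a device is
added and releases it on removal. All names are hypothetical and the
real notifier interface is not reproduced here; this only models the
ordering (data exists as soon as the "add" event has been handled).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical event codes and structures for illustration only. */
enum toy_event { TOY_ADD_DEVICE, TOY_DEL_DEVICE };

struct toy_dev_data { unsigned short domain_id; };
struct toy_dev { struct toy_dev_data *iommu_data; };

/* Toy callback: returns 0 on success, -1 on failure. */
static int toy_device_change(enum toy_event ev, struct toy_dev *dev)
{
    assert(dev != NULL);
    switch (ev) {
    case TOY_ADD_DEVICE:
        /* Allocate zeroed per-device data once, when the device appears. */
        if (dev->iommu_data == NULL)
            dev->iommu_data = calloc(1, sizeof(*dev->iommu_data));
        return dev->iommu_data ? 0 : -1;
    case TOY_DEL_DEVICE:
        /* Release the data when the device goes away. */
        free(dev->iommu_data);
        dev->iommu_data = NULL;
        return 0;
    }
    return -1;
}
```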

> Also have you tried any other SR-IOV capable devices on this system? 
> That would be a valuable data point because we could then exclude the
> SR-IOV code as being a possible cause for the issues if other SR-IOV
> devices are working without any issues.

I have another SR-IOV device, but it fails to even enable SR-IOV because
the BIOS did not leave enough MMIO resources. So I couldn't try it with
that device. With the 82576 card, enabling SR-IOV works fine but results
in the faults from the VF.

Regards,

	Joerg

-- 
AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632
